PowerPlantAccessor
- class powerplantmatching.accessor.PowerPlantAccessor(pandas_obj)
Bases: object
Accessor object for DataFrames created with powerplantmatching. It simplifies access to common functions applicable to dataframes with powerplant data. Note that even though this is a general DataFrame accessor, its functions only work for powerplantmatching-related DataFrames.
Examples
import powerplantmatching as pm
entsoe = pm.data.ENTSOE()
entsoe.powerplant.plot_aggregated()
Methods Summary
aggregate_units([dataset_name, ...]) – Vertical cleaning of the database.
breakdown_matches() – Function to inspect grouped and matched entries of a matched dataframe.
clean_powerplantname([config])
extend_by_VRE([config, base_year, prune_beyond]) – Extends a given reduced dataframe by externally given VREs.
extend_by_non_matched(extend_by[, label, ...]) – Returns the matched dataframe with additional entries of non-matched powerplants of a reliable source.
fill_geoposition([use_saved_locations, ...]) – Fill missing 'lat' and 'lon' values.
fill_missing_commissioning_years() – Fills the empty commissioning years with averages.
fill_missing_decommissioning_years([config]) – Function which sets/fills a column 'DateOut' with roughly estimated values for decommissioning years, based on the estimated lifetimes per Fueltype given in the config and corresponding commissioning years.
fill_missing_decommyears([config])
get_name()
isin(matched[, label]) – Checks if a given dataframe is included in a matched dataframe.
lookup([keys, by, exclude, unit]) – Returns a lookup table of the dataframe df with rounded numbers.
map_bus(buses) – Assign a 'bus' column to the dataframe based on a list of coordinates.
map_country_bus(buses) – Assign a 'bus' column based on a list of coordinates and countries.
match_with(df[, labels, config, reduced])
plot_aggregated([by, figsize]) – Plotting function for fast inspection of the capacity distribution.
plot_map([scale, alpha, european_bounds, ...])
reduce_matched_dataframe([show_orig_names, ...]) – Reduce a matched dataframe to a unique set of columns.
rescale_capacities_to_country_totals([fueltypes]) – Returns an extra column 'Scaled Capacity' with an up or down scaled capacity in order to match the statistics of the ENTSO-E country totals.
scale_to_net_capacities([is_gross, catch_all])
select_by_projectID(projectID[, dataset_name]) – Convenience function to select data by its projectID.
set_name(name)
set_uncommon_fueltypes_to_other([fillna_other, ...]) – Replace uncommon fueltype specifications with 'Other'.
to_pypsa_names() – Rename the columns of the powerplant data according to the convention in PyPSA.
Methods Documentation
- aggregate_units(dataset_name=None, pre_clean_name=False, country_wise=True, config=None, **kwargs)
Vertical cleaning of the database. Cleans the “Name” column and sums up the capacity of powerplant units which are determined to belong to the same plant.
- Parameters:
df (pandas.DataFrame or string) – Dataframe or name to use for the resulting database
dataset_name (str, default None) – Specify the name of your df, required if use_saved_aggregation is set to True.
pre_clean_name (Boolean, default False) – Whether to clean the ‘Name’ column before aggregating.
country_wise (Boolean, default True) – Whether to aggregate only entries with an identical country.
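The aggregation idea described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the library's implementation: dicts stand in for DataFrame rows, and the helpers `simple_clean` and `aggregate_units_sketch`, the cleaning rule, and the sample data are all hypothetical.

```python
import re
from collections import defaultdict


def simple_clean(name):
    """Hypothetical cleaning rule: drop a trailing unit number, normalise spacing and case."""
    name = re.sub(r"\s+(unit\s*)?\d+$", "", name.strip(), flags=re.IGNORECASE)
    return re.sub(r"\s+", " ", name).title()


def aggregate_units_sketch(rows, country_wise=True):
    """Sum 'Capacity' over rows sharing a cleaned name (and, optionally, a country)."""
    totals = defaultdict(float)
    for row in rows:
        # With country_wise=False, units are merged across countries.
        key = (simple_clean(row["Name"]), row["Country"] if country_wise else None)
        totals[key] += row["Capacity"]
    return [
        {"Name": name, "Country": country, "Capacity": capacity}
        for (name, country), capacity in sorted(totals.items())
    ]


# Made-up sample data: two units of one plant plus a single-unit plant.
units = [
    {"Name": "Alpha unit 1", "Country": "Germany", "Capacity": 300.0},
    {"Name": "Alpha Unit 2", "Country": "Germany", "Capacity": 300.0},
    {"Name": "Beta 1", "Country": "Poland", "Capacity": 370.0},
]
plants = aggregate_units_sketch(units)
```

Here the two "Alpha" units collapse into one entry whose capacity is the sum of both units.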
- breakdown_matches()
Function to inspect grouped and matched entries of a matched dataframe. Breaks the matches down to all ingoing data at a detailed level.
- Parameters:
df (pd.DataFrame) – Matched data with a non-empty projectID column. Keys of projectID must be specified in powerplantmatching.data.data_config
- clean_powerplantname(config=None)
Deprecated since version 0.5: This will be removed in 0.6. Use clean_name instead.
- convert_alpha2_to_country()
- convert_country_to_alpha2()
- convert_to_short_name()
- extend_by_VRE(config=None, base_year=2017, prune_beyond=True)
Extends a given reduced dataframe by externally given VREs.
- Parameters:
df (pandas.DataFrame) – The dataframe to be extended
base_year (int) – Needed for deriving cohorts from IRENA’s capacity statistics
- Returns:
df – Extended dataframe
- Return type:
pd.DataFrame
- extend_by_non_matched(extend_by, label=None, query=None, aggregate_added_data=True, config=None, **aggkwargs)
Returns the matched dataframe with additional entries of non-matched powerplants of a reliable source.
- Parameters:
df (pandas.DataFrame) – Already matched dataset which should be extended
extend_by (pd.DataFrame | str) – Database which is partially included in the matched dataset, but which should be included in full. If a str is passed, it is used to call the corresponding data from data.py
label (str) – Column name of the additional database within the matched dataset. This string is used if the columns of the additional database do not correspond to those of the dataset
- fill_geoposition(use_saved_locations=True, saved_only=True, config=None)
Fill missing ‘lat’ and ‘lon’ values. Uses geoparsing with the value given in ‘Name’ and restricts the search to the value in ‘Country’. df must contain ‘Name’, ‘lat’, ‘lon’ and ‘Country’ as columns.
- Parameters:
df (pandas.DataFrame) – DataFrame of power plants
use_saved_locations (Boolean, default True) – Whether to first compare with cached results in powerplantmatching/data/parsed_locations.csv
saved_only (Boolean, default True) – Whether to only add geo-positions which are stored at pm.core._package_data("parsed_locations.csv")
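The cached-lookup part of this behaviour (the saved_only=True case) can be sketched as follows. This is a minimal illustration, not the library's code: it assumes a cache keyed by (Name, Country), uses None for missing coordinates, and the function name `fill_geoposition_sketch` and all data are hypothetical.

```python
def fill_geoposition_sketch(rows, saved_locations):
    """Fill missing 'lat'/'lon' from a cache keyed by (Name, Country)."""
    filled = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row["lat"] is None or row["lon"] is None:
            hit = saved_locations.get((row["Name"], row["Country"]))
            if hit is not None:
                row["lat"], row["lon"] = hit
        filled.append(row)
    return filled


# Made-up cache and plants; only "Alpha" has a cached position.
cache = {("Alpha", "Germany"): (51.0, 7.0)}
plants = [
    {"Name": "Alpha", "Country": "Germany", "lat": None, "lon": None},
    {"Name": "Beta", "Country": "Poland", "lat": None, "lon": None},
    {"Name": "Gamma", "Country": "France", "lat": 48.0, "lon": 2.0},
]
located = fill_geoposition_sketch(plants, cache)
```

Entries without a cache hit stay empty in this sketch, since no geoparsing service is called.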
- fill_missing_commissioning_years()
Fills the empty commissioning years with averages.
- fill_missing_commyears()
Deprecated since version 0.5.0: This will be removed in 0.6.0. This function was renamed to fill_missing_commissioning_years.
- fill_missing_decommissioning_years(config=None)
Function which sets/fills a column ‘DateOut’ with roughly estimated values for decommissioning years, based on the estimated lifetimes per Fueltype given in the config and corresponding commissioning years. Note that the latter is filled up using fill_missing_commissioning_years.
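The estimate described above amounts to adding a per-fueltype lifetime to the commissioning year. The sketch below illustrates that arithmetic only. It assumes the commissioning year lives in a column called 'DateIn', and the lifetimes, the fallback `default_lifetime`, and the function name are hypothetical rather than taken from the package config.

```python
def fill_missing_dateout_sketch(rows, lifetimes, default_lifetime=40):
    """Set a missing 'DateOut' to 'DateIn' plus the lifetime of the row's Fueltype."""
    out = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row.get("DateOut") is None and row.get("DateIn") is not None:
            lifetime = lifetimes.get(row["Fueltype"], default_lifetime)
            row["DateOut"] = row["DateIn"] + lifetime
        out.append(row)
    return out


# Made-up lifetimes in years and made-up plants.
lifetimes = {"Hard Coal": 45, "Nuclear": 50}
plants = [
    {"Fueltype": "Hard Coal", "DateIn": 1980, "DateOut": None},
    {"Fueltype": "Nuclear", "DateIn": 1985, "DateOut": 2030},
    {"Fueltype": "Other", "DateIn": 2000, "DateOut": None},
    {"Fueltype": "Nuclear", "DateIn": None, "DateOut": None},
]
estimated = fill_missing_dateout_sketch(plants, lifetimes)
```

Existing 'DateOut' values are kept, and rows without a commissioning year stay empty here because the sketch does not fill commissioning years first.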
- fill_missing_decommyears(config=None)
Deprecated since version 0.5.0: This will be removed in 0.6.0. This function was renamed to fill_missing_decommissioning_years.
- fill_missing_duration()
- get_name()
- isin(matched, label=None)
Checks if a given dataframe is included in a matched dataframe.
- Parameters:
df (pd.DataFrame) – The dataframe to be checked
matched (pd.DataFrame) – The matched dataframe
- Returns:
True if all dataframes are included in the matched dataframe, False otherwise
- Return type:
bool
- lookup(keys=None, by='Country, Fueltype', exclude=None, unit='MW')
Returns a lookup table of the dataframe df with rounded numbers. The by argument selects the grouping: “Country”, “Fueltype”, or both.
- Parameters:
df (pandas.DataFrame or list of pandas.DataFrame) – powerplant databases to be analysed. If multiple dataframes are passed, the lookup table displays them in a MultiIndex
by (str, one of 'Country, Fueltype', 'Country' or 'Fueltype') – Define the type of lookup table you want to obtain.
keys (list of strings) – labels of the different datasets, only necessary if multiple dataframes are passed
exclude (list) – list of fueltypes to exclude from the analysis
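Conceptually, such a lookup table is a grouped capacity sum with rounding. The sketch below shows that idea for a single dataset. It is an assumption-laden illustration: `by` is a tuple of column names instead of the comma-separated string used by the method, the 'GW' conversion is assumed, and the function name and data are hypothetical.

```python
from collections import defaultdict


def lookup_sketch(rows, by=("Country", "Fueltype"), exclude=None, unit="MW"):
    """Sum 'Capacity' per group and round the totals to whole numbers."""
    divisor = {"MW": 1.0, "GW": 1000.0}[unit]  # capacities are assumed to be in MW
    excluded = set(exclude or [])
    totals = defaultdict(float)
    for row in rows:
        if row["Fueltype"] in excluded:
            continue
        key = tuple(row[column] for column in by)
        totals[key] += row["Capacity"] / divisor
    return {key: round(total) for key, total in sorted(totals.items())}


# Made-up capacities in MW.
plants = [
    {"Country": "Germany", "Fueltype": "Wind", "Capacity": 300.4},
    {"Country": "Germany", "Fueltype": "Wind", "Capacity": 200.3},
    {"Country": "France", "Fueltype": "Nuclear", "Capacity": 900.0},
    {"Country": "France", "Fueltype": "Hard Coal", "Capacity": 100.0},
]
table = lookup_sketch(plants, exclude=["Hard Coal"])
by_fueltype = lookup_sketch(plants, by=("Fueltype",))
```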
- map_bus(buses)
Assign a ‘bus’ column to the dataframe based on a list of coordinates.
- Parameters:
df (pd.DataFrame) – power plant list with coordinates ‘lat’ and ‘lon’
buses (pd.DataFrame) – bus list with coordinates ‘x’ and ‘y’
- Return type:
DataFrame with an extra column ‘bus’ indicating the nearest bus.
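A nearest-bus assignment can be sketched with a brute-force search. This is not how the library necessarily does it: the sketch assumes that ‘x’ is longitude and ‘y’ is latitude, measures plain Euclidean distance in degrees, and uses a hypothetical function name and made-up coordinates.

```python
import math


def map_bus_sketch(plants, buses):
    """Attach the id of the closest bus to each plant.

    buses maps a bus id to {'x': longitude, 'y': latitude} (assumed convention).
    """
    result = []
    for plant in plants:
        nearest = min(
            buses,
            key=lambda bus: math.hypot(
                buses[bus]["x"] - plant["lon"], buses[bus]["y"] - plant["lat"]
            ),
        )
        result.append({**plant, "bus": nearest})
    return result


# Made-up buses and plants.
buses = {
    "bus_north": {"x": 10.0, "y": 54.0},
    "bus_south": {"x": 11.0, "y": 48.0},
}
plants = [
    {"Name": "Alpha", "lat": 53.5, "lon": 9.8},
    {"Name": "Beta", "lat": 48.2, "lon": 11.5},
]
mapped = map_bus_sketch(plants, buses)
```

For large bus lists a spatial index would scale better than this linear scan.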
- map_country_bus(buses)
Assign a ‘bus’ column based on a list of coordinates and countries.
- Parameters:
df (pd.DataFrame) – power plant list with coordinates ‘lat’, ‘lon’ and ‘Country’
buses (pd.DataFrame) – bus list with coordinates ‘x’, ‘y’, ‘country’
- Return type:
DataFrame with an extra column ‘bus’ indicating the nearest bus.
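The country-aware variant differs only in that candidate buses are restricted to the plant's own country before the nearest one is picked. The sketch below illustrates this under the same assumptions as before (‘x’ as longitude, ‘y’ as latitude, Euclidean distance in degrees). The function name, the data, and the choice of None for plants without a bus in their country are hypothetical.

```python
import math


def map_country_bus_sketch(plants, buses):
    """Attach the closest bus located in the same country as each plant."""
    result = []
    for plant in plants:
        candidates = [
            bus for bus, info in buses.items() if info["country"] == plant["Country"]
        ]
        nearest = None  # stays None when the country has no bus at all
        if candidates:
            nearest = min(
                candidates,
                key=lambda bus: math.hypot(
                    buses[bus]["x"] - plant["lon"], buses[bus]["y"] - plant["lat"]
                ),
            )
        result.append({**plant, "bus": nearest})
    return result


# Made-up data: the Belgian bus is closer, but the plant is in Germany.
buses = {
    "be_1": {"x": 6.0, "y": 50.0, "country": "Belgium"},
    "de_1": {"x": 8.0, "y": 50.5, "country": "Germany"},
}
plants = [
    {"Name": "Alpha", "Country": "Germany", "lat": 50.0, "lon": 6.1},
    {"Name": "Beta", "Country": "France", "lat": 49.0, "lon": 5.0},
]
mapped = map_country_bus_sketch(plants, buses)
```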
- match_with(df, labels=None, config=None, reduced=True, **dukeargs)
- plot_aggregated(by=['Country', 'Fueltype'], figsize=(12, 20), **kwargs)
Plotting function for fast inspection of the capacity distribution. Returns figure and axes of a matplotlib barplot.
- Parameters:
by (list, default ['Country', 'Fueltype']) – Define the columns of the dataframe to be grouped on.
figsize (tuple, default (12,20)) – width and height of the figure
**kwargs – keyword arguments for matplotlib plotting
- plot_map(scale=20.0, alpha=0.6, european_bounds=True, fillcontinents=False, legendscale=1, resolution=True, figsize=None, ncol=2, loc='upper left')
- reduce_matched_dataframe(show_orig_names=False, config=None)
Reduce a matched dataframe to a unique set of columns. For each entry take the value of the most reliable data source included in that match.
- Parameters:
df (pandas.DataFrame) – MultiIndex dataframe with the matched powerplants, as obtained from combined_dataframe() or match_multiple_datasets()
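The "most reliable source wins" rule can be illustrated for a single matched plant. This is a simplified sketch, not the library's reduction logic: the nested-dict layout, the numeric reliability scores, the source names and the function name are all hypothetical.

```python
def reduce_match_sketch(match, reliability):
    """For each column, keep the value from the most reliable source that has one.

    match maps column -> {source: value}; reliability maps source -> score,
    where a higher score means a more trusted source (assumed convention).
    """
    reduced = {}
    for column, by_source in match.items():
        available = {s: v for s, v in by_source.items() if v is not None}
        if available:
            best = max(available, key=lambda source: reliability[source])
            reduced[column] = available[best]
        else:
            reduced[column] = None
    return reduced


# Made-up match of one plant found in two hypothetical sources.
match = {
    "Capacity": {"source_a": 480.0, "source_b": 500.0},
    "DateIn": {"source_a": 1995, "source_b": None},
}
reliability = {"source_a": 1, "source_b": 2}
plant = reduce_match_sketch(match, reliability)
```

The capacity comes from the more trusted source, while the commissioning year falls back to the only source that reports it.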
- rescale_capacities_to_country_totals(fueltypes=None)
Returns an extra column ‘Scaled Capacity’ with an up or down scaled capacity in order to match the statistics of the ENTSO-E country totals. For every country the information about the total capacity of each fueltype is given. The scaling factor is determined by the ratio of the aggregated capacity of the fueltype within each country and the ENTSO-E statistics about the fueltype capacity total within each country.
- Parameters:
df (pandas.DataFrame) – Data set that should be modified
fueltypes (str or list of strings) – fueltype that should be scaled
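The scaling can be sketched as a per-group factor applied to each plant. The sketch assumes the factor is the statistical total divided by the aggregated plant capacity of the same country and fueltype, so that the scaled capacities sum to the statistic. The function name, the totals and the plants are made up, and groups without a statistic are left unscaled by assumption.

```python
from collections import defaultdict


def rescale_sketch(rows, country_totals, fueltypes=None):
    """Add 'Scaled Capacity' so each (Country, Fueltype) group sums to its target total."""
    aggregated = defaultdict(float)
    for row in rows:
        aggregated[(row["Country"], row["Fueltype"])] += row["Capacity"]

    scaled = []
    for row in rows:
        key = (row["Country"], row["Fueltype"])
        selected = fueltypes is None or row["Fueltype"] in fueltypes
        if selected and key in country_totals and aggregated[key] > 0:
            factor = country_totals[key] / aggregated[key]
        else:
            factor = 1.0  # no statistic or fueltype not selected: keep capacity
        scaled.append({**row, "Scaled Capacity": row["Capacity"] * factor})
    return scaled


# Made-up statistic: 800 MW of hard coal in Germany, while the plants sum to 400 MW.
totals = {("Germany", "Hard Coal"): 800.0}
plants = [
    {"Country": "Germany", "Fueltype": "Hard Coal", "Capacity": 100.0},
    {"Country": "Germany", "Fueltype": "Hard Coal", "Capacity": 300.0},
    {"Country": "Germany", "Fueltype": "Wind", "Capacity": 50.0},
]
rescaled = rescale_sketch(plants, totals)
```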
- scale_to_net_capacities(is_gross=True, catch_all=True)
- select_by_projectID(projectID, dataset_name=None)
Convenience function to select data by its projectID.
- set_name(name)
- set_uncommon_fueltypes_to_other(fillna_other=True, config=None, **kwargs)
Replace uncommon fueltype specifications with ‘Other’. This helps to compare data sources with the capacity statistics given by powerplantmatching.data.Capacity_stats().
- Parameters:
df (pd.DataFrame) – DataFrame in which to replace ‘Fueltype’ values
fillna_other (Boolean, default True) – Whether to replace NaN values in ‘Fueltype’ with ‘Other’
fueltypes (list) – list of replaced fueltypes, defaults to [‘Mixed fuel types’, ‘Electro-mechanical’, ‘Hydrogen Storage’]
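The replacement rule itself is simple and can be sketched without pandas. In this illustration, None stands in for NaN, the default list is the one quoted above, and the function name and sample rows are hypothetical.

```python
UNCOMMON_FUELTYPES = ("Mixed fuel types", "Electro-mechanical", "Hydrogen Storage")


def set_uncommon_to_other_sketch(rows, fillna_other=True, fueltypes=UNCOMMON_FUELTYPES):
    """Map listed fueltypes (and, optionally, missing ones) to 'Other'."""
    out = []
    for row in rows:
        fueltype = row["Fueltype"]
        if fueltype in fueltypes or (fillna_other and fueltype is None):
            fueltype = "Other"
        out.append({**row, "Fueltype": fueltype})
    return out


# Made-up rows; None plays the role of a missing (NaN) fueltype.
plants = [
    {"Name": "Alpha", "Fueltype": "Hydrogen Storage"},
    {"Name": "Beta", "Fueltype": None},
    {"Name": "Gamma", "Fueltype": "Wind"},
]
harmonised = set_uncommon_to_other_sketch(plants)
kept_missing = set_uncommon_to_other_sketch(plants, fillna_other=False)
```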
- to_pypsa_names()
Rename the columns of the powerplant data according to the convention in PyPSA.
- Parameters:
df (pandas.DataFrame) – powerplant data
- Returns:
pandas.DataFrame – Column renamed dataframe