PowerPlantAccessor

class powerplantmatching.accessor.PowerPlantAccessor(pandas_obj)

Bases: object

Accessor object for DataFrames created with powerplantmatching. It simplifies access to common functions that apply to DataFrames containing power plant data. Note that although the accessor is available on every DataFrame, its functions only work on powerplantmatching-related DataFrames.

Examples

import powerplantmatching as pm
entsoe = pm.data.ENTSOE()
entsoe.powerplant.plot_aggregated()
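How an accessor of this kind attaches to a DataFrame can be sketched with the public pandas extension API. This is a minimal illustration rather than the powerplantmatching implementation: the accessor name `demo_plants`, the class `DemoPlantAccessor` and the method `total_capacity` are hypothetical, and the 'Capacity' column is assumed for the example.

```python
import pandas as pd


# Hypothetical accessor name, chosen so it does not collide with "powerplant"
@pd.api.extensions.register_dataframe_accessor("demo_plants")
class DemoPlantAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor was called on
        self._obj = pandas_obj

    def total_capacity(self):
        """Sum the (assumed) 'Capacity' column of the wrapped DataFrame."""
        return self._obj["Capacity"].sum()


df = pd.DataFrame({"Name": ["A", "B"], "Capacity": [100.0, 250.0]})
total = df.demo_plants.total_capacity()
```

Because the accessor is registered globally, it appears on every DataFrame, which is why its methods fail on frames that lack the expected columns.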

Methods Summary

aggregate_units([dataset_name, ...])

Vertical cleaning of the database.

breakdown_matches()

Function to inspect grouped and matched entries of a matched dataframe.

clean_powerplantname([config])

Deprecated since version 0.5.0.

convert_alpha2_to_country()

convert_country_to_alpha2()

convert_to_short_name()

extend_by_VRE([config, base_year, prune_beyond])

Extends a given reduced dataframe by externally given VREs.

extend_by_non_matched(extend_by[, label, ...])

Returns the matched dataframe with additional entries of non-matched powerplants of a reliable source.

fill_geoposition([use_saved_locations, ...])

Fill missing 'lat' and 'lon' values.

fill_missing_commissioning_years()

Fills the empty commissioning years with averages.

fill_missing_commyears()

Deprecated since version 0.5.0.

fill_missing_decommissioning_years([config])

Function which sets/fills a column 'DateOut' with roughly estimated values for decommissioning years, based on the estimated lifetimes per Fueltype given in the config and corresponding commissioning years.

fill_missing_decommyears([config])

Deprecated since version 0.5.0.

fill_missing_duration()

get_name()

isin(matched[, label])

Checks if a given dataframe is included in a matched dataframe.

lookup([keys, by, exclude, unit])

Returns a lookup table of the dataframe df with rounded numbers.

map_bus(buses)

Assign a 'bus' column to the dataframe based on a list of coordinates.

map_country_bus(buses)

Assign a 'bus' column based on a list of coordinates and countries.

match_with(df[, labels, config, reduced])

plot_aggregated([by, figsize])

Plotting function for fast inspection of the capacity distribution.

plot_map([scale, alpha, european_bounds, ...])

reduce_matched_dataframe([show_orig_names, ...])

Reduce a matched dataframe to a unique set of columns.

rescale_capacities_to_country_totals([fueltypes])

Returns an extra column 'Scaled Capacity' with the capacity scaled up or down to match the statistics of the ENTSO-E country totals.

scale_to_net_capacities([is_gross, catch_all])

select_by_projectID(projectID[, dataset_name])

Convenience function to select data by its projectID.

set_name(name)

set_uncommon_fueltypes_to_other([...])

Replace uncommon fueltype specifications with 'Other'.

to_pypsa_names()

Rename the columns of the powerplant data according to the convention in PyPSA.

Methods Documentation

aggregate_units(dataset_name=None, pre_clean_name=False, country_wise=True, config=None, **kwargs)

Vertical cleaning of the database. Cleans the “Name” column and sums up the capacity of power plant units that are determined to belong to the same plant.

Parameters:
  • df (pandas.DataFrame or string) – DataFrame or name to use for the resulting database

  • dataset_name (str, default None) – Specify the name of your df, required if use_saved_aggregation is set to True.

  • pre_clean_name (Boolean, default False) – Whether to clean the ‘Name’ column before aggregating.

  • country_wise (Boolean, default True) – Whether to aggregate only entries with an identical country.
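The effect of country_wise on the aggregation can be sketched with a plain pandas groupby. This is only an illustration under simplifying assumptions: exact equality of 'Name' stands in for whatever logic the library uses to decide that units belong to the same plant, and the sample data is invented.

```python
import pandas as pd

units = pd.DataFrame(
    {
        "Name": ["Plant A", "Plant A", "Plant A", "Plant B"],
        "Country": ["Germany", "Germany", "France", "France"],
        "Capacity": [300.0, 200.0, 150.0, 400.0],
    }
)

# country_wise=True analogue: units are merged only within the same country
by_country = units.groupby(["Name", "Country"], as_index=False)["Capacity"].sum()

# country_wise=False analogue: units with the same name merge across countries
across = units.groupby("Name", as_index=False)["Capacity"].sum()
```

With the country restriction, the French and German "Plant A" entries stay separate; without it, they collapse into one row.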

breakdown_matches()

Function to inspect grouped and matched entries of a matched dataframe. Breaks the matches down to all incoming data at a detailed level.

Parameters:

df (pd.DataFrame) – Matched data with a non-empty projectID column. Keys of projectID must be specified in powerplantmatching.data.data_config

clean_powerplantname(config=None)

Deprecated since version 0.5.0: This will be removed in 0.6. Use clean_name instead.

convert_alpha2_to_country()
convert_country_to_alpha2()
convert_to_short_name()
extend_by_VRE(config=None, base_year=2017, prune_beyond=True)

Extends a given reduced dataframe by externally given VREs.

Parameters:
  • df (pandas.DataFrame) – The dataframe to be extended

  • base_year (int) – Needed for deriving cohorts from IRENA’s capacity statistics

Returns:

df – Extended dataframe

Return type:

pd.DataFrame

extend_by_non_matched(extend_by, label=None, query=None, aggregate_added_data=True, config=None, **aggkwargs)

Returns the matched dataframe with additional entries of non-matched powerplants of a reliable source.

Parameters:
  • df (pandas.DataFrame) – Already matched dataset which should be extended

  • extend_by (pd.DataFrame | str) – Database which is partially included in the matched dataset, but which should be included in full. If a str is passed, it will be used to call the corresponding data from data.py

  • label (str) – Column name of the additional database within the matched dataset; this string is used if the columns of the additional database do not correspond to the ones of the dataset

fill_geoposition(use_saved_locations=True, saved_only=True, config=None)

Fill missing ‘lat’ and ‘lon’ values. Uses geoparsing with the value given in ‘Name’ and limits the search using the value in ‘Country’. df must contain ‘Name’, ‘lat’, ‘lon’ and ‘Country’ as columns.

Parameters:
  • df (pandas.DataFrame) – DataFrame of power plants

  • use_saved_locations (Boolean, default True) – Whether to first compare with cached results in powerplantmatching/data/parsed_locations.csv

  • saved_only (Boolean, default True) – Whether to only add geo-positions which are stored at pm.core._package_data("parsed_locations.csv")

fill_missing_commissioning_years()

Fills the empty commissioning years with averages.
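Filling gaps with averages can be sketched as follows. The grouping used here (mean per 'Fueltype') and the column name 'DateIn' are assumptions made for illustration; the description above does not state which averages the library takes.

```python
import numpy as np
import pandas as pd

plants = pd.DataFrame(
    {
        "Fueltype": ["Hard Coal", "Hard Coal", "Hard Coal", "Wind", "Wind"],
        "DateIn": [1980.0, 1990.0, np.nan, 2010.0, np.nan],  # assumed column name
    }
)

# Mean commissioning year per fuel type, broadcast back onto every row
group_mean = plants.groupby("Fueltype")["DateIn"].transform("mean")

# Only the missing entries are replaced; known years stay untouched
filled = plants.assign(DateIn=plants["DateIn"].fillna(group_mean))
```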

fill_missing_commyears()

Deprecated since version 0.5.0: This will be removed in 0.6.0. This function was renamed to fill_missing_commissioning_years.

fill_missing_decommissioning_years(config=None)

Function which sets/fills a column ‘DateOut’ with roughly estimated values for decommissioning years, based on the estimated lifetimes per Fueltype given in the config and corresponding commissioning years. Note that the latter is filled up using fill_missing_commissioning_years.
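The estimate described above amounts to commissioning year plus an assumed lifetime per fuel type. A minimal sketch follows; the lifetime values and the column name 'DateIn' are hypothetical, since the real lifetimes come from the config.

```python
import numpy as np
import pandas as pd

# Hypothetical lifetimes in years; the library reads these from its config
lifetimes = {"Hard Coal": 45, "Wind": 25}

plants = pd.DataFrame(
    {
        "Fueltype": ["Hard Coal", "Wind", "Wind"],
        "DateIn": [1980.0, 2010.0, 2015.0],  # assumed column name
        "DateOut": [np.nan, 2030.0, np.nan],
    }
)

# Rough estimate: commissioning year plus the lifetime of the fuel type
estimate = plants["DateIn"] + plants["Fueltype"].map(lifetimes)

# Existing 'DateOut' values take precedence over the estimate
plants["DateOut"] = plants["DateOut"].fillna(estimate)
```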

fill_missing_decommyears(config=None)

Deprecated since version 0.5.0: This will be removed in 0.6.0. This function was renamed to fill_missing_decommissioning_years.

fill_missing_duration()
get_name()
isin(matched, label=None)

Checks if a given dataframe is included in a matched dataframe.

Parameters:
  • df (pd.DataFrame) – The dataframe to be checked

  • matched (pd.DataFrame) – The matched dataframe

Returns:

True if all dataframes are included in the matched dataframe, False otherwise

Return type:

bool

lookup(keys=None, by='Country, Fueltype', exclude=None, unit='MW')

Returns a lookup table of the dataframe df with rounded numbers. Use the by argument to choose between a lookup by “Country”, by “Fueltype”, or by both.

Parameters:
  • df (pandas.DataFrame or list of pandas.DataFrames) – powerplant databases to be analysed. If multiple dataframes are passed, the lookup table will display them in a MultiIndex

  • by (string out of 'Country, Fueltype', 'Country' or 'Fueltype') – Define the type of lookup table you want to obtain.

  • keys (list of strings) – labels of the different datasets, only necessary if multiple dataframes are passed

  • exclude (list) – list of fueltypes to exclude from the analysis
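A lookup table of this kind is essentially a grouped capacity sum with rounded values. The sketch below uses plain pandas on invented data and assumes a 'Capacity' column in MW; it mirrors the by and exclude options conceptually and is not the library's code.

```python
import pandas as pd

plants = pd.DataFrame(
    {
        "Country": ["Germany", "Germany", "France", "France"],
        "Fueltype": ["Wind", "Hard Coal", "Nuclear", "Wind"],
        "Capacity": [1500.4, 820.6, 3200.2, 410.0],  # assumed to be in MW
    }
)

# exclude analogue: drop unwanted fuel types before aggregating
exclude = ["Hard Coal"]
selected = plants[~plants["Fueltype"].isin(exclude)]

# by='Country, Fueltype' analogue: two-level grouping with rounded sums
table = selected.groupby(["Country", "Fueltype"])["Capacity"].sum().round()

# by='Country' analogue
per_country = selected.groupby("Country")["Capacity"].sum().round()
```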

map_bus(buses)

Assign a ‘bus’ column to the dataframe based on a list of coordinates.

Parameters:
  • df (pd.DataFrame) – power plant list with coordinates ‘lat’ and ‘lon’

  • buses (pd.DataFrame) – bus list with coordinates ‘x’ and ‘y’

Return type:

DataFrame with an extra column ‘bus’ indicating the nearest bus.
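The nearest-bus assignment can be sketched with a brute-force distance computation. This sketch assumes that ‘x’ pairs with ‘lon’ and ‘y’ with ‘lat’, and it uses plain Euclidean distance in degrees; the library may use a different search structure or metric. All names and coordinates are invented.

```python
import numpy as np
import pandas as pd

plants = pd.DataFrame(
    {"Name": ["P1", "P2"], "lat": [52.5, 48.1], "lon": [13.4, 11.6]}
)
buses = pd.DataFrame(
    {"x": [13.0, 11.5, 2.3], "y": [52.0, 48.0, 48.9]},
    index=["bus_north", "bus_south", "bus_west"],
)

# Pairwise coordinate differences, shape (n_plants, n_buses)
dx = plants["lon"].to_numpy()[:, None] - buses["x"].to_numpy()[None, :]
dy = plants["lat"].to_numpy()[:, None] - buses["y"].to_numpy()[None, :]

# Position of the closest bus for every plant (squared distance suffices)
nearest = (dx**2 + dy**2).argmin(axis=1)

mapped = plants.assign(bus=buses.index[nearest].to_numpy())
```

For large bus lists a spatial index would scale better than the full distance matrix used here.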

map_country_bus(buses)

Assign a ‘bus’ column based on a list of coordinates and countries.

Parameters:
  • df (pd.DataFrame) – power plant list with coordinates ‘lat’, ‘lon’ and ‘Country’

  • buses (pd.DataFrame) – bus list with coordinates ‘x’, ‘y’, ‘country’

Return type:

DataFrame with an extra column ‘bus’ indicating the nearest bus.
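Restricting the search to buses in the plant's own country changes the result near borders. The following sketch assumes that ‘Country’ and ‘country’ use the same codes and again uses Euclidean distance in degrees; the helper `nearest_bus_in_country` is hypothetical.

```python
import pandas as pd

plants = pd.DataFrame(
    {
        "Name": ["P1", "P2"],
        "lat": [48.9, 48.1],
        "lon": [8.0, 11.6],
        "Country": ["FR", "DE"],
    }
)
buses = pd.DataFrame(
    {
        "x": [8.1, 11.5, 2.3],
        "y": [49.0, 48.0, 48.9],
        "country": ["DE", "DE", "FR"],
    },
    index=["de_west", "de_south", "fr_paris"],
)


def nearest_bus_in_country(plant):
    # Only buses located in the plant's own country are candidates
    candidates = buses[buses["country"] == plant["Country"]]
    dist2 = (candidates["x"] - plant["lon"]) ** 2 + (candidates["y"] - plant["lat"]) ** 2
    return dist2.idxmin()


mapped = plants.assign(bus=plants.apply(nearest_bus_in_country, axis=1))
```

Plant P1 lies closest to the German bus "de_west" but is assigned to "fr_paris" because of the country constraint.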

match_with(df, labels=None, config=None, reduced=True, **dukeargs)
plot_aggregated(by=['Country', 'Fueltype'], figsize=(12, 20), **kwargs)

Plotting function for fast inspection of the capacity distribution. Returns figure and axes of a matplotlib barplot.

Parameters:
  • by (list, default ['Country', 'Fueltype']) – Define the columns of the dataframe to be grouped on.

  • figsize (tuple, default (12,20)) – width and height of the figure

  • **kwargs – keyword arguments for matplotlib plotting

plot_map(scale=10.0, alpha=0.6, european_bounds=True, fillcontinents=False, legendscale=1, resolution=True, figsize=None, ncol=2, loc='upper left')
reduce_matched_dataframe(show_orig_names=False, config=None)

Reduce a matched dataframe to a unique set of columns. For each entry take the value of the most reliable data source included in that match.

Parameters:

df (pandas.DataFrame) – MultiIndex dataframe with the matched powerplants, as obtained from combined_dataframe() or match_multiple_datasets()
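Taking the value of the most reliable source can be sketched on a small frame with two-level columns (attribute, source). The source names, the reliability order and the helper `first_available` are hypothetical, and the library may apply other rules to individual columns.

```python
import numpy as np
import pandas as pd

# Hypothetical source names, most reliable first
reliability = ["SRC_A", "SRC_B"]

matched = pd.DataFrame(
    {
        ("Capacity", "SRC_A"): [100.0, np.nan],
        ("Capacity", "SRC_B"): [110.0, 250.0],
        ("Fueltype", "SRC_A"): ["Wind", None],
        ("Fueltype", "SRC_B"): ["Wind", "Hydro"],
    }
)


def first_available(block):
    # Start with the most reliable source, then fill gaps from the next ones
    result = block[reliability[0]]
    for source in reliability[1:]:
        result = result.combine_first(block[source])
    return result


attributes = matched.columns.get_level_values(0).unique()
reduced = pd.DataFrame({attr: first_available(matched[attr]) for attr in attributes})
```

The first row keeps the SRC_A capacity even though SRC_B disagrees; the second row falls back to SRC_B because SRC_A has no value.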

rescale_capacities_to_country_totals(fueltypes=None)

Returns an extra column ‘Scaled Capacity’ with the capacity scaled up or down to match the statistics of the ENTSO-E country totals. For every country, the statistics give the total capacity of each fueltype. The scaling factor is determined by the ratio between the aggregated capacity of the fueltype within each country and the ENTSO-E total for that fueltype and country.

Parameters:
  • df (pandas.DataFrame) – Data set that should be modified

  • fueltypes (str or list of strings) – fueltypes that should be scaled
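The scaling can be sketched as follows. The reference totals are invented and stand in for the country statistics, and the factor is taken as reference total divided by aggregated capacity so that the scaled values sum to the reference; this direction is an assumption of the sketch.

```python
import pandas as pd

plants = pd.DataFrame(
    {
        "Country": ["Germany", "Germany", "France"],
        "Fueltype": ["Wind", "Wind", "Wind"],
        "Capacity": [400.0, 600.0, 500.0],
    }
)

# Hypothetical reference totals per country and fuel type
stats = pd.DataFrame(
    {
        "Country": ["Germany", "France"],
        "Fueltype": ["Wind", "Wind"],
        "Reference": [1200.0, 450.0],
    }
)

keys = ["Country", "Fueltype"]
merged = plants.merge(stats, on=keys, how="left")

# Aggregated capacity of each (country, fueltype) group, repeated per row
aggregated = merged.groupby(keys)["Capacity"].transform("sum")

# Every plant in a group is scaled by the same factor
merged["Scaled Capacity"] = merged["Capacity"] * merged["Reference"] / aggregated
```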

scale_to_net_capacities(is_gross=True, catch_all=True)
select_by_projectID(projectID, dataset_name=None)

Convenience function to select data by its projectID.

set_name(name)
set_uncommon_fueltypes_to_other(fillna_other=True, config=None, **kwargs)

Replace uncommon fueltype specifications with ‘Other’. This helps to compare data sources with the capacity statistics given by powerplantmatching.data.Capacity_stats().

Parameters:
  • df (pd.DataFrame) – DataFrame whose ‘Fueltype’ values are replaced

  • fillna_other (Boolean, default True) – Whether to replace NaN values in ‘Fueltype’ with ‘Other’

  • fueltypes (list) – list of replaced fueltypes, defaults to [‘Bioenergy’, ‘Geothermal’, ‘Mixed fuel types’, ‘Electro-mechanical’, ‘Hydrogen Storage’]
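The replacement itself can be sketched in a few lines of pandas, using the default list quoted above. The sample data is invented and the sketch is an illustration of the behaviour described, not the library's code.

```python
import numpy as np
import pandas as pd

# Default list from the parameter description above
uncommon = [
    "Bioenergy",
    "Geothermal",
    "Mixed fuel types",
    "Electro-mechanical",
    "Hydrogen Storage",
]

plants = pd.DataFrame({"Fueltype": ["Wind", "Geothermal", np.nan, "Bioenergy"]})

# Replace listed fuel types with 'Other'; NaN is left untouched at this step
replaced = plants["Fueltype"].mask(plants["Fueltype"].isin(uncommon), "Other")

# fillna_other=True analogue: missing fuel types also become 'Other'
replaced_filled = replaced.fillna("Other")
```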

to_pypsa_names()

Rename the columns of the powerplant data according to the convention in PyPSA.

Parameters:

df (pandas.DataFrame) – powerplant data

Returns:

pandas.DataFrame – Column renamed dataframe