Utility Modules¶

utils ¶

Utility functions for checking data completeness and supporting other functions

Functions:

breakdown_matches –

Function to inspect grouped and matched entries of a matched dataframe.
config_filter –

Convenience function to filter data source according to the config.yaml
convert_alpha2_to_country –
convert_country_to_alpha2 –
convert_to_short_name –
correct_manually –

Update powerplant data based on stored corrections in powerplantmatching/data/in/manual_corrections.csv.
country_alpha2 –

Convenience function for converting country name into alpha 2 codes
fill_geoposition –

Fill missing 'lat' and 'lon' values. Uses geoparsing with the value given in 'Name'.
fun –

Helper function for multiprocessing in classes/functions
get_name –

Helper function to associate dataframe with a name.
get_raw_file –
lookup –

Returns a lookup table of the dataframe df with rounded numbers.
parmap –

Parallel mapping function.
parse_Geoposition –

Nominatim request for the Geoposition of a specific location in a country.
parse_string_to_dict –

Convenience function to convert string of dict to dict type for specified columns.
read_csv_if_string –

Convenience function to import powerplant data source if a string is given.
restore_blocks –

Restore blocks of powerplants from a matched dataframe.
select_by_projectID –

Convenience function to select data by its projectID
set_column_name –

Helper function to associate dataframe with a name.
set_uncommon_fueltypes_to_other –

Replace uncommon fueltype specifications with 'Other'.
to_categorical_columns –

Helper function to set datatype of columns 'Fueltype', 'Country', 'Set', 'File', 'Technology' to categorical.
to_dict_if_string –

Convenience function to ensure dict-like output
to_list_if_other –

Convenience function to ensure list-like output
update_saved_matches_for_ –

Update your saved matches for a single source.

Attributes:

cc –
country_map –

cc `module-attribute` ¶

cc = coco.CountryConverter()

country_map `module-attribute` ¶

country_map = pd.read_csv(_package_data('country_codes.csv')).replace({'name': {'Czechia': 'Czech Republic'}})

breakdown_matches ¶

breakdown_matches(df)

Function to inspect grouped and matched entries of a matched dataframe. Breaks down to all ingoing data on detailed level.

Parameters:

df (DataFrame) –

Matched data with a non-empty projectID column. Keys of projectID must be specified in powerplantmatching.data.data_config

config_filter ¶

config_filter(df, config)

Convenience function to filter data source according to the config.yaml file. Individual query filters are applied if argument 'name' is given.

Parameters:

df (DataFrame) –

Data to be filtered
name (str, default: None ) –

Name of the data source to identify query in the config.yaml file
config (dict, default: None ) –

Configuration overrides varying from the config.yaml file

convert_alpha2_to_country ¶

convert_alpha2_to_country(df)

convert_country_to_alpha2 ¶

convert_country_to_alpha2(df)

convert_to_short_name ¶

convert_to_short_name(df)

correct_manually ¶

correct_manually(df, name, config=None)

Update powerplant data based on stored corrections in powerplantmatching/data/in/manual_corrections.csv. Specify the name of the data by the second argument.

Parameters:

df (DataFrame) –

Powerplant data
name (str) –

Name of the data source, should be in columns of manual_corrections.csv

country_alpha2 ¶

country_alpha2(country)

Convenience function for converting country name into alpha 2 codes

fill_geoposition ¶

fill_geoposition(df, use_saved_locations=True, saved_only=True, config=None)

Fill missing 'lat' and 'lon' values. Uses geoparsing with the value given in 'Name', limits the search through value in 'Country'. df must contain 'Name', 'lat', 'lon' and 'Country' as columns.

Parameters:

df (DataFrame) –

DataFrame of power plants
use_saved_locations (Boolean, default: True ) –

Whether to first compare with cached results in powerplantmatching/data/parsed_locations.csv
saved_only (Boolean, default: True ) –

Whether to only add geo-positions which are stored at pm.core._package_data("parsed_locations.csv")

fun ¶

fun(f, q_in, q_out)

Helper function for multiprocessing in classes/functions

get_name ¶

get_name(df)

Helper function to associate dataframe with a name. This is done with the columns-axis name, since pd.DataFrame objects do not have a name attribute.

get_raw_file ¶

get_raw_file(name, update=False, config=None, skip_retrieve=False)

lookup ¶

lookup(df, keys=None, by='Country, Fueltype', exclude=None, unit='MW')

Returns a lookup table of the dataframe df with rounded numbers. Use the by argument to choose between a lookup by "Country", by "Fueltype", or by both.

Parameters:

df (pandas.DataFrame or list of pandas.DataFrame) –

powerplant databases to be analysed. If multiple dataframes are passed, the lookup table will display them in a MultiIndex
by (string out of 'Country, Fueltype', 'Country' or 'Fueltype', default: 'Country, Fueltype' ) –

Define the type of lookup table you want to obtain.
keys (list of strings, default: None ) –

labels of the different datasets, only necessary if multiple dataframes passed
exclude (list, default: None ) –

list of fueltypes to exclude from the analysis
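
The grouping idea behind such a lookup table can be sketched in plain Python. This is only an illustration under assumptions: the helper name lookup_sketch, the list-of-dicts input, and the sample capacities are made up, and unit conversion, keys and exclude are left out.

```python
from collections import defaultdict


def lookup_sketch(plants, by="Country, Fueltype"):
    """Sum 'Capacity' per group and round to whole numbers.

    Hypothetical helper: plants is a list of dicts standing in for
    DataFrame rows.
    """
    keys = [k.strip() for k in by.split(",")]
    table = defaultdict(float)
    for p in plants:
        table[tuple(p[k] for k in keys)] += p["Capacity"]
    return {group: round(total) for group, total in table.items()}


# Made-up sample data for illustration only.
plants = [
    {"Country": "France", "Fueltype": "Nuclear", "Capacity": 900.4},
    {"Country": "France", "Fueltype": "Nuclear", "Capacity": 1300.2},
    {"Country": "Spain", "Fueltype": "Hydro", "Capacity": 50.7},
]
by_both = lookup_sketch(plants)
by_country = lookup_sketch(plants, by="Country")
```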

parmap ¶

parmap(f, arg_list, config=None, threads=None)

Parallel mapping function. Use this function to map function f onto the arguments in arg_list in parallel. The maximum number of parallel threads is taken from config.yaml:parallel_duke_processes.

Parameters:

f (function) –

python function with one argument
arg_list (list) –

list of arguments mapped to f
config (dict, default: None ) –

configuration dictionary
threads (int, default: None ) –

number of parallel threads
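
As a rough illustration of the parallel-map idea, the sketch below uses the standard library's ThreadPoolExecutor. This is a swapped-in technique, not necessarily how parmap is implemented; the helper name parmap_sketch is hypothetical and the config lookup is omitted.

```python
from concurrent.futures import ThreadPoolExecutor


def parmap_sketch(f, arg_list, threads=None):
    """Map f over arg_list in parallel, preserving input order.

    Hypothetical stand-in for illustration; threads=None lets the
    executor pick its default worker count.
    """
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # Executor.map yields results in the order of arg_list.
        return list(pool.map(f, arg_list))


def square(x):
    return x * x


squares = parmap_sketch(square, [1, 2, 3, 4], threads=2)
```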

parse_Geoposition ¶

parse_Geoposition(location, zipcode='', country='', use_saved_locations=False, saved_only=False)

Nominatim request for the Geoposition of a specific location in a country. Returns a tuple (latitude, longitude, country) if the request was successful, returns np.nan otherwise.

ToDo: There exist further online sources for lat/long data which could be used if this one fails, e.g.

- Google Geocoding API
- Yahoo! Placefinder
- https://askgeo.com (??)

Parameters:

location (string) –

description of the location, can be city, area etc.
country (string, default: '' ) –

name of the country which will be used as a bounding area
use_saved_locations (Boolean, default: False ) –

Whether to first compare with cached results in powerplantmatching/data/parsed_locations.csv

parse_string_to_dict ¶

parse_string_to_dict(df, cols)

Convenience function to convert string of dict to dict type for specified columns.

Parameters:

df (DataFrame) –

DataFrame on which to apply the parsing
cols ((str, list)) –

Column(s) to be parsed to dict type

Returns:

DataFrame –

DataFrame with specified columns parsed to dict type
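
A minimal sketch of the string-to-dict conversion, assuming the strings are Python dict literals. It works on a list of plain dicts rather than a DataFrame, and the helper name parse_column_to_dict and the sample projectID value are hypothetical.

```python
import ast


def parse_column_to_dict(rows, cols):
    """Return copies of rows with the given columns parsed from str to dict.

    rows is a list of plain dicts standing in for DataFrame rows
    (an assumption for this sketch).
    """
    if isinstance(cols, str):
        cols = [cols]
    parsed = []
    for row in rows:
        new_row = dict(row)
        for col in cols:
            value = new_row.get(col)
            if isinstance(value, str):
                # literal_eval only accepts Python literals, so it is
                # safer than eval for strings read from a file.
                new_row[col] = ast.literal_eval(value)
        parsed.append(new_row)
    return parsed


# Made-up row for illustration only.
rows = [{"Name": "Plant A", "projectID": "{'ESE': ['p1'], 'OPSD': ['p7']}"}]
result = parse_column_to_dict(rows, "projectID")
```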

read_csv_if_string ¶

read_csv_if_string(df)

Convenience function to import powerplant data source if a string is given.

restore_blocks ¶

restore_blocks(df, mode=2, config=None)

Restore blocks of powerplants from a matched dataframe.

This function breaks down all matches. For each match separately it selects blocks from only one input data source. For this selection the following modes are available:

1. Select the source with the largest number of blocks in the match

2. Select the source with the highest reliability score

Parameters:

df (DataFrame) –

Matched data with a non-empty projectID column. Keys of projectID must be specified in powerplantmatching.data.data_config
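
The two selection modes can be illustrated with a small sketch. The function pick_source, the source names and the reliability scores are hypothetical; in the library the scores would come from the configuration.

```python
def pick_source(match, mode=2, reliability=None):
    """Choose the single source whose blocks represent a match.

    match maps source name -> list of block identifiers.
    reliability maps source name -> score (only used for mode 2).
    """
    if mode == 1:
        # Mode 1: the source contributing the most blocks.
        return max(match, key=lambda src: len(match[src]))
    if mode == 2:
        # Mode 2: the source with the highest reliability score.
        return max(match, key=lambda src: reliability[src])
    raise ValueError(f"unknown mode: {mode}")


# Made-up match and scores for illustration only.
match = {"SRC_A": ["a1", "a2", "a3"], "SRC_B": ["b1"]}
scores = {"SRC_A": 2, "SRC_B": 5}
by_blocks = pick_source(match, mode=1)
by_score = pick_source(match, mode=2, reliability=scores)
```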

select_by_projectID ¶

select_by_projectID(df, projectID, dataset_name=None)

Convenience function to select data by its projectID

set_column_name ¶

set_column_name(df, name)

Helper function to associate dataframe with a name. This is done with the columns-axis name, since pd.DataFrame objects do not have a name attribute.

set_uncommon_fueltypes_to_other ¶

set_uncommon_fueltypes_to_other(df, fillna_other=True, config=None, **kwargs)

Replace uncommon fueltype specifications with 'Other'. This helps to compare data sources with capacity statistics given by powerplantmatching.data.Capacity_stats().

Parameters:

df (DataFrame) –

DataFrame in which 'Fueltype' values are replaced
fillna_other (Boolean, default: True ) –

Whether to replace NaN values in 'Fueltype' with 'Other'
fueltypes (list) –

list of replaced fueltypes, defaults to ['Mixed fuel types', 'Electro-mechanical', 'Hydrogen Storage']
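
A list-based sketch of the replacement logic, using the default fueltypes named above. The helper fueltypes_to_other is hypothetical, and None stands in for NaN.

```python
DEFAULT_UNCOMMON = ["Mixed fuel types", "Electro-mechanical", "Hydrogen Storage"]


def fueltypes_to_other(fueltypes, uncommon=None, fillna_other=True):
    """Map uncommon (and optionally missing) fueltypes to 'Other'."""
    uncommon = DEFAULT_UNCOMMON if uncommon is None else uncommon
    result = []
    for fueltype in fueltypes:
        if fueltype is None:
            # None stands in for NaN in this list-based sketch.
            result.append("Other" if fillna_other else None)
        elif fueltype in uncommon:
            result.append("Other")
        else:
            result.append(fueltype)
    return result


cleaned = fueltypes_to_other(["Hard Coal", "Hydrogen Storage", None])
```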

to_categorical_columns ¶

to_categorical_columns(df)

Helper function to set datatype of columns 'Fueltype', 'Country', 'Set', 'File', 'Technology' to categorical.

to_dict_if_string ¶

to_dict_if_string(s)

Convenience function to ensure dict-like output

to_list_if_other ¶

to_list_if_other(obj)

Convenience function to ensure list-like output
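
One plausible reading of "ensure list-like output" is to wrap anything that is not already a list. The helper ensure_list below is a hypothetical sketch of that idea, not the library's code.

```python
def ensure_list(obj):
    """Return obj unchanged if it is a list, else wrap it in one."""
    return obj if isinstance(obj, list) else [obj]


single = ensure_list("Hydro")  # a string is wrapped, not split into characters
already = ensure_list(["Hydro", "Wind"])
```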

update_saved_matches_for_ ¶

update_saved_matches_for_(name)

Update your saved matches for a single source. This is very helpful if you modified/updated a data source and do not want to run the whole matching again.

Example

Assume data source 'ESE' changed a little:

pm.utils.update_saved_matches_for_('ESE')
...
...
pm.collection.matched_data(update=True)

Now the matched_data is updated with the modified version of ESE.

export ¶

Functions:

fueltype_to_abbrev –

Return the fueltype-specific abbreviation.
map_bus –

Assign a 'bus' column to the dataframe based on a list of coordinates.
map_country_bus –

Assign a 'bus' column based on a list of coordinates and countries.
store_open_dataset –
timestype_to_life –

Returns the timestype-specific technical lifetime.
to_TIMES –

Transform a given dataset into the TIMES format and export as .xlsx.
to_pypsa_names –

Rename the columns of the powerplant data according to the convention in PyPSA.
to_pypsa_network –

Export a powerplant dataframe to a pypsa.Network().

Attributes:

cget –

cget `module-attribute` ¶

cget = pycountry.countries.get

fueltype_to_abbrev ¶

fueltype_to_abbrev()

Return the fueltype-specific abbreviation.

map_bus ¶

map_bus(df, buses)

Assign a 'bus' column to the dataframe based on a list of coordinates.

Parameters:

df (DataFrame) –

power plant list with coordinates 'lat' and 'lon'
buses (DataFrame) –

bus list with coordinates 'x' and 'y'

Returns:

DataFrame with an extra column 'bus' indicating the nearest bus. –
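
Nearest-bus assignment can be sketched with a brute-force search. This is a simplification under assumptions: plain Euclidean distance in degrees, x taken as longitude and y as latitude, and a hypothetical helper nearest_bus working on dicts rather than DataFrames.

```python
import math


def nearest_bus(plants, buses):
    """Attach the name of the nearest bus to every plant.

    plants: list of dicts with 'lat' and 'lon'.
    buses: dict mapping bus name -> (x, y), with x = lon and y = lat
    (this axis convention is an assumption of the sketch).
    """
    mapped = []
    for plant in plants:
        bus = min(
            buses,
            key=lambda name: math.hypot(
                buses[name][0] - plant["lon"], buses[name][1] - plant["lat"]
            ),
        )
        mapped.append({**plant, "bus": bus})
    return mapped


# Made-up coordinates for illustration only.
buses = {"north": (10.0, 54.0), "south": (11.5, 48.1)}
plants = [
    {"Name": "P1", "lat": 53.5, "lon": 9.9},
    {"Name": "P2", "lat": 48.0, "lon": 11.0},
]
mapped = nearest_bus(plants, buses)
```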

map_country_bus ¶

map_country_bus(df, buses)

Assign a 'bus' column based on a list of coordinates and countries.

Parameters:

df (DataFrame) –

power plant list with coordinates 'lat', 'lon' and 'Country'
buses (DataFrame) –

bus list with coordinates 'x', 'y', 'country'

Returns:

DataFrame with an extra column 'bus' indicating the nearest bus. –

store_open_dataset ¶

store_open_dataset()

timestype_to_life ¶

timestype_to_life()

Returns the timestype-specific technical lifetime.

to_TIMES ¶

to_TIMES(df=None, use_scaled_capacity=False, baseyear=2015)

Transform a given dataset into the TIMES format and export as .xlsx.

to_pypsa_names ¶

to_pypsa_names(df)

Rename the columns of the powerplant data according to the convention in PyPSA.

Parameters:

df (DataFrame) –

powerplant data

Returns:

DataFrame –

Column renamed dataframe

to_pypsa_network ¶

to_pypsa_network(df, network, buslist=None)

Export a powerplant dataframe to a pypsa.Network(), specify specific buses to allocate the plants (buslist).

heuristics ¶

Functions to modify and adjust power plant datasets

Functions:

PLZ_to_LatLon_map –
aggregate_VRE_by_commissioning_year –

Aggregate the vast number of VRE units (e.g. from data.OPSD_VRE()) to one cohort per commissioning year.
aggregate_VRE_by_commyear –
derive_vintage_cohorts_from_statistics –

This function assumes an age-distribution for given capacity statistics.
extend_by_VRE –

Extends a given reduced dataframe by externally given VREs.
extend_by_non_matched –

Returns the matched dataframe with additional entries of non-matched powerplants of a reliable source.
fill_missing_commissioning_years –

Fills the empty commissioning years with averages.
fill_missing_commyears –
fill_missing_decommissioning_years –

Function which sets/fills a column 'DateOut' with roughly estimated decommissioning years.
fill_missing_decommyears –
fill_missing_duration –
gross_to_net_factors –
isin –

Checks if a given dataframe is included in a matched dataframe.
remove_oversea_areas –

Remove plants outside continental Europe such as the Canary Islands etc.
rescale_capacities_to_country_totals –

Returns an extra column 'Scaled Capacity' with an up- or down-scaled capacity.
scale_to_net_capacities –
set_denmark_region_id –

Used to set the Region column to DKE/DKW (East/West) for electricity models
set_known_retire_years –

Integrate known retire years, e.g. for German nuclear plants with fixed decommissioning dates.

PLZ_to_LatLon_map ¶

PLZ_to_LatLon_map()

aggregate_VRE_by_commissioning_year ¶

aggregate_VRE_by_commissioning_year(df, target_fueltypes=None, agg_geo_by=None)

Aggregate the vast number of VRE units (e.g. from data.OPSD_VRE()) to one specific (Fueltype + Technology) cohort per commissioning year.

Parameters:

df (DataFrame) –

DataFrame containing the data to aggregate
target_fueltypes (list, default: None ) –

list of fueltypes to be aggregated (Others are cut!)
agg_geo_by (str, default: None ) –

How to deal with lat/lon positions. Allowed:

NoneType : Do not show geoposition at all
'mean' : Average geoposition
'wm' : Average geoposition weighted by capacity
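
The three geoposition options can be illustrated as follows. The helper aggregate_position and the sample units are hypothetical; the sketch only shows how a plain mean differs from a capacity-weighted mean.

```python
def aggregate_position(units, agg_geo_by=None):
    """Aggregate lat/lon of units according to agg_geo_by.

    units: list of dicts with 'lat', 'lon' and 'Capacity'.
    Returns (lat, lon), or None when positions are dropped.
    """
    if agg_geo_by is None:
        return None
    if agg_geo_by == "mean":
        n = len(units)
        return (sum(u["lat"] for u in units) / n, sum(u["lon"] for u in units) / n)
    if agg_geo_by == "wm":
        total = sum(u["Capacity"] for u in units)
        lat = sum(u["lat"] * u["Capacity"] for u in units) / total
        lon = sum(u["lon"] * u["Capacity"] for u in units) / total
        return (lat, lon)
    raise ValueError(f"unsupported agg_geo_by: {agg_geo_by!r}")


# Made-up units for illustration only.
units = [
    {"lat": 50.0, "lon": 8.0, "Capacity": 1.0},
    {"lat": 54.0, "lon": 12.0, "Capacity": 3.0},
]
mean_pos = aggregate_position(units, "mean")
weighted_pos = aggregate_position(units, "wm")
```

The weighted position is pulled toward the larger unit, which is the point of the 'wm' option.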

aggregate_VRE_by_commyear ¶

aggregate_VRE_by_commyear(df, config=None)

derive_vintage_cohorts_from_statistics ¶

derive_vintage_cohorts_from_statistics(df, base_year=2015, config=None)

This function assumes an age-distribution for given capacity statistics and returns a df containing how much capacity has been built in every year.

extend_by_VRE ¶

extend_by_VRE(df, config=None, base_year=2017, prune_beyond=True)

Extends a given reduced dataframe by externally given VREs.

Parameters:

df (DataFrame) –

The dataframe to be extended
base_year (int, default: 2017 ) –

Needed for deriving cohorts from IRENA's capacity statistics

Returns:

df ( DataFrame ) –

Extended dataframe

extend_by_non_matched ¶

extend_by_non_matched(df, extend_by, label=None, query=None, aggregate_added_data=True, config=None, **aggkwargs)

Returns the matched dataframe with additional entries of non-matched powerplants of a reliable source.

Parameters:

df (DataFrame) –

Already matched dataset which should be extended
extend_by (DataFrame | str) –

Database which is partially included in the matched dataset, but which should be included totally. If str is passed, it will be used to call the corresponding data from data.py
label (str, default: None ) –

Column name of the additional database within the matched dataset, this string is used if the columns of the additional database do not correspond to the ones of the dataset

fill_missing_commissioning_years ¶

fill_missing_commissioning_years(df)

Fills the empty commissioning years with averages.

fill_missing_commyears ¶

fill_missing_commyears(df)

fill_missing_decommissioning_years ¶

fill_missing_decommissioning_years(df, config=None)

Function which sets/fills a column 'DateOut' with roughly estimated values for decommissioning years, based on the estimated lifetimes per Fueltype given in the config and corresponding commissioning years. Note that the latter is filled up using fill_missing_commissioning_years.
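
The estimate amounts to commissioning year plus an assumed lifetime. In the sketch below, the helper fill_date_out, the 'DateIn' column name and the lifetime values are assumptions for illustration; real lifetimes come from the config.

```python
def fill_date_out(plants, lifetimes):
    """Fill missing 'DateOut' as 'DateIn' plus the fueltype lifetime."""
    filled = []
    for plant in plants:
        plant = dict(plant)  # copy so the input is left untouched
        if plant.get("DateOut") is None and plant.get("DateIn") is not None:
            plant["DateOut"] = plant["DateIn"] + lifetimes[plant["Fueltype"]]
        filled.append(plant)
    return filled


# Placeholder lifetimes in years; real values come from the config.
lifetimes = {"Nuclear": 50, "Wind": 25}
plants = [
    {"Fueltype": "Nuclear", "DateIn": 1985, "DateOut": None},
    {"Fueltype": "Wind", "DateIn": 2010, "DateOut": 2030},
]
filled = fill_date_out(plants, lifetimes)
```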

fill_missing_decommyears ¶

fill_missing_decommyears(df, config=None)

fill_missing_duration ¶

fill_missing_duration(df)

gross_to_net_factors ¶

gross_to_net_factors(reference='opsd', aggfunc='median', return_entire_data=False)

isin ¶

isin(df, matched, label=None)

Checks if a given dataframe is included in a matched dataframe.

Parameters:

df (DataFrame) –

The dataframe to be checked
matched (DataFrame) –

The matched dataframe

Returns:

bool –

True if all dataframes are included in the matched dataframe, False otherwise

remove_oversea_areas ¶

remove_oversea_areas(df, lat=[36, 72], lon=[-10.6, 31])

Remove plants outside continental Europe such as the Canary Islands etc.
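
The default lat and lon arguments suggest a simple bounding-box filter. A minimal sketch of that idea follows; the helper within_bounds is hypothetical, and treating the bounds as inclusive is an assumption.

```python
def within_bounds(plants, lat=(36, 72), lon=(-10.6, 31)):
    """Keep only plants whose coordinates fall inside the bounding box."""
    return [
        p
        for p in plants
        if lat[0] <= p["lat"] <= lat[1] and lon[0] <= p["lon"] <= lon[1]
    ]


# Made-up plants for illustration only.
plants = [
    {"Name": "mainland", "lat": 48.1, "lon": 11.6},
    {"Name": "island", "lat": 28.1, "lon": -15.4},  # roughly the Canary Islands
]
kept = within_bounds(plants)
```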

rescale_capacities_to_country_totals ¶

rescale_capacities_to_country_totals(df, fueltypes=None)

Returns an extra column 'Scaled Capacity' with an up- or down-scaled capacity in order to match the statistics of the ENTSO-E country totals. For every country, the information about the total capacity of each fueltype is given. The scaling factor is determined by the ratio of the aggregated capacity of the fueltype within each country and the ENTSO-E statistics about the fueltype capacity total within each country.

Parameters:

df (DataFrame) –

Data set that should be modified
fueltypes (str or list of strings, default: None ) –

fueltypes that should be scaled
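
The scaling can be sketched as one factor per (Country, Fueltype) group. The helper scale_capacities, the sample plants and the reference total are made up; the sketch assumes the factor is the reference total divided by the aggregated capacity, so that the scaled capacities sum to the reference.

```python
from collections import defaultdict


def scale_capacities(plants, stats):
    """Add 'Scaled Capacity' so group sums match the reference totals.

    stats maps (Country, Fueltype) -> reference total capacity.
    Groups without a reference total keep their original capacity.
    """
    totals = defaultdict(float)
    for p in plants:
        totals[(p["Country"], p["Fueltype"])] += p["Capacity"]

    scaled = []
    for p in plants:
        key = (p["Country"], p["Fueltype"])
        factor = stats[key] / totals[key] if key in stats else 1.0
        scaled.append({**p, "Scaled Capacity": p["Capacity"] * factor})
    return scaled


# Made-up plants and reference total for illustration only.
plants = [
    {"Country": "Germany", "Fueltype": "Lignite", "Capacity": 300.0},
    {"Country": "Germany", "Fueltype": "Lignite", "Capacity": 100.0},
]
stats = {("Germany", "Lignite"): 500.0}
scaled = scale_capacities(plants, stats)
```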

scale_to_net_capacities ¶

scale_to_net_capacities(df, is_gross=True, catch_all=True)

set_denmark_region_id ¶

set_denmark_region_id(df)

Used to set the Region column to DKE/DKW (East/West) for electricity models based on lat,lon-coordinates and a heuristic for unknowns.

set_known_retire_years ¶

set_known_retire_years(df)

Integrate known retire years, e.g. for German nuclear plants with fixed decommissioning dates.

plot ¶

Functions:

boxplot_gross_to_net –
boxplot_matchcount –

Makes a boxplot for the capacities grouped by the number of matches.
country_totals_hbar –
draw_basemap –
factor_comparison –
fueltype_and_country_totals_bar –
fueltype_stats –
fueltype_totals_bar –
gather_nrows_ncols –

Derives [nrows, ncols] based on x plots, so that the subplot grid looks nice.
make_handler_map_to_scale_circles_as_in –
make_legend_circles_for –
powerplant_map –

Attributes:

cartopy_present –

cartopy_present `module-attribute` ¶

cartopy_present = True

boxplot_gross_to_net ¶

boxplot_gross_to_net(axes_style='darkgrid', **kwargs)

boxplot_matchcount ¶

boxplot_matchcount(df)

Makes a boxplot for the capacities grouped by the number of matches. Attention: Currently only works for the full dataset with original names as the last columns.

country_totals_hbar ¶

country_totals_hbar(dfs, keys=None, exclude_fueltypes=['Solar', 'Wind'], figsize=(7, 5), unit='GW', axes_style='whitegrid')

draw_basemap ¶

draw_basemap(resolution=True, ax=None, country_linewidth=0.3, coast_linewidth=0.4, zorder=None, fillcontinents=True, **kwds)

factor_comparison ¶

factor_comparison(dfs, keys=None, figsize=(12, 9))

fueltype_and_country_totals_bar ¶

fueltype_and_country_totals_bar(dfs, keys=None, figsize=(18, 8))

fueltype_stats ¶

fueltype_stats(df)

fueltype_totals_bar ¶

fueltype_totals_bar(dfs, keys=None, figsize=(7, 4), unit='GW', last_as_marker=False, axes_style='whitegrid', exclude=[], **kwargs)

gather_nrows_ncols ¶

gather_nrows_ncols(x, orientation='landscape')

Derives [nrows, ncols] based on x plots, so that the subplot grid looks nice.

Parameters:

x (int) –

Number of subplots, in the range [0, 42]
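
One common way to derive such a grid is to aim for a near-square layout. The sketch below is an assumption about the general approach, not the library's actual rule; the helper grid_shape is hypothetical.

```python
import math


def grid_shape(x, orientation="landscape"):
    """Return [nrows, ncols] of a near-square grid holding x subplots."""
    if x == 0:
        return [0, 0]
    long_side = math.ceil(math.sqrt(x))
    short_side = math.ceil(x / long_side)
    if orientation == "landscape":
        return [short_side, long_side]  # at least as wide as tall
    return [long_side, short_side]  # portrait: at least as tall as wide


shape = grid_shape(6)
```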

make_handler_map_to_scale_circles_as_in ¶

make_handler_map_to_scale_circles_as_in(ax, dont_resize_actively=False)

make_legend_circles_for ¶

make_legend_circles_for(sizes, scale=1.0, **kw)

powerplant_map ¶

powerplant_map(df, scale=20.0, alpha=0.6, european_bounds=True, fillcontinents=False, legendscale=1, resolution=True, figsize=None, ncol=2, loc='upper left')
