Data Processing Modules

powerplantmatching.data Module

Collection of power plant databases and statistical data.


BEYONDCOAL([raw, update, config])

Importer for the BEYOND COAL database.

BNETZA([raw, update, config, header, ...])

Importer for the database put together by Germany's 'Federal Network Agency' (Bundesnetzagentur, BNetzA).

CARMA([raw, update, config])

Importer for the Carma database.

Capacity_stats([raw, config, update, ...])

Standardize the aggregated capacity statistics provided by the ENTSO-E.

ENTSOE([raw, update, config, entsoe_token, ...])

Importer for the list of installed generators provided by the ENTSO-E Transparency Project.

ENTSOE_EIC([raw, update, config, entsoe_token])

Importer for the meta data given for each ENTSOE entry.

EXTERNAL_DATABASE([raw, update, config])

Importer for external custom databases.

Parameters:
raw (bool, default False): Whether to return the original dataset.
update (bool, default True): Whether to update the data from the url (unused).
config (dict, default None): Custom configuration, e.g. powerplantmatching.config.get_config(target_countries='Italy'). Defaults to powerplantmatching.config.get_config().

GBPT([raw, update, config])

Importer for the global bioenergy powerplant tracker from global energy monitor.

GCPT([raw, update, config])

Importer for the global coal powerplant tracker from global energy monitor.

GEM([raw, update, config])

Get the combined dataset of all GEM (Global Energy Monitor) datasets.

GEM_GGPT(*args, **kwargs)

GEO([raw, update, config])

Importer for the GEO database.

GGPT([raw, update, config])

Importer for the global gas powerplant tracker from global energy monitor.

GGTPT([raw, update, config])

Importer for the global geothermal powerplant tracker from global energy monitor.

GHPT([raw, update, config])

Importer for the global hydro powerplant tracker from global energy monitor.

GNPT([raw, update, config])

Importer for the global nuclear energy powerplant tracker from global energy monitor.

GPD([raw, update, config, filter_other_dbs])

Importer for the Global Power Plant Database.

GSPT([raw, update, config])

Importer for the global solar powerplant tracker from global energy monitor.

GWPT([raw, update, config])

Importer for the global wind powerplant tracker from global energy monitor.

IRENASTAT([raw, update, config])

Importer for the IRENASTAT renewable capacity statistics.


This data is not yet available.

JRC([raw, update, config])

Importer for the JRC Hydro-power plants database, retrieved from energy-modelling-toolkit/hydro-power-database.

MASTR([raw, update, config])

Get the Marktstammdatenregister (MaStR) dataset.

OPSD([raw, update, statusDE, config])

Importer for the OPSD (Open Power Systems Data) database.

OPSD_VRE([raw, update, config])

Importer for the OPSD (Open Power Systems Data) renewables (VRE) database.

OPSD_VRE_country(country[, raw, update, config])

Get country-specific data from OPSD for renewables, if available.

UBA([raw, update, config, header, ...])

Importer for the UBA Database.

WEPP([raw, config])

Importer for the standardized WEPP (Platts, World Electric Power Plants Database).

WIKIPEDIA([raw, update, config])

Importer for the WIKIPEDIA nuclear power plant database.

cget(*args, **kw)

clean_name(df[, config])

Clean the name of a power plant list.

config_filter(df, config)

Convenience function to filter data source according to the config.yaml file.


correct_manually(df, name[, config])

Update powerplant data based on stored corrections in powerplantmatching/data/in/manual_corrections.csv.

deprecated([deprecated_in, removed_in, ...])

Decorate a function to signify its deprecation.

gather_fueltype_info(df[, search_col, config])

Parse a set of columns for distinct fueltype specifications.

gather_set_info(df[, search_col, config])

Parse a set of columns for distinct Set specifications.

gather_specifications(df[, target_columns, ...])

Parse columns to collect representative keys.


Import the default configuration file and update custom settings.

get_raw_file(name[, update, config, ...])

scale_to_net_capacities(df[, is_gross, ...])

set_column_name(df, name)

Helper function to associate dataframe with a name.

powerplantmatching.cleaning Module

Functions for vertically cleaning a dataset.


aggregate_units(df[, dataset_name, ...])

Vertical cleaning of the database.

clean_name(df[, config])

Clean the name of a power plant list.

clean_powerplantname(df[, config])

clean_technology(df[, generalize_hydros])

Clean the 'Technology' by condensing down the value into one claim.

cliques(df, dataduplicates)

Locate cliques of units which are determined to belong to the same powerplant.


Convert a column name to the key that is used to specify the target values in the config.

deprecated([deprecated_in, removed_in, ...])

Decorate a function to signify its deprecation.

duke(datasets[, labels, singlematch, ...])

Run duke in different modes (Deduplication or Record Linkage Mode) to either locate duplicates in one database or find the similar entries in two different datasets.

gather_and_replace(df, mapping)

Search for patterns in multiple columns and return a series of representative keys.

gather_fueltype_info(df[, search_col, config])

Parse a set of columns for distinct fueltype specifications.

gather_set_info(df[, search_col, config])

Parse a set of columns for distinct Set specifications.

gather_specifications(df[, target_columns, ...])

Parse columns to collect representative keys.

gather_technology_info(df[, search_col, config])

Parse a set of columns for distinct technology specifications.


Import the default configuration file and update custom settings.


Helper function to associate dataframe with a name.



Get the most common value of a series.

set_column_name(df, name)

Helper function to associate dataframe with a name.
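
To illustrate what vertical cleaning involves, the following minimal sketch groups units that were flagged as duplicates and merges each group into one plant record. It is not the library's implementation. The helper names `group_units`, `most_common` and `merge_group`, the record layout, and the sample data are assumptions made for this example. The grouping uses union-find, which yields connected components, a looser notion than strict cliques.

```python
from collections import Counter


def group_units(n_units, duplicate_pairs):
    """Group unit indices that are linked by duplicate pairs (union-find)."""
    parent = list(range(n_units))

    def find(i):
        # Follow parent pointers to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in duplicate_pairs:
        parent[find(a)] = find(b)

    groups = {}
    for i in range(n_units):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def most_common(values):
    """Return the most frequent value; ties go to the value seen first."""
    return Counter(values).most_common(1)[0][0]


def merge_group(units):
    """Merge unit records: sum capacities, keep the most common labels."""
    return {
        "Name": most_common(u["Name"] for u in units),
        "Fueltype": most_common(u["Fueltype"] for u in units),
        "Capacity": sum(u["Capacity"] for u in units),
    }


# Hypothetical unit-level records (not taken from any real dataset).
units = [
    {"Name": "Plant A", "Fueltype": "Hard Coal", "Capacity": 300.0},
    {"Name": "Plant A", "Fueltype": "Hard Coal", "Capacity": 350.0},
    {"Name": "Plant B", "Fueltype": "Hydro", "Capacity": 50.0},
    {"Name": "Plant A", "Fueltype": "Natural Gas", "Capacity": 100.0},
]
# Hypothetical duplicate pairs, e.g. as a deduplication step might report them.
pairs = [(0, 1), (1, 3)]

groups = group_units(len(units), pairs)
plants = [merge_group([units[i] for i in g]) for g in groups]
```

Here units 0, 1 and 3 end up in one group and unit 2 stays on its own, so `plants` holds two aggregated records.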

powerplantmatching.matching Module

Functions for linking and combining different datasets.



Subsequent to duke() with singlematch=True.

clean_technology(df[, generalize_hydros])

Clean the 'Technology' by condensing down the value into one claim.

combine_multiple_datasets(datasets[, ...])

Duke-based horizontal match of multiple databases.

compare_two_datasets(dfs, labels[, ...])

Duke-based horizontal match of two databases.

cross_matches(sets_of_pairs[, labels])

Combines multiple sets of pairs and returns one consistent dataframe.

duke(datasets[, labels, singlematch, ...])

Run duke in different modes (Deduplication or Record Linkage Mode) to either locate duplicates in one database or find the similar entries in two different datasets.


Import the default configuration file and update custom settings.


Helper function to associate dataframe with a name.


link_multiple_datasets(datasets, labels[, ...])

Duke-based horizontal match of multiple databases.

parmap(f, arg_list[, config])

Parallel mapping function.


Convenience function to import powerplant data source if a string is given.

reduce_matched_dataframe(df[, ...])

Reduce a matched dataframe to a unique set of columns.
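
A parallel mapping helper of the kind listed above can be sketched with the standard library. The version below is an illustration under stated assumptions, not the library's `parmap`: the name `parallel_map`, the `workers` parameter, the use of a thread pool, and the pairwise workload over dataset labels are all choices made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def parallel_map(f, arg_list, workers=4):
    """Apply f to every item of arg_list in parallel, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(f, arg_list))


def describe_pair(pair):
    """Stand-in for an expensive comparison of two datasets (hypothetical)."""
    left, right = pair
    return f"{left}-{right}"


# Every unordered pair of dataset labels is one independent unit of work.
labels = ["OPSD", "ENTSOE", "GEO"]
pairs = list(combinations(labels, 2))
results = parallel_map(describe_pair, pairs)
```

Because each pair is processed independently, the comparisons can run concurrently and the results still line up with the order of `pairs`.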

powerplantmatching.collection Module

Processed datasets of merged and/or adjusted data.


aggregate_units(df[, dataset_name, ...])

Vertical cleaning of the database.

collect(datasets[, update, reduced, config])

Return the collection for a given list of datasets in matched or reduced form.

combine_multiple_datasets(datasets[, ...])

Duke-based horizontal match of multiple databases.

deprecated([deprecated_in, removed_in, ...])

Decorate a function to signify its deprecation.

extend_by_VRE(df[, config, base_year, ...])

Extends a given reduced dataframe by externally given VREs.

extend_by_non_matched(df, extend_by[, ...])

Returns the matched dataframe with additional entries of non-matched powerplants of a reliable source.


Import the default configuration file and update custom settings.

matched_data([config, update, from_url, ...])

parmap(f, arg_list[, config])

Parallel mapping function.

parse_string_to_dict(df, cols)

Convenience function to convert string of dict to dict type for specified columns.

powerplants([config, config_update, update, ...])

Return the full matched dataset including all data sources listed in config.yaml/matching_sources.

reduce_matched_dataframe(df[, ...])

Reduce a matched dataframe to a unique set of columns.

set_column_name(df, name)

Helper function to associate dataframe with a name.


Convenience function to ensure dict-like output.
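
Converting a string representation of a dict back into a dict, as the string-parsing helpers above describe, can be sketched as follows. This is a minimal illustration on plain Python rows rather than a dataframe, and it is not the library's code: the function name `parse_columns`, the row layout, and the `projectID` column used here are assumptions for the example.

```python
import ast


def parse_columns(rows, cols):
    """Return copies of rows with the given columns parsed from str to dict."""
    parsed = []
    for row in rows:
        new_row = dict(row)
        for col in cols:
            value = new_row.get(col)
            if isinstance(value, str):
                # literal_eval only evaluates Python literals and does not
                # execute arbitrary code, unlike eval.
                new_row[col] = ast.literal_eval(value)
        parsed.append(new_row)
    return parsed


# Hypothetical row, e.g. as read back from a CSV file where dicts became text.
rows = [{"Name": "Plant A", "projectID": "{'OPSD': ['id1'], 'GEO': ['id2']}"}]
result = parse_columns(rows, ["projectID"])
```

Values that are already dicts are left untouched, so the function can be applied to data regardless of whether it was freshly computed or loaded from disk.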