Data Processing Modules#

powerplantmatching.data Module#

Collection of power plant data bases and statistical data

Functions#

BEYONDCOAL([raw, update, config])

Importer for the BEYOND COAL database.

BNETZA([raw, update, config, header, ...])

Importer for the database put together by Germany's Federal Network Agency (Bundesnetzagentur, BNetzA).

CARMA([raw, update, config])

Importer for the Carma database.

Capacity_stats([raw, config, update, ...])

Standardize the aggregated capacity statistics provided by the ENTSO-E.

ENTSOE([raw, update, config, entsoe_token])

Importer for the list of installed generators provided by the ENTSO-E Transparency Project.

ENTSOE_EIC([raw, update, config, entsoe_token])

Importer for the meta data given for each ENTSOE entry.

EXTERNAL_DATABASE([raw, update, config])

Importer for external custom databases.

Parameters:

raw (bool, default False): Whether to return the original dataset.

update (bool, default True, unused): Whether to update the data from the url.

config (dict, default None): Custom configuration, e.g. powerplantmatching.config.get_config(target_countries='Italy'); defaults to powerplantmatching.config.get_config().

GBPT([raw, update, config])

Importer for the global bioenergy powerplant tracker from Global Energy Monitor.

GCPT([raw, update, config])

Importer for the global coal powerplant tracker from Global Energy Monitor.

GEM([raw, update, config])

Get the combined dataset of all GEM (https://globalenergymonitor.org/) datasets.

GEM_GGPT(*args, **kwargs)

Deprecated since version 0.5.5.

GEO([raw, update, config])

Importer for the GEO database.

GGPT([raw, update, config])

Importer for the global gas powerplant tracker from Global Energy Monitor.

GGTPT([raw, update, config])

Importer for the global geothermal powerplant tracker from Global Energy Monitor.

GHPT([raw, update, config])

Importer for the global hydropower tracker from Global Energy Monitor.

GNPT([raw, update, config])

Importer for the global nuclear energy powerplant tracker from Global Energy Monitor.

GPD([raw, update, config, filter_other_dbs])

Importer for the Global Power Plant Database.

GSPT([raw, update, config])

Importer for the global solar powerplant tracker from Global Energy Monitor.

GWPT([raw, update, config])

Importer for the global wind powerplant tracker from Global Energy Monitor.

IRENASTAT([raw, update, config])

Importer for the IRENASTAT renewable capacity statistics.

IWPDCY([config])

This data is not yet available.

JRC([raw, update, config])

Importer for the JRC Hydro-power plants database, retrieved from energy-modelling-toolkit/hydro-power-database.

OPSD([raw, update, statusDE, config])

Importer for the OPSD (Open Power Systems Data) database.

OPSD_VRE([raw, update, config])

Importer for the OPSD (Open Power Systems Data) renewables (VRE) database.

OPSD_VRE_country(country[, raw, update, config])

Get country specific data from OPSD for renewables, if available.

UBA([raw, update, config, header, ...])

Importer for the UBA Database.

WEPP([raw, config])

Importer for the standardized WEPP (Platts, World Electric Power Plants Database).

WIKIPEDIA([raw, update, config])

Importer for the WIKIPEDIA nuclear power plant database.

cget(*args, **kw)

clean_name(df[, config])

Clean the name of a power plant list.

config_filter(df, config)

Convenience function to filter data source according to the config.yaml file.

convert_to_short_name(df)

correct_manually(df, name[, config])

Update powerplant data based on stored corrections in powerplantmatching/data/in/manual_corrections.csv.

debug(msg, *args, **kwargs)

Log 'msg % args' with severity 'DEBUG'.

deprecated([deprecated_in, removed_in, ...])

Decorate a function to signify its deprecation

fill_geoposition(df[, use_saved_locations, ...])

Fill missing 'lat' and 'lon' values.

gather_fueltype_info(df[, search_col, config])

Search a set of columns for distinct fueltype specifications.

gather_set_info(df[, search_col, config])

Search a set of columns for distinct Set specifications.

gather_specifications(df[, target_columns, ...])

Parse columns to collect representative keys.

gather_technology_info(df[, search_col, config])

Search a set of columns for distinct technology specifications.

get_config([filename])

Import the default configuration file and update custom settings.

get_raw_file(name[, update, config, ...])

scale_to_net_capacities(df[, is_gross, ...])

set_column_name(df, name)

Helper function to associate dataframe with a name.

powerplantmatching.cleaning Module#

Functions for vertically cleaning a dataset.

Functions#

aggregate_units(df[, dataset_name, ...])

Vertical cleaning of the database.

clean_name(df[, config])

Clean the name of a power plant list.

clean_powerplantname(df[, config])

Deprecated since version 5.0.

clean_technology(df[, generalize_hydros])

Clean the 'Technology' by condensing down the value into one claim.

cliques(df, dataduplicates)

Locate cliques of units which are determined to belong to the same powerplant.

config_target_key(column)

Convert a column name to the key that is used to specify the target values in the config.

deprecated([deprecated_in, removed_in, ...])

Decorate a function to signify its deprecation

duke(datasets[, labels, singlematch, ...])

Run duke in different modes (Deduplication or Record Linkage Mode) to either locate duplicates in one database or find the similar entries in two different datasets.

gather_and_replace(df, mapping)

Search for patterns in multiple columns and return a series of representative keys.
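As a rough illustration of pattern-based key gathering, the sketch below searches several columns of each record for regular-expression patterns and returns the first matching key per record. It uses plain dictionaries and the standard library only; the name `gather_keys`, the row format, and the example patterns are assumptions, and the actual `gather_and_replace` operates on a dataframe and may differ in matching rules.

```python
import re


def gather_keys(rows, columns, mapping):
    """Return one representative key per row, or None if nothing matches.

    Hypothetical helper. `mapping` maps a key (e.g. 'Lignite') to a regular
    expression; the first key whose pattern matches wins.
    """
    keys = []
    for row in rows:
        # Concatenate the searched columns so a pattern may match in any of them.
        text = " ".join(str(row.get(col, "")) for col in columns)
        matched = None
        for key, pattern in mapping.items():
            if re.search(pattern, text, flags=re.IGNORECASE):
                matched = key
                break
        keys.append(matched)
    return keys


plants = [
    {"Name": "Kraftwerk Nord", "Technology": "Braunkohle steam turbine"},
    {"Name": "Windpark X", "Technology": ""},
    {"Name": "Unknown site", "Technology": ""},
]
patterns = {"Lignite": r"lignite|braunkohle", "Wind": r"wind"}
print(gather_keys(plants, ["Name", "Technology"], patterns))
```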

gather_fueltype_info(df[, search_col, config])

Search a set of columns for distinct fueltype specifications.

gather_set_info(df[, search_col, config])

Search a set of columns for distinct Set specifications.

gather_specifications(df[, target_columns, ...])

Parse columns to collect representative keys.

gather_technology_info(df[, search_col, config])

Search a set of columns for distinct technology specifications.

get_config([filename])

Import the default configuration file and update custom settings.

get_name(df)

Helper function to associate dataframe with a name.

get_obj_if_Acc(obj)

mode(x)

Get the most common value of a series.
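A minimal standard-library sketch of "most common value", assuming missing entries are represented by `None` and should be ignored. The helper name `most_common_value` is hypothetical; the library's `mode` works on a series and may treat ties and missing values differently.

```python
from collections import Counter


def most_common_value(values):
    """Return the most frequent non-missing value, or None if there is none."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    # most_common(1) returns a list with one (value, count) tuple.
    return Counter(present).most_common(1)[0][0]


fueltypes = ["Hard Coal", None, "Lignite", "Hard Coal"]
print(most_common_value(fueltypes))  # -> Hard Coal
```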

set_column_name(df, name)

Helper function to associate dataframe with a name.

powerplantmatching.matching Module#

Functions for linking and combining different datasets

Functions#

best_matches(links)

Subsequent to duke() with singlematch=True.

clean_technology(df[, generalize_hydros])

Clean the 'Technology' by condensing down the value into one claim.

combine_multiple_datasets(datasets[, ...])

Duke-based horizontal match of multiple databases.

compare_two_datasets(dfs, labels[, ...])

Duke-based horizontal match of two databases.

cross_matches(sets_of_pairs[, labels])

Combines multiple sets of pairs and returns one consistent dataframe.

duke(datasets[, labels, singlematch, ...])

Run duke in different modes (Deduplication or Record Linkage Mode) to either locate duplicates in one database or find the similar entries in two different datasets.

get_config([filename])

Import the default configuration file and update custom settings.

get_name(df)

Helper function to associate dataframe with a name.

get_obj_if_Acc(obj)

link_multiple_datasets(datasets, labels[, ...])

Duke-based horizontal match of multiple databases.

parmap(f, arg_list[, config])

Parallel mapping function.
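For readers unfamiliar with parallel mapping, the sketch below applies a function to every item of a list using a pool of worker threads and returns the results in input order. It is a generic illustration built on `concurrent.futures`; the name `parallel_map` and the `max_workers` parameter are assumptions, and nothing here reflects how `parmap` reads its settings from the config or which kind of workers it uses.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(func, arg_list, max_workers=4):
    """Apply func to each element of arg_list in parallel, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs.
        return list(pool.map(func, arg_list))


def capacity_in_gw(capacity_mw):
    return capacity_mw / 1000


print(parallel_map(capacity_in_gw, [500, 1200, 250]))  # -> [0.5, 1.2, 0.25]
```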

read_csv_if_string(df)

Convenience function to import powerplant data source if a string is given.

reduce_matched_dataframe(df[, ...])

Reduce a matched dataframe to a unique set of columns.

powerplantmatching.collection Module#

Processed datasets of merged and/or adjusted data

Functions#

aggregate_units(df[, dataset_name, ...])

Vertical cleaning of the database.

collect(datasets[, update, reduced, config])

Return the collection for a given list of datasets in matched or reduced form.

combine_multiple_datasets(datasets[, ...])

Duke-based horizontal match of multiple databases.

deprecated([deprecated_in, removed_in, ...])

Decorate a function to signify its deprecation

extend_by_VRE(df[, config, base_year, ...])

Extends a given reduced dataframe by externally given VREs.

extend_by_non_matched(df, extend_by[, ...])

Returns the matched dataframe with additional entries of non-matched powerplants of a reliable source.

get_config([filename])

Import the default configuration file and update custom settings.

matched_data([config, update, from_url, ...])

Deprecated since version 5.5.

parmap(f, arg_list[, config])

Parallel mapping function.

powerplants([config, config_update, update, ...])

Return the full matched dataset including all data sources listed in config.yaml/matching_sources.

projectID_to_dict(df)

Convenience function to convert string of dict to dict type

reduce_matched_dataframe(df[, ...])

Reduce a matched dataframe to a unique set of columns.

set_column_name(df, name)

Helper function to associate dataframe with a name.

set_uncommon_fueltypes_to_other(df[, ...])

Replace uncommon fueltype specifications with 'Other'.
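The replacement can be pictured as follows. This sketch assumes that "uncommon" means "not in a given set of common fueltypes" and that missing values are left untouched; both are illustrative assumptions, as are the helper name `uncommon_to_other` and the example fueltype set. The library function may determine what counts as uncommon differently.

```python
def uncommon_to_other(fueltypes, common):
    """Map every fueltype outside `common` to 'Other'; keep None as is."""
    return [
        f if (f is None or f in common) else "Other"
        for f in fueltypes
    ]


common_fueltypes = {"Hard Coal", "Lignite", "Nuclear", "Wind"}
observed = ["Wind", "Waste", None, "Lignite", "Geothermal"]
print(uncommon_to_other(observed, common_fueltypes))
# -> ['Wind', 'Other', None, 'Lignite', 'Other']
```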

to_dict_if_string(s)

Convenience function to ensure dict-like output
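Converting a string that holds a dictionary literal back into a dictionary, as the two convenience functions above describe, can be sketched with `ast.literal_eval`, which only evaluates Python literals. The helper name `parse_dict_like` and the example identifiers are hypothetical, and this is an illustration of the idea rather than the library's code.

```python
import ast


def parse_dict_like(value):
    """Return a dict for a dict or a string containing a dict literal."""
    if isinstance(value, str):
        # literal_eval accepts only literals, so arbitrary code is not executed.
        return ast.literal_eval(value)
    return value


stored = "{'OPSD': ['BNA0001'], 'GEO': ['G100']}"
print(parse_dict_like(stored)["OPSD"])  # -> ['BNA0001']
```

A value that is already a dictionary is returned unchanged, so the helper can be applied to a column containing a mix of both forms.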