Data Processing Modules

powerplantmatching.data Module

Collection of power plant databases and statistical data.

Functions

BEYONDCOAL([raw, update, config])

Importer for the BEYOND COAL database.

BNETZA([raw, update, config, header, ...])

Importer for the database put together by Germany's Federal Network Agency (Bundesnetzagentur, BNetzA).

CARMA([raw, update, config])

Importer for the Carma database.

Capacity_stats([raw, level, config, update, ...])

Standardize the aggregated capacity statistics provided by the ENTSO-E.

ENTSOE([raw, update, config, entsoe_token])

Importer for the list of installed generators provided by the ENTSO-E Transparency Project.

ESE([raw, update, config])

Importer for the ESE database.

GEO([raw, update, config])

Importer for the GEO database.

GPD([raw, update, config, filter_other_dbs])

Importer for the Global Power Plant Database.

IRENA_stats([config])

Reads the IRENA Capacity Statistics 2017 database.

IWPDCY([config])

This data is not yet available.

JRC([raw, update, config])

Importer for the JRC Hydro-power plants database, retrieved from https://github.com/energy-modelling-toolkit/hydro-power-database.

OPSD([rawEU, rawDE, rawDE_withBlocks, ...])

Importer for the OPSD (Open Power Systems Data) database.

OPSD_VRE([raw, update, config])

Importer for the OPSD (Open Power Systems Data) renewables (VRE) database.

OPSD_VRE_country(country[, raw, update, config])

Get country-specific data from OPSD for renewables, if available.

UBA([raw, update, config, header, ...])

Importer for the UBA Database.

WEPP([raw, config])

Importer for the standardized WEPP (Platts, World Electric Power Plants Database).

cget(*args, **kw)

clean_powerplantname(df)

Cleans the "Name" column of the database by deleting very frequent words, numerals, and non-alphanumeric characters.
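
The cleaning idea described above can be illustrated with a minimal standard-library sketch. It is not the library's implementation: the function name clean_names, the plain list input (instead of a dataframe column), and the common_fraction threshold are all assumptions made for illustration.

```python
import re
from collections import Counter


def clean_names(names, common_fraction=0.5):
    """Strip digits, punctuation, and very frequent words from plant names.

    `common_fraction` is a hypothetical threshold: a word is dropped when it
    occurs in more than this share of all names.
    """
    # Replace every run of non-letter characters (punctuation, digits,
    # underscores, whitespace) with a single space, then split into words.
    tokens = [re.sub(r"[\W\d_]+", " ", name).split() for name in names]

    # Count in how many names each word appears (case-insensitive).
    doc_freq = Counter()
    for words in tokens:
        doc_freq.update({w.lower() for w in words})

    limit = common_fraction * len(names)
    frequent = {w for w, n in doc_freq.items() if n > limit}

    # Rebuild each name without the frequent words.
    return [
        " ".join(w for w in words if w.lower() not in frequent)
        for words in tokens
    ]


cleaned = clean_names(
    ["Kraftwerk Emsland 2", "Kraftwerk Lippendorf (Block R)", "Kraftwerk Boxberg-4"]
)
```

In this example "Kraftwerk" appears in every name and is therefore removed, together with the digits and brackets.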

clean_technology(df[, generalize_hydros])

Clean the 'Technology' column by condensing each value into a single claim.

config_filter(df[, name, config])

Convenience function to filter data source according to the config.yaml file.

correct_manually(df, name[, config])

Update powerplant data based on stored corrections in powerplantmatching/data/in/manual_corrections.csv.

deprecated([deprecated_in, removed_in, ...])

Decorate a function to signify its deprecation.
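
As a rough illustration of what such a decorator does, the sketch below builds one from the standard library. The parameter names deprecated_in and removed_in are taken from the signature above; the details parameter, the message wording, and the old_loader function are assumptions, and the real decorator may behave differently.

```python
import functools
import warnings


def deprecated(deprecated_in=None, removed_in=None, details=""):
    """Return a decorator that warns whenever the wrapped function is called."""

    def decorator(func):
        # Assemble the warning text once, when the function is decorated.
        parts = [f"{func.__name__} is deprecated"]
        if deprecated_in:
            parts.append(f"as of {deprecated_in}")
        if removed_in:
            parts.append(f"and will be removed in {removed_in}")
        message = " ".join(parts) + "."
        if details:
            message += f" {details}"

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller's line.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper

    return decorator


# Hypothetical function used only to demonstrate the decorator.
@deprecated(deprecated_in="0.4.9", details="Use the replacement function instead.")
def old_loader():
    return "data"
```

Calling old_loader() still returns its result, but emits a DeprecationWarning first.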

fill_geoposition(df[, use_saved_locations, ...])

Fill missing 'lat' and 'lon' values.

gather_fueltype_info(df[, search_col])

Parses the search_col columns for distinct coal specifications.

gather_set_info(df[, search_col])

Parses the search_col columns for distinct set specifications.

gather_technology_info(df[, search_col, config])

Parses the search_col columns for distinct technology specifications.

get_config([filename])

Import the default configuration file and update custom settings.
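
The "defaults plus custom settings" pattern can be sketched as follows. This is an assumption-laden illustration, not the library's code: it uses plain dicts instead of reading config.yaml, performs only a shallow update, and the names load_config, DEFAULT_CONFIG, and example_option are hypothetical. The matching_sources key is the one mentioned for matched_data in this reference; the values shown are placeholders.

```python
import copy

# Hypothetical defaults; the real defaults live in the package's config file.
DEFAULT_CONFIG = {
    "matching_sources": ["ENTSOE", "GEO", "JRC", "OPSD"],
    "example_option": True,
}


def load_config(custom=None):
    """Return a copy of the defaults with custom settings applied on top."""
    # Deep-copy so callers can never mutate the shared defaults.
    config = copy.deepcopy(DEFAULT_CONFIG)
    # Shallow update: a custom key replaces the default value wholesale.
    config.update(custom or {})
    return config


config = load_config({"matching_sources": ["OPSD", "JRC"]})
```

Keys that are not overridden keep their default values.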

get_raw_file(name[, update, config, ...])

scale_to_net_capacities(df[, is_gross, ...])

set_column_name(df, name)

Helper function to associate dataframe with a name.

powerplantmatching.cleaning Module

Functions for vertically cleaning a dataset.

Functions

aggregate_units(df[, dataset_name, ...])

Vertical cleaning of the database.

clean_powerplantname(df)

Cleans the "Name" column of the database by deleting very frequent words, numerals, and non-alphanumeric characters.

clean_technology(df[, generalize_hydros])

Clean the 'Technology' column by condensing each value into a single claim.

cliques(df, dataduplicates)

Locate cliques of units which are determined to belong to the same powerplant.
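
To illustrate how pairwise duplicate links can be turned into groups of units, here is a small sketch. Note the swapped-in technique: it computes connected components with union-find, which is a simplification of clique detection (every clique lies inside one component, but a component is not necessarily a clique). The function name group_duplicates and the integer-index input are assumptions.

```python
def group_duplicates(n_units, duplicate_pairs):
    """Assign each unit the smallest index of its connected component.

    `duplicate_pairs` is an iterable of (i, j) index pairs that a
    deduplication step has flagged as the same power plant.
    """
    parent = list(range(n_units))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in duplicate_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller index as root so group labels are stable.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return [find(i) for i in range(n_units)]


groups = group_duplicates(5, [(0, 1), (1, 2), (3, 4)])
```

Units sharing a label can then be aggregated into a single power plant entry.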

duke(datasets[, labels, singlematch, ...])

Run duke in different modes (Deduplication or Record Linkage Mode) to either locate duplicates in one database or find the similar entries in two different datasets.

gather_fueltype_info(df[, search_col])

Parses the search_col columns for distinct coal specifications.

gather_set_info(df[, search_col])

Parses the search_col columns for distinct set specifications.

gather_technology_info(df[, search_col, config])

Parses the search_col columns for distinct technology specifications.

get_config([filename])

Import the default configuration file and update custom settings.

get_name(df)

Helper function to associate dataframe with a name.

get_obj_if_Acc(obj)

set_column_name(df, name)

Helper function to associate dataframe with a name.

powerplantmatching.matching Module

Functions for linking and combining different datasets.

Functions

best_matches(links)

Subsequent to duke() with singlematch=True.

clean_technology(df[, generalize_hydros])

Clean the 'Technology' column by condensing each value into a single claim.

combine_multiple_datasets(datasets[, ...])

Duke-based horizontal match of multiple databases.

compare_two_datasets(dfs, labels[, ...])

Duke-based horizontal match of two databases.

cross_matches(sets_of_pairs[, labels])

Combines multiple sets of pairs and returns one consistent dataframe.

duke(datasets[, labels, singlematch, ...])

Run duke in different modes (Deduplication or Record Linkage Mode) to either locate duplicates in one database or find the similar entries in two different datasets.

get_config([filename])

Import the default configuration file and update custom settings.

get_name(df)

Helper function to associate dataframe with a name.

get_obj_if_Acc(obj)

link_multiple_datasets(datasets, labels[, ...])

Duke-based horizontal match of multiple databases.

parmap(f, arg_list[, config])

Parallel mapping function.
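
A parallel map of this kind can be sketched with the standard library. The version below is only an illustration under stated assumptions: it uses a thread pool (the library may use processes instead), and the names parallel_map, max_workers, and square are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(f, arg_list, max_workers=4):
    """Apply `f` to every item of `arg_list` concurrently, preserving order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, not completion order.
        return list(pool.map(f, arg_list))


def square(x):
    return x * x


results = parallel_map(square, [1, 2, 3, 4])
```

Because the results come back in input order, the call can replace a plain list comprehension without further changes.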

read_csv_if_string(df)

Convenience function to import powerplant data source if a string is given.

reduce_matched_dataframe(df[, ...])

Reduce a matched dataframe to a unique set of columns.

powerplantmatching.collection Module

Processed datasets of merged and/or adjusted data.

Functions

Collection(**kwargs)

Deprecated since version 0.4.9.

aggregate_units(df[, dataset_name, ...])

Vertical cleaning of the database.

collect(datasets[, update, ...])

Return the collection for a given list of datasets in matched or reduced form.

combine_multiple_datasets(datasets[, ...])

Duke-based horizontal match of multiple databases.

deprecated([deprecated_in, removed_in, ...])

Decorate a function to signify its deprecation.

extend_by_VRE(df[, config, base_year, ...])

Extends a given reduced dataframe by externally given VREs.

extend_by_non_matched(df, extend_by[, ...])

Returns the matched dataframe with additional entries of non-matched powerplants of a reliable source.

get_config([filename])

Import the default configuration file and update custom settings.

matched_data([config, stored, update, ...])

Return the full matched dataset including all data sources listed in config.yaml/matching_sources.

parmap(f, arg_list[, config])

Parallel mapping function.

projectID_to_dict(df)

Convenience function to convert the string representation of a dict to a dict.

reduce_matched_dataframe(df[, ...])

Reduce a matched dataframe to a unique set of columns.

set_column_name(df, name)

Helper function to associate dataframe with a name.

set_uncommon_fueltypes_to_other(df[, ...])

Replace uncommon fueltype specifications with 'Other'.

to_dict_if_string(s)

Convenience function to ensure dict-like output.
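
A string-to-dict conversion of this kind might look like the sketch below, which relies on ast.literal_eval from the standard library. Whether the library uses the same approach is an assumption, as are the function name parse_dict_string and the example identifiers.

```python
import ast


def parse_dict_string(value):
    """Return `value` as a dict, parsing it first if it is a string."""
    if isinstance(value, str):
        # literal_eval only accepts Python literals, so arbitrary code
        # embedded in the string is never executed.
        value = ast.literal_eval(value)
    if not isinstance(value, dict):
        raise TypeError(f"expected a dict, got {type(value).__name__}")
    return value


# Hypothetical project IDs, e.g. as read back from a CSV file.
parsed = parse_dict_string("{'OPSD': ['a1'], 'GEO': ['g7', 'g8']}")
```

Values that are already dicts pass through unchanged, so the helper can be applied to a column regardless of how the data was loaded.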