Data Processing Modules

powerplantmatching.data Module

Collection of power plant databases and statistical data.

Functions

`BEYONDCOAL`([raw, update, config])	Importer for the BEYOND COAL database.
`BNETZA`([raw, update, config, header, ...])	Importer for the database put together by Germany's 'Federal Network Agency' (German: Bundesnetzagentur, BNetzA).
`CARMA`([raw, update, config])	Importer for the Carma database.
`Capacity_stats`([raw, config, update, ...])	Standardize the aggregated capacity statistics provided by the ENTSO-E.
`ENTSOE`([raw, update, config, entsoe_token])	Importer for the list of installed generators provided by the ENTSO-E Transparency Project.
`ENTSOE_EIC`([raw, update, config, entsoe_token])	Importer for the meta data given for each ENTSOE entry.
`EXTERNAL_DATABASE`([raw, update, config])	Importer for external custom databases. Parameters: raw (bool, default False): whether to return the original dataset; update (bool, default True, unused): whether to update the data from the URL; config (dict, default None): custom configuration, e.g. powerplantmatching.config.get_config(target_countries='Italy'); defaults to powerplantmatching.config.get_config().
`GBPT`([raw, update, config])	Importer for the global bioenergy powerplant tracker from Global Energy Monitor.
`GCPT`([raw, update, config])	Importer for the global coal powerplant tracker from Global Energy Monitor.
`GEM`([raw, update, config])	Get the combined dataset of all GEM (https://globalenergymonitor.org/) datasets.
`GEM_GGPT`(*args, **kwargs)	Deprecated since version 0.5.5.
`GEO`([raw, update, config])	Importer for the GEO database.
`GGPT`([raw, update, config])	Importer for the global gas powerplant tracker from Global Energy Monitor.
`GGTPT`([raw, update, config])	Importer for the global geothermal powerplant tracker from Global Energy Monitor.
`GHPT`([raw, update, config])	Importer for the global hydropower powerplant tracker from Global Energy Monitor.
`GNPT`([raw, update, config])	Importer for the global nuclear energy powerplant tracker from Global Energy Monitor.
`GPD`([raw, update, config, filter_other_dbs])	Importer for the Global Power Plant Database.
`GSPT`([raw, update, config])	Importer for the global solar powerplant tracker from Global Energy Monitor.
`GWPT`([raw, update, config])	Importer for the global wind powerplant tracker from Global Energy Monitor.
`IRENASTAT`([raw, update, config])	Importer for the IRENASTAT renewable capacity statistics.
`IWPDCY`([config])	This data is not yet available.
`JRC`([raw, update, config])	Importer for the JRC hydro-power plants database, retrieved from energy-modelling-toolkit/hydro-power-database.
`OPSD`([raw, update, statusDE, config])	Importer for the OPSD (Open Power Systems Data) database.
`OPSD_VRE`([raw, update, config])	Importer for the OPSD (Open Power Systems Data) renewables (VRE) database.
`OPSD_VRE_country`(country[, raw, update, config])	Get country specific data from OPSD for renewables, if available.
`UBA`([raw, update, config, header, ...])	Importer for the UBA Database.
`WEPP`([raw, config])	Importer for the standardized WEPP (Platts, World Electric Power Plants Database).
`WIKIPEDIA`([raw, update, config])	Importer for the WIKIPEDIA nuclear power plant database.
`cget`(*args, **kw)
`clean_name`(df[, config])	Clean the name of a power plant list.
`config_filter`(df, config)	Convenience function to filter data source according to the config.yaml file.
`convert_to_short_name`(df)
`correct_manually`(df, name[, config])	Update powerplant data based on stored corrections in powerplantmatching/data/in/manual_corrections.csv.
`debug`(msg, *args, **kwargs)	Log 'msg % args' with severity 'DEBUG'.
`deprecated`([deprecated_in, removed_in, ...])	Decorate a function to signify its deprecation
`fill_geoposition`(df[, use_saved_locations, ...])	Fill missing 'lat' and 'lon' values.
`gather_fueltype_info`(df[, search_col, config])	Parses in a set of columns for distinct fueltype specifications.
`gather_set_info`(df[, search_col, config])	Parses in a set of columns for distinct Set specifications.
`gather_specifications`(df[, target_columns, ...])	Parse columns to collect representative keys.
`gather_technology_info`(df[, search_col, config])	Parses in a set of columns for distinct technology specifications.
`get_config`([filename])	Import the default configuration file and update custom settings.
`get_raw_file`(name[, update, config, ...])
`scale_to_net_capacities`(df[, is_gross, ...])
`set_column_name`(df, name)	Helper function to associate dataframe with a name.
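
As a rough illustration of the kind of filtering that `config_filter` is described as doing, the following standard-library sketch keeps only records matching a configuration. It is not the library's implementation: the function name `filter_by_config`, the record keys `Country` and `Fueltype`, and the config keys `target_countries` and `target_fueltypes` are assumptions made for this example, and plain dictionaries stand in for the dataframe.

```python
def filter_by_config(records, config):
    """Keep records whose country and fuel type are both listed in config.

    Hypothetical stand-in for a config-driven filter; not the library code.
    """
    countries = set(config["target_countries"])
    fueltypes = set(config["target_fueltypes"])
    return [
        r for r in records
        if r["Country"] in countries and r["Fueltype"] in fueltypes
    ]


# Made-up sample records, for illustration only.
plants = [
    {"Name": "Plant A", "Country": "Italy", "Fueltype": "Hydro"},
    {"Name": "Plant B", "Country": "France", "Fueltype": "Nuclear"},
    {"Name": "Plant C", "Country": "Italy", "Fueltype": "Lignite"},
]
config = {
    "target_countries": ["Italy"],
    "target_fueltypes": ["Hydro", "Nuclear"],
}

filtered = filter_by_config(plants, config)
print([r["Name"] for r in filtered])
```

Only records that satisfy both conditions survive, so a plant in a target country with a non-target fuel type is dropped.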

powerplantmatching.cleaning Module

Functions for vertically cleaning a dataset.

Functions

`aggregate_units`(df[, dataset_name, ...])	Vertical cleaning of the database.
`clean_name`(df[, config])	Clean the name of a power plant list.
`clean_powerplantname`(df[, config])	Deprecated since version 5.0.
`clean_technology`(df[, generalize_hydros])	Clean the 'Technology' by condensing down the value into one claim.
`cliques`(df, dataduplicates)	Locate cliques of units which are determined to belong to the same powerplant.
`config_target_key`(column)	Convert a column name to the key that is used to specify the target values in the config.
`deprecated`([deprecated_in, removed_in, ...])	Decorate a function to signify its deprecation
`duke`(datasets[, labels, singlematch, ...])	Run duke in different modes (Deduplication or Record Linkage Mode) to either locate duplicates in one database or find the similar entries in two different datasets.
`gather_and_replace`(df, mapping)	Search for patterns in multiple columns and return a series of representative keys.
`gather_fueltype_info`(df[, search_col, config])	Parses in a set of columns for distinct fueltype specifications.
`gather_set_info`(df[, search_col, config])	Parses in a set of columns for distinct Set specifications.
`gather_specifications`(df[, target_columns, ...])	Parse columns to collect representative keys.
`gather_technology_info`(df[, search_col, config])	Parses in a set of columns for distinct technology specifications.
`get_config`([filename])	Import the default configuration file and update custom settings.
`get_name`(df)	Helper function to associate dataframe with a name.
`get_obj_if_Acc`(obj)
`mode`(x)	Get the most common value of a series.
`set_column_name`(df, name)	Helper function to associate dataframe with a name.
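
The idea behind vertical cleaning (grouping units that belong to the same power plant, then condensing each group into one entry) can be sketched with the standard library alone. This is a simplified illustration, not the library's method: it groups units by connected components using union-find rather than by cliques, and the helper names `group_duplicates` and `most_common`, the record keys, and the sample data are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def group_duplicates(n_units, duplicate_pairs):
    """Return a group id per unit, merging units linked by duplicate pairs."""
    parent = list(range(n_units))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in duplicate_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Use the smaller index as the representative of the merged group.
            parent[max(root_a, root_b)] = min(root_a, root_b)
    return [find(i) for i in range(n_units)]


def most_common(values):
    """Return the most frequent non-missing value, or None if there is none."""
    counts = Counter(v for v in values if v is not None)
    return counts.most_common(1)[0][0] if counts else None


# Made-up units; pairs (0, 1) and (1, 2) were flagged as duplicates.
units = [
    {"Name": "Alpha 1", "Fueltype": "Hard Coal", "Capacity": 300.0},
    {"Name": "Alpha 2", "Fueltype": "Hard Coal", "Capacity": 320.0},
    {"Name": "Alpha 3", "Fueltype": "Natural Gas", "Capacity": 80.0},
    {"Name": "Beta", "Fueltype": "Hydro", "Capacity": 50.0},
]
groups = group_duplicates(len(units), [(0, 1), (1, 2)])

members_by_group = defaultdict(list)
for unit, group_id in zip(units, groups):
    members_by_group[group_id].append(unit)

# Condense each group: sum the capacities, take the most common fuel type.
plants = [
    {
        "Fueltype": most_common(u["Fueltype"] for u in members),
        "Capacity": sum(u["Capacity"] for u in members),
    }
    for members in members_by_group.values()
]
print(plants)
```

Taking the most common value for categorical columns and the sum for capacity is one plausible aggregation choice; the rules actually applied per column are determined by the library and its configuration.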

powerplantmatching.matching Module

Functions for linking and combining different datasets

Functions

`best_matches`(links)	Subsequent to duke() with singlematch=True.
`clean_technology`(df[, generalize_hydros])	Clean the 'Technology' by condensing down the value into one claim.
`combine_multiple_datasets`(datasets[, ...])	Duke-based horizontal match of multiple databases.
`compare_two_datasets`(dfs, labels[, ...])	Duke-based horizontal match of two databases.
`cross_matches`(sets_of_pairs[, labels])	Combines multiple sets of pairs and returns one consistent dataframe.
`duke`(datasets[, labels, singlematch, ...])	Run duke in different modes (Deduplication or Record Linkage Mode) to either locate duplicates in one database or find the similar entries in two different datasets.
`get_config`([filename])	Import the default configuration file and update custom settings.
`get_name`(df)	Helper function to associate dataframe with a name.
`get_obj_if_Acc`(obj)
`link_multiple_datasets`(datasets, labels[, ...])	Duke-based horizontal match of multiple databases.
`parmap`(f, arg_list[, config])	Parallel mapping function.
`read_csv_if_string`(df)	Convenience function to import powerplant data source if a string is given.
`reduce_matched_dataframe`(df[, ...])	Reduce a matched dataframe to a unique set of columns.
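
To illustrate what selecting the best links between two datasets can look like, the sketch below picks a one-to-one set of matches from scored candidate links. It is a greedy simplification written for this page and should not be read as the behavior of `best_matches` or `duke`: the function name `best_links`, the `(left_id, right_id, score)` tuple layout, and the sample scores are assumptions.

```python
def best_links(links):
    """Greedily pick one-to-one links, highest score first.

    links: iterable of (left_id, right_id, score) tuples.
    """
    chosen = []
    used_left, used_right = set(), set()
    for left, right, score in sorted(links, key=lambda link: link[2], reverse=True):
        # Skip a link if either side is already matched to something better.
        if left in used_left or right in used_right:
            continue
        chosen.append((left, right, score))
        used_left.add(left)
        used_right.add(right)
    return chosen


# Made-up candidate links between entries of two datasets.
candidates = [
    (0, 10, 0.90),
    (0, 11, 0.95),
    (1, 11, 0.80),
    (2, 12, 0.70),
]
matches = best_links(candidates)
print(matches)
```

A greedy pass is simple but not globally optimal: here entry 1 ends up unmatched because its only candidate was already claimed by a higher-scoring link.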

powerplantmatching.collection Module

Processed datasets of merged and/or adjusted data

Functions

`aggregate_units`(df[, dataset_name, ...])	Vertical cleaning of the database.
`collect`(datasets[, update, reduced, config])	Return the collection for a given list of datasets in matched or reduced form.
`combine_multiple_datasets`(datasets[, ...])	Duke-based horizontal match of multiple databases.
`deprecated`([deprecated_in, removed_in, ...])	Decorate a function to signify its deprecation
`extend_by_VRE`(df[, config, base_year, ...])	Extends a given reduced dataframe by externally given VREs.
`extend_by_non_matched`(df, extend_by[, ...])	Returns the matched dataframe with additional entries of non-matched powerplants of a reliable source.
`get_config`([filename])	Import the default configuration file and update custom settings.
`matched_data`([config, update, from_url, ...])	Deprecated since version 5.5.
`parmap`(f, arg_list[, config])	Parallel mapping function.
`powerplants`([config, config_update, update, ...])	Return the full matched dataset including all data sources listed in config.yaml/matching_sources.
`projectID_to_dict`(df)	Convenience function to convert string of dict to dict type
`reduce_matched_dataframe`(df[, ...])	Reduce a matched dataframe to a unique set of columns.
`set_column_name`(df, name)	Helper function to associate dataframe with a name.
`set_uncommon_fueltypes_to_other`(df[, ...])	Replace uncommon fueltype specifications with 'Other'.
`to_dict_if_string`(s)	Convenience function to ensure dict-like output
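
Helpers such as `projectID_to_dict` and `to_dict_if_string` deal with dictionaries that were stored as text, for example after a round trip through CSV. A minimal sketch of that conversion, assuming the stored text is a Python literal, might look as follows. The function name `to_dict_if_string_sketch` and the sample identifiers are hypothetical, and the library's own implementation may differ.

```python
import ast


def to_dict_if_string_sketch(value):
    """Parse a string holding a dict literal; return other values unchanged."""
    if isinstance(value, str):
        # literal_eval only evaluates Python literals, never arbitrary code.
        return ast.literal_eval(value)
    return value


# Made-up identifiers, for illustration only.
stored = "{'OPSD': ['BNA0123'], 'GEO': ['GEO42']}"
parsed = to_dict_if_string_sketch(stored)
already_dict = to_dict_if_string_sketch({"OPSD": ["BNA0123"]})
print(parsed["GEO"], already_dict)
```

Using `ast.literal_eval` instead of `eval` keeps the conversion safe when the input comes from a file that may have been edited by hand.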