compare_two_datasets


powerplantmatching.matching.compare_two_datasets(dfs, labels, country_wise=True, config=None, **dukeargs)

Duke-based horizontal match of two databases. Returns a multi-indexed pandas.DataFrame that contains only the matched entries. The columns ['Name', 'Fueltype', 'Technology', 'Country', 'Capacity', 'lat', 'lon'] are compared to identify the same power plant in the two datasets. Matching runs in one-to-one mode: every entry of either input database has at most one link, so the resulting dataframe contains unique entries.

Attention: Aborting this command does not stop the Duke process, which keeps running in the background. Wait until that process has finished before restarting.
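The one-to-one mode described above can be illustrated with a small pandas sketch. This is not the Duke-based implementation used by the function; it is a simple greedy reduction, and the `links` table, its column names (`id_a`, `id_b`, `score`), and the helper `reduce_to_one_to_one` are all hypothetical names introduced for this example.

```python
import pandas as pd

# Hypothetical candidate links between rows of two datasets, with a
# similarity score per link (all names and values are made up).
links = pd.DataFrame(
    {
        "id_a": [0, 0, 1, 2],
        "id_b": [10, 11, 11, 12],
        "score": [0.95, 0.80, 0.90, 0.70],
    }
)


def reduce_to_one_to_one(links: pd.DataFrame) -> pd.DataFrame:
    """Greedily keep the best-scoring links so that every id appears
    at most once on each side."""
    used_a, used_b, kept = set(), set(), []
    ranked = links.sort_values("score", ascending=False)
    for row in ranked.itertuples(index=False):
        # Skip a link if either endpoint is already matched.
        if row.id_a in used_a or row.id_b in used_b:
            continue
        kept.append(row)
        used_a.add(row.id_a)
        used_b.add(row.id_b)
    return pd.DataFrame(kept, columns=links.columns)


matched = reduce_to_one_to_one(links)
```

Here the link between `id_a = 0` and `id_b = 11` is dropped because both entries already have a better-scoring partner, which leaves each entry with at most one link.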

Parameters:
  • dfs (list of pandas.DataFrame or strings) – dataframes or CSV files to use for the matching

  • labels (list of strings) – Names of the databases for the resulting dataframe
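A minimal sketch of how the two parameters above might be prepared, assuming each input only needs the columns named in the description. The plant data, the labels `SOURCE_A` and `SOURCE_B`, and the wrapper `run_match` are made up for illustration; the real function may expect further columns or configuration, and it needs the Duke backend to be available.

```python
import pandas as pd

# Columns the description lists as being compared.
COLUMNS = ["Name", "Fueltype", "Technology", "Country", "Capacity", "lat", "lon"]

# Two tiny, made-up datasets describing the same plant slightly differently.
df_a = pd.DataFrame(
    [["Example Plant", "Hard Coal", "Steam Turbine", "Germany", 800.0, 51.0, 7.0]],
    columns=COLUMNS,
)
df_b = pd.DataFrame(
    [["Example Power Plant", "Hard Coal", "Steam Turbine", "Germany", 790.0, 51.01, 7.01]],
    columns=COLUMNS,
)

dfs = [df_a, df_b]                 # one dataframe per database
labels = ["SOURCE_A", "SOURCE_B"]  # one label per entry in dfs


def run_match(dfs, labels):
    """Hypothetical wrapper; call it only where powerplantmatching is installed."""
    # Imported lazily so the input preparation above runs without the package.
    from powerplantmatching.matching import compare_two_datasets

    return compare_two_datasets(dfs, labels, country_wise=True)
```

`run_match` is deliberately not called here, since starting a match launches the background Duke process mentioned above.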