Dataset (battdat.data)#

Objects that represent battery datasets

class battdat.data.BatteryDataset(tables: Dict[str, DataFrame], schemas: Dict[str, ColumnSchema], metadata: BatteryMetadata | None = None, check_schemas: bool = True)#

Bases: Mapping[str, DataFrame]

Base class for all battery datasets.

Not intended to be instantiated directly by users. Defines the functions that validate the data, read it from HDF5 or Parquet files, and write it to those formats.

Parameters:
  • tables – Subsets which compose this larger dataset

  • metadata – Metadata for the entire dataset

  • schemas – Schemas describing each subset

  • check_schemas – Whether to raise an error if any table lacks a schema
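Because the class derives from Mapping[str, DataFrame], a dataset can be iterated and indexed like a read-only dictionary of tables. The following is a minimal sketch of that idea; `table_sizes` is a hypothetical helper that is not part of battdat, and it relies only on the Mapping interface and on `len()` of each table.

```python
from typing import Dict, Mapping, Sized


def table_sizes(dataset: Mapping[str, Sized]) -> Dict[str, int]:
    """Return the number of rows in each table of a mapping-like dataset.

    Hypothetical helper, not part of battdat. For a pandas DataFrame,
    ``len(table)`` is the number of rows.
    """
    return {name: len(table) for name, table in dataset.items()}


# Intended use, assuming `dataset` is an already-loaded BatteryDataset:
#   sizes = table_sizes(dataset)
```

The helper accepts any mapping, so it can be tried on a plain dict of lists before pointing it at real data.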

classmethod all_cells_from_hdf(path: str | Path, subsets: Collection[str] | None = None) Iterator[Tuple[str, CellDataset]]#

Iterate over all cells in an HDF file

Parameters:
  • path – Path to the HDF file

  • subsets – Which subsets of data to read from the data file (e.g., raw_data, cycle_stats)

Yields:
  • Name of the cell

  • Cell data
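Since the iterator yields (name, dataset) pairs, it can be consumed with an ordinary loop or comprehension. A minimal sketch, assuming each yielded dataset behaves as a mapping of table names; `tables_per_cell` is a hypothetical helper and the file name in the comment is a placeholder.

```python
from typing import Dict, Iterable, List, Mapping, Tuple


def tables_per_cell(cells: Iterable[Tuple[str, Mapping]]) -> Dict[str, List[str]]:
    """Map each cell name to the sorted names of the tables it contains.

    Hypothetical helper, not part of battdat.
    """
    return {name: sorted(cell.keys()) for name, cell in cells}


# Intended use (requires battdat and an existing file; the path is a placeholder):
#   from battdat.data import BatteryDataset
#   summary = tables_per_cell(BatteryDataset.all_cells_from_hdf("cells.h5"))
```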

classmethod from_hdf(path_or_buf: str | Path | File, tables: Collection[str] | None = None, prefix: str | int | None = None) BatteryDataset#

Read the battery data from an HDF file

Use all_cells_from_hdf() to read all datasets from a file.

Parameters:
  • path_or_buf – File path or an open HDF5 file object (File)

  • tables – Which subsets of data to read from the data file (e.g., raw_data, cycle_stats)

  • prefix – (str) Prefix designating which battery to extract from this file, or (int) index within the list of available prefixes, sorted alphabetically. The default is to read the default prefix (None).

classmethod from_parquet(path: str | Path, subsets: Collection[str] | None = None)#

Read the battery data from a directory of Parquet files

Parameters:
  • path – Path to a directory containing Parquet files

  • subsets – Which subsets of data to read from the directory (e.g., raw_data, cycle_stats)

static get_metadata_from_hdf5(path: str | Path) BatteryMetadata#

Get battery metadata from an HDF file without reading the data

Parameters:

path – Path to the HDF5 file

Returns:

Metadata from this file

static inspect_hdf(path_or_buf: str | Path | File) tuple[BatteryMetadata, Set[str | None]]#

Extract the battery metadata and the prefixes of the cells contained within an HDF5 file

Parameters:

path_or_buf – Path to the HDF5 file, or an open HDF5 file object (File)

Returns:

  • Metadata from this file

  • Set of names of the batteries stored within the file (prefixes)
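The prefixes returned here can be passed one at a time to from_hdf(). The sketch below shows that pattern using only the signatures documented on this page; `load_every_prefix` is a hypothetical helper, and all_cells_from_hdf() remains the built-in way to iterate over a whole file.

```python
from typing import Any, Dict, Optional


def load_every_prefix(dataset_cls: Any, path: str) -> Dict[Optional[str], Any]:
    """Load one dataset per prefix found in an HDF5 file.

    Hypothetical helper, not part of battdat. `dataset_cls` is expected to
    provide `inspect_hdf(path)` and `from_hdf(path, prefix=...)`.
    """
    # inspect_hdf returns (metadata, prefixes); only the prefixes are needed here
    _metadata, prefixes = dataset_cls.inspect_hdf(path)
    return {prefix: dataset_cls.from_hdf(path, prefix=prefix) for prefix in prefixes}


# Intended use (requires battdat; the path is a placeholder):
#   from battdat.data import BatteryDataset
#   datasets = load_every_prefix(BatteryDataset, "cells.h5")
```

This reopens the file once per prefix, which is acceptable for a handful of cells but may be slow for large collections.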

static inspect_parquet(path: str | Path) BatteryMetadata#

Get battery metadata from a directory of Parquet files without reading them

Parameters:

path – Path to the directory of Parquet files

Returns:

Metadata from the files

metadata: BatteryMetadata#

Information describing the source of a dataset

schemas: Dict[str, ColumnSchema]#

Schemas describing each dataset

tables: Dict[str, DataFrame]#

Datasets available for users

to_hdf(path_or_buf: str | Path | File, prefix: str | None = None, overwrite: bool = True, complevel: int = 0, complib: str = 'zlib')#

Save the data in the standardized HDF5 file format

This function wraps the to_hdf function of Pandas and supplies fixed values for some options so that the data is written in a reproducible format.

Parameters:
  • path_or_buf – File path or an open HDF5 file object (File).

  • prefix – Prefix to use to differentiate this battery from (optionally) others stored in this HDF5 file

  • overwrite – Whether to delete an existing HDF5 file

  • complevel – Specifies a compression level for data. A value of 0 disables compression.

  • complib – Specifies the compression library to be used.
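The prefix and overwrite options interact when several batteries share one file: with the default overwrite=True, each call would delete what earlier calls wrote. The sketch below is one way to handle that, under the assumption that overwrite=False adds to an existing file; `write_cells_to_one_file` is a hypothetical helper, not part of battdat.

```python
from typing import Any, Mapping


def write_cells_to_one_file(cells: Mapping[str, Any], path: str, complevel: int = 0) -> None:
    """Write several datasets into a single HDF5 file, one prefix per dataset.

    Hypothetical helper, not part of battdat. Assumes that calling `to_hdf`
    with `overwrite=False` keeps the data already stored in the file.
    """
    for index, (prefix, cell) in enumerate(cells.items()):
        # Only the first write may replace an existing file; later writes must
        # keep it, otherwise each cell would delete the ones written before it.
        cell.to_hdf(path, prefix=prefix, overwrite=(index == 0), complevel=complevel)


# Intended use, assuming `cell_a` and `cell_b` are loaded datasets:
#   write_cells_to_one_file({"cell_a": cell_a, "cell_b": cell_b}, "cells.h5")
```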

to_parquet(path: Path | str, overwrite: bool = True, **kwargs) Dict[str, Path]#

Write battery data to a directory of Parquet files

Keyword arguments are passed to write_table().

Parameters:
  • path – Directory in which to write the files

  • overwrite – Whether to overwrite an existing directory

Returns:

Map of the name of each subset to the path to which it was written
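A write followed by a read is a quick way to check that a dataset survives the Parquet format. A minimal sketch using only the to_parquet() and from_parquet() signatures documented on this page; `parquet_round_trip` is a hypothetical helper and the directory name in the comment is a placeholder.

```python
from typing import Any, Dict, Tuple


def parquet_round_trip(dataset: Any, directory: str) -> Tuple[Dict[str, Any], Any]:
    """Write a dataset to a directory of Parquet files, then read it back.

    Hypothetical helper, not part of battdat. Returns the mapping produced by
    `to_parquet` together with the reloaded dataset.
    """
    written = dataset.to_parquet(directory, overwrite=True)
    # Reload with the same class that wrote the files
    reloaded = type(dataset).from_parquet(directory)
    return written, reloaded


# Intended use, assuming `dataset` is a loaded BatteryDataset:
#   written, reloaded = parquet_round_trip(dataset, "cell_parquet")
```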

validate() List[str]#

Validate the data stored in this object

Ensures that the data are valid according to the schemas and recommends improvements that would increase the reusability of the data.

Returns:

Recommendations to improve data re-use

validate_columns(allow_extra_columns: bool = True)#

Determine whether the column types are appropriate

Parameters:

allow_extra_columns – Whether to allow unexpected columns

Raises:

ValueError – If the dataset fails validation
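The two validation methods can be combined into a single check: validate_columns() signals failure with a ValueError, while validate() returns recommendations. The sketch below shows one possible wrapper; `check_dataset` is a hypothetical helper, and whether validate() can itself raise is not stated here, so the wrapper does not catch errors from it.

```python
from typing import Any, List, Tuple


def check_dataset(dataset: Any, strict: bool = False) -> Tuple[bool, List[str]]:
    """Return (is_valid, messages) for a dataset.

    Hypothetical helper, not part of battdat. With `strict=True`, columns
    that are not described by a schema are rejected.
    """
    try:
        dataset.validate_columns(allow_extra_columns=not strict)
    except ValueError as exc:
        # Column validation failed: report the error text and stop
        return False, [str(exc)]
    # Columns are acceptable: collect the recommendations for better re-use
    return True, list(dataset.validate())
```

Returning the messages instead of printing them lets the caller decide whether a recommendation should be logged, shown to a user, or treated as a failure.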

class battdat.data.CellDataset(metadata: BatteryMetadata | dict | None = None, raw_data: DataFrame | None = None, cycle_stats: DataFrame | None = None, eis_data: DataFrame | None = None, schemas: Dict[str, ColumnSchema] | None = None, tables: Dict[str, DataFrame] | None = None)#

Bases: BatteryDataset

Data associated with tests for a single battery cell

Parameters:
  • metadata – Metadata that describe the battery construction, data provenance and testing routines

  • raw_data – Time-series data of the battery state

  • cycle_stats – Summaries of each cycle

  • eis_data – EIS data taken at multiple times

  • schemas – Schemas describing each of the tabular datasets

property cycle_stats: DataFrame | None#

Summary statistics of each cycle

property eis_data: DataFrame | None#

Electrochemical Impedance Spectroscopy (EIS) data

property raw_data: DataFrame | None#

Time-series data capturing the state of the battery as a function of time
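Each of the three properties may be None, so code that consumes a CellDataset usually needs to check which tables are present before using them. A minimal sketch of that check; `present_tables` and `STANDARD_TABLES` are hypothetical names, not part of battdat.

```python
from typing import Any, List

# Names of the properties documented for CellDataset
STANDARD_TABLES = ("raw_data", "cycle_stats", "eis_data")


def present_tables(cell: Any) -> List[str]:
    """List the standard tables of a cell dataset that are not None.

    Hypothetical helper, not part of battdat.
    """
    return [name for name in STANDARD_TABLES if getattr(cell, name) is not None]


# Intended use, assuming `cell` is a CellDataset:
#   if "cycle_stats" in present_tables(cell):
#       stats = cell.cycle_stats
```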