Dataset (batdata.data)#

Objects that represent battery datasets

class batdata.data.BatteryDataset(metadata: BatteryMetadata | dict | None = None, raw_data: DataFrame | None = None, cycle_stats: DataFrame | None = None, eis_data: DataFrame | None = None)#

Bases: object

Holder for all data associated with tests for a battery.

Attributes of this class define different views of the data (e.g., raw time-series, per-cycle statistics) or different types of data (e.g., EIS), along with the metadata for the class.

I/O with BatteryDataFrame#

This class provides I/O operations that store and retrieve the battery data and metadata in particular formats. The operations are named [to|from]_batdata_[format], where format is one of:

  • hdf: Data are stored in the “table” format from PyTables. Metadata are stored as an attribute to the

  • dict: Data as a Python dictionary object with two keys: “metadata” for the battery metadata and “data” with the cycling data in “list” format ({“column”->[“values”]})

  • parquet: Data are written to a directory of Parquet files, with a file for each type of data. The metadata for the dataset are saved as well.

Many of the methods use existing Pandas implementations of I/O operations, with slight modifications to encode the metadata and to ensure a standardized format.
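As a rough illustration of the dict layout described above, the sketch below builds a dictionary with a “metadata” key and a “data” key that holds the columns in “list” format, using only the standard library. The column names (test_time, voltage), the metadata field, and both helper functions are assumptions made for illustration; they are not taken from the batdata schema, and the output of to_batdata_dict may differ in detail.

```python
from typing import Any, Dict, List


def rows_to_list_format(rows: List[Dict[str, Any]]) -> Dict[str, List[Any]]:
    """Convert row-oriented records to the column -> [values] layout."""
    columns: Dict[str, List[Any]] = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def list_format_to_rows(columns: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Invert rows_to_list_format, assuming all columns have equal length."""
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


# Hypothetical measurements; the column names are illustrative only
rows = [
    {"test_time": 0.0, "voltage": 3.70},
    {"test_time": 1.0, "voltage": 3.69},
]

# Two top-level keys, as described above: "metadata" and "data"
as_dict = {
    "metadata": {"name": "example-cell"},  # hypothetical metadata field
    "data": rows_to_list_format(rows),
}
```

The “list” layout keeps each column contiguous, which matches how a column-oriented data frame would be rebuilt from the dictionary.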

Parameters:
  • metadata – Metadata that describe the battery construction, data provenance and testing routines

  • raw_data – Time-series data of the battery state

  • cycle_stats – Summaries of each cycle

  • eis_data – EIS data taken at multiple times

classmethod all_cells_from_batdata_hdf(path: str | Path, subsets: Collection[str] | None = None) Iterator[Tuple[str, BatteryDataset]]#

Iterate over all cells in an HDF file

Parameters:
  • path – Path to the HDF file

  • subsets – Which subsets of data to read from the data file (e.g., raw_data, cycle_stats)

Yields:
  • Name of the cell

  • Cell data

cycle_stats: DataFrame | None = None#

Summary statistics of each cycle

eis_data: DataFrame | None = None#

Electrochemical Impedance Spectroscopy (EIS) data

classmethod from_batdata_dict(d)#

Read battery data and metadata from a dictionary format

classmethod from_batdata_hdf(path_or_buf: str | Path | HDFStore, subsets: Collection[str] | None = None, prefix: str | None | int = None) BatteryDataset#

Read the battery data from an HDF file

Use all_cells_from_batdata_hdf() to read all datasets from a file.

Parameters:
  • path_or_buf – File path or HDFStore object

  • subsets – Which subsets of data to read from the data file (e.g., raw_data, cycle_stats)

  • prefix – (str) Prefix designating which battery to extract from this file, or (int) index within the list of available prefixes, sorted alphabetically. The default is to read the default prefix (None).
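The three accepted forms of prefix can be sketched with a small standalone helper. This is a minimal sketch and not the library's implementation: the function name resolve_prefix, the example prefixes, and the choice to raise ValueError for an unknown name are all assumptions.

```python
from typing import List, Optional, Union


def resolve_prefix(available: List[str], prefix: Union[str, int, None]) -> Optional[str]:
    """Pick a prefix by name, by index into the sorted names, or the default."""
    if prefix is None:
        return None  # Default (unnamed) prefix
    if isinstance(prefix, int):
        # Integer: index into the alphabetically sorted list of prefixes
        return sorted(available)[prefix]
    if prefix not in available:
        # Assumed behavior for an unknown name; the library may differ
        raise ValueError(f"No battery with prefix {prefix!r}")
    return prefix


# Hypothetical prefixes stored in one file
prefixes = ["cell_b", "cell_a"]
first = resolve_prefix(prefixes, 0)
named = resolve_prefix(prefixes, "cell_b")
```

Sorting before indexing makes an integer prefix refer to the same battery regardless of the order in which the batteries were written.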

classmethod from_batdata_parquet(path: str | Path, subsets: Collection[str] | None = None)#

Read the battery data from a directory of Parquet files

Parameters:
  • path – Path to a directory containing Parquet files for a specific battery

  • subsets – Which subsets of data to read from the data file (e.g., raw_data, cycle_stats)

static get_metadata_from_hdf5(path: str | Path) BatteryMetadata#

Get battery metadata from an HDF file without reading the data

Parameters:

path – Path to the HDF5 file

Returns:

Metadata from this file

static get_metadata_from_parquet(path: str | Path) BatteryMetadata#

Get battery metadata from a directory of parquet files without reading them

Parameters:

path – Path to the directory of Parquet files

Returns:

Metadata from the files

static inspect_batdata_hdf(path_or_buf: str | Path | HDFStore) tuple[BatteryMetadata, Set[str | None]]#

Extract the battery metadata and the prefixes of cells contained within an HDF5 file

Parameters:

path_or_buf – Path to the HDF5 file, or HDFStore object

Returns:

  • Metadata from this file

  • Names of the batteries stored within the file

metadata: BatteryMetadata#

Metadata for the battery construction and testing

raw_data: DataFrame | None = None#

Time-series data capturing the state of the battery as a function of time

to_batdata_dict() dict#

Generate data in dictionary format

Returns:

Data in dictionary format

to_batdata_hdf(path_or_buf: str | Path | HDFStore, prefix: str | None = None, append: bool = False, complevel: int = 0, complib: str = 'zlib')#

Save the data in the standardized HDF5 file format

This function wraps the to_hdf function of Pandas and supplies fixed values for some options so that the data is written in a reproducible format.

Parameters:
  • path_or_buf – File path or HDFStore object.

  • prefix – Prefix to use to differentiate this battery from (optionally) others stored in this HDF5 file

  • append – Whether to add to an existing HDF5 file rather than clearing its contents before writing

  • complevel – Specifies a compression level for data. A value of 0 disables compression.

  • complib – Specifies the compression library to be used.

to_batdata_parquet(path: Path | str, overwrite: bool = True) Dict[str, Path]#

Write battery data to a directory of Parquet files

Parameters:
  • path – Path of the directory in which to write the files

  • overwrite – Whether to overwrite an existing directory

Returns:

Map of the name of the subset to

validate() List[str]#

Validate the data stored in this object

Ensures that the data are valid according to schemas and makes recommendations of improvements that one could make to increase the re-usability of the data.

Returns:

Recommendations to improve data re-use

validate_columns(allow_extra_columns: bool = True)#

Determine whether the column types are appropriate

Parameters:

allow_extra_columns – Whether to allow unexpected columns

Raises:

ValueError – If the dataset fails validation
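The effect of allow_extra_columns can be illustrated with a standalone check over columns in “list” format. This is a sketch under stated assumptions, not the library's validation logic: the function check_columns, the EXPECTED mapping, and the column names are all hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical schema: column name -> required Python type
EXPECTED: Dict[str, type] = {"test_time": float, "cycle_number": int}


def check_columns(
    data: Dict[str, List[Any]],
    expected: Dict[str, type],
    allow_extra_columns: bool = True,
) -> None:
    """Raise ValueError if a column is unexpected or holds the wrong type."""
    for name, values in data.items():
        if name not in expected:
            if not allow_extra_columns:
                raise ValueError(f"Unexpected column: {name}")
            continue  # Extra columns are tolerated and not type-checked
        if not all(isinstance(v, expected[name]) for v in values):
            raise ValueError(f"Column {name} is not of type {expected[name].__name__}")


# Passes: the extra "note" column is allowed by default
check_columns({"test_time": [0.0, 1.0], "note": ["a", "b"]}, EXPECTED)
```

Calling the same check with allow_extra_columns=False raises ValueError for the "note" column, which is useful when a dataset must contain only columns defined by the schema.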