Reading and Writing Datasets

The battdat.io module provides tools to read data into BatteryDataset objects and to write BatteryDataset objects out to other formats.

Format                                             Module       Reading  Writing
-------------------------------------------------  -----------  -------  -------
Arbin                                              arbin        ✔️       ✖️
Battery Archive (https://www.batteryarchive.org)   ba           ✖️       ✔️
Battery Data Hub (https://batterydata.energy.gov)  batterydata  ✔️       ✖️
HDF5                                               hdf          ✔️       ✔️
MACCOR                                             maccor       ✔️       ✖️
Parquet                                            parquet      ✔️       ✔️

Note

The parquet and hdf modules read and write the file formats defined by battery-data-toolkit itself.

Reading Data

DatasetReader classes create a dataset through their read_dataset method. The inputs to read_dataset always include a BatteryMetadata object, which carries information beyond what is available in the files.

Most DatasetReader classes read data from a filesystem and are based on DatasetFileReader. These readers take a list of paths to data files alongside the metadata, and also include methods (e.g., group()) to find files:

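from battdat.io.batterydata import BDReader

# Create a reader for Battery Data Hub files
extractor = BDReader(store_all=True)
# Take the first group of files found under the path
group = next(extractor.identify_files('./example-path/'))
# Parse that group of files into a single dataset
dataset = extractor.read_dataset(group)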

The type of output dataset is defined by the output_class attribute. Most uses of readers do not require modifying this attribute.
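The find-files-then-read workflow can be illustrated without the library installed. The following is a minimal sketch of the pattern only, not the battdat implementation: SketchCSVReader and SketchDataset are hypothetical names, the rule of grouping files by the text before the first hyphen is an assumption made for the example, and a plain dict stands in for the BatteryMetadata object.

```python
import csv
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SketchDataset:
    """Hypothetical stand-in for a dataset: parsed rows plus metadata."""
    rows: list
    metadata: dict = field(default_factory=dict)


class SketchCSVReader:
    """Hypothetical reader that mimics the identify-then-read pattern."""

    # Type of object returned by read_dataset; subclasses could override it
    output_class = SketchDataset

    def identify_files(self, root):
        """Yield groups of CSV files that belong to the same cell."""
        groups = {}
        for path in sorted(Path(root).rglob('*.csv')):
            # Assumed naming rule: '<cell name>-<file index>.csv'
            groups.setdefault(path.name.split('-')[0], []).append(path)
        for key in sorted(groups):
            yield tuple(groups[key])

    def read_dataset(self, group, metadata=None):
        """Concatenate the rows of every file in the group, in order."""
        rows = []
        for path in group:
            with open(path, newline='') as fp:
                rows.extend(csv.DictReader(fp))
        return self.output_class(rows=rows, metadata=dict(metadata or {}))


# Build a few small example files, then find and read the first group
with tempfile.TemporaryDirectory() as root:
    examples = [('cellA-1.csv', 3.7), ('cellA-2.csv', 3.8), ('cellB-1.csv', 4.1)]
    for name, voltage in examples:
        Path(root, name).write_text(f'test_time,voltage\n0.0,{voltage}\n')

    reader = SketchCSVReader()
    groups = list(reader.identify_files(root))
    dataset = reader.read_dataset(groups[0], metadata={'name': 'cellA'})
```

Keeping the output type in a class attribute means a subclass can change what read_dataset returns without overriding the method, which is the role the text describes for output_class.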

Writing Data

DatasetWriter classes write battdat.data.BatteryDataset objects into forms usable by other tools.

For example, the BatteryArchiveWriter converts the metadata into the schema used by Battery Archive and writes the data in the format that site prefers: CSV files of at most 100,000 rows each.

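from battdat.io.ba import BatteryArchiveWriter

# example_data is a BatteryDataset created or loaded earlier
exporter = BatteryArchiveWriter()
# Write the converted files into the ./to-upload directory
exporter.export(example_data, './to-upload')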
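To make the row limit concrete, the sketch below splits a stream of rows into CSV files that each hold at most max_rows data rows. It uses only the standard library and is not how BatteryArchiveWriter is implemented: write_csv_chunks, the 'timeseries-N.csv' file names, and the column names are all hypothetical choices for illustration.

```python
import csv
import tempfile
from itertools import islice
from pathlib import Path


def write_csv_chunks(rows, fieldnames, out_dir, stem='timeseries', max_rows=100_000):
    """Write rows to numbered CSV files, each with at most max_rows data rows."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    row_iter = iter(rows)
    paths = []
    while True:
        # Pull the next chunk lazily so the full dataset never sits in memory
        chunk = list(islice(row_iter, max_rows))
        if not chunk:
            break
        path = out_dir / f'{stem}-{len(paths)}.csv'
        with open(path, 'w', newline='') as fp:
            writer = csv.DictWriter(fp, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(chunk)
        paths.append(path)
    return paths


# Demonstrate with a small limit: 250 rows split into chunks of 100
with tempfile.TemporaryDirectory() as out_dir:
    rows = ({'test_time': float(i), 'voltage': 3.0 + 0.001 * i} for i in range(250))
    paths = write_csv_chunks(rows, ['test_time', 'voltage'], out_dir, max_rows=100)
    file_names = [p.name for p in paths]
    row_counts = []
    for p in paths:
        with open(p) as fp:
            # Subtract one for the header line
            row_counts.append(sum(1 for _ in fp) - 1)
```

Each file repeats the header so that it can be loaded on its own, and the last file simply holds whatever rows remain.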