Column Schemas

The contents of each data table available with a dataset are described using a ColumnSchema. The schema is a collection of ColumnInfo objects, one per column, each of which records:

  1. Description: An English description of the contents

  2. Type: Type of each record (e.g., integer, string)

  3. Units: Units for the values, if applicable

  4. Required: Whether the column must be present in the table

  5. Monotonic: Whether values never decrease between sequential rows
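
The five fields above can be pictured as a small record type. The sketch below is a hypothetical stand-in written with the standard library only: `ColumnInfoSketch` is an illustrative name, not the toolkit's ColumnInfo class, and the default values are assumptions.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class ColumnInfoSketch:
    """Hypothetical stand-in mirroring the five fields listed above."""
    description: str              # English description of the contents
    type: str                     # e.g. 'integer', 'float', 'string'
    units: Optional[str] = None   # None when units do not apply
    required: bool = False        # assumed default: the column is optional
    monotonic: bool = False       # assumed default: no ordering constraint


test_time = ColumnInfoSketch(
    description='Time from the beginning of measurements',
    type='float',
    units='s',
    monotonic=True,  # assumption for illustration: elapsed time never decreases
)
print(asdict(test_time))
```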

Using a Column Schema

The ColumnSchema objects stored inside the HDF5 and Parquet files provided by the battery-data-toolkit serve two purposes: describing existing data and validating new data.

List the column names with the columns attribute, and access the information for a single column by indexing the schema with the column name:

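from battdat.data import BatteryDataset

# out_path is the path to an HDF5 file written earlier by the toolkit
data = BatteryDataset.from_battdat_hdf(out_path)
schema = data.schemas['eis_data']  # ColumnSchema for the ``eis_data`` table
print(schema['test_id'].model_dump())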

The code above prints the information recorded for the test_id column:

{'required': True,
 'type': <DataType.INTEGER: 'integer'>,
 'description': 'Integer used to identify rows belonging to the same experiment.',
 'units': None,
 'monotonic': False}

Use validate_dataframe() to check whether a dataframe meets the requirements for each column.
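
To make such checks concrete, the sketch below mimics two of them (required columns are present, monotonic columns never decrease) on a plain dictionary of lists. `check_table` and the dictionary-based `schema_sketch` are hypothetical helpers for illustration only; the actual behavior and error reporting of validate_dataframe() may differ.

```python
from typing import Dict, List


def check_table(table: Dict[str, List[float]], schema: Dict[str, dict]) -> List[str]:
    """Return a list of problems found; an empty list means the table passed.

    Hypothetical helper: `schema` maps column name -> {'required': bool, 'monotonic': bool}.
    """
    problems = []
    for name, info in schema.items():
        if name not in table:
            if info.get('required', False):
                problems.append(f'missing required column: {name}')
            continue
        values = table[name]
        if info.get('monotonic', False):
            # Monotonic: no value is smaller than the value in the previous row
            if any(b < a for a, b in zip(values, values[1:])):
                problems.append(f'column is not monotonic: {name}')
    return problems


# Assumed requirements, chosen only for this illustration
schema_sketch = {
    'test_time': {'required': True, 'monotonic': True},
    'voltage': {'required': False, 'monotonic': False},
}
good = {'test_time': [0.0, 1.0, 1.0, 2.5], 'voltage': [3.7, 3.6, 3.6, 3.5]}
bad = {'voltage': [3.7, 3.6]}

print(check_table(good, schema_sketch))
print(check_table(bad, schema_sketch))
```

Returning a list of problems rather than raising on the first one makes it easy to report every failing column at once.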

Pre-defined Schema

The battery-data-toolkit provides schemas for common types of data (e.g., cycling data for single cells, EIS).

RawData

Source Object: battdat.schemas.column.RawData

Data describing measurements of a single cell

| Column | Description | Units |
|---|---|---|
| file_number | Which file a row came from, if the data was originally split into multiple files | None |
| state | Whether the battery is being charged, discharged or otherwise. | None |
| method | Method to control the charge or discharge | None |
| cycle_number | Index of the testing cycle, starting at 0. | None |
| step_index | Index of the step number within a testing cycle. A step change is defined by a change of state between charging, discharging, or resting. | None |
| substep_index | Change of the control method within a cycle. | None |
| test_time | Time from the beginning of measurements | s |
| voltage | Measured voltage of the system | V |
| current | Measured current of the system. Positive current represents the battery charging. | A |
| internal_resistance | Internal resistance of the battery. | ohm |
| time | Time as a UNIX timestamp. | s |
| temperature | Temperature of the battery | C |
| cycle_time | Time from the beginning of a cycle | s |
| cycle_capacity | Cumulative change in amount of charge transferred from a battery since the start of a cycle. Positive values indicate the battery has discharged since the start of the cycle. | A-hr |
| cycle_energy | Cumulative change in amount of energy transferred from a battery since the start of a cycle. Positive values indicate the battery has discharged since the start of the cycle. | J |
| cycle_capacity_charge | Cycle capacity computed only during the ‘charging’ phase of a cycle | A-hr |
| cycle_capacity_discharge | Cycle capacity computed only during the ‘discharging’ phase of a cycle | A-hr |
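
The sign conventions in this table interact: current is positive while the battery charges, whereas cycle_capacity is positive once the battery has discharged. The sketch below shows one way the two could be related, assuming trapezoidal integration of current over cycle_time. `cycle_capacity_sketch` is a hypothetical helper, and the toolkit may compute the column differently.

```python
from typing import List


def cycle_capacity_sketch(cycle_time: List[float], current: List[float]) -> List[float]:
    """Cumulative charge transferred from the battery, in A-hr.

    cycle_time is in seconds and current in amperes (positive = charging).
    """
    capacity = [0.0]
    for i in range(1, len(cycle_time)):
        dt = cycle_time[i] - cycle_time[i - 1]
        mean_current = 0.5 * (current[i] + current[i - 1])
        # Negate: charging current reduces the charge transferred *from* the battery
        capacity.append(capacity[-1] - mean_current * dt / 3600.0)
    return capacity


# One hour of discharging at a constant 2 A, sampled every 30 minutes
times = [0.0, 1800.0, 3600.0]
currents = [-2.0, -2.0, -2.0]
print(cycle_capacity_sketch(times, currents))  # [0.0, 1.0, 2.0]
```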

CycleLevelData

Source Object: battdat.schemas.column.CycleLevelData

Statistics about the performance of a cell over entire cycles

| Column | Description | Units |
|---|---|---|
| cycle_number | Index of the cycle | None |
| cycle_start | Time since the first data point recorded for this battery for the start of this cycle | s |
| cycle_duration | Duration of this cycle | s |
| capacity_discharge | Total amount of electrons released during discharge | A-hr |
| energy_discharge | Total amount of energy released during discharge | W-hr |
| capacity_charge | Total amount of electrons stored during charge | A-hr |
| energy_charge | Total amount of energy stored during charge | W-hr |
| coulomb_efficiency | Fraction of electric charge that is lost during charge and recharge | % |
| energy_efficiency | Amount of energy lost during charge and discharge | None |
| discharge_V_average | Average voltage during discharging | V |
| charge_V_average | Average voltage during charge | V |
| V_maximum | Maximum voltage during cycle | V |
| V_minimum | Minimum voltage during cycle | V |
| discharge_I_average | Average current during discharge | A |
| charge_I_average | Average current during charge | A |
| temperature_minimum | Minimum observed battery temperature during cycle | C |
| temperature_maximum | Maximum observed battery temperature during cycle | C |
| temperature_average | Average observed battery temperature during cycle | C |
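
Several of these columns are per-cycle reductions of row-level measurements. As a rough illustration, assuming raw rows supplied as (cycle_number, test_time, voltage) tuples and a hypothetical helper `summarize_cycles`, the values of cycle_start, cycle_duration, V_maximum and V_minimum might be derived as follows. The toolkit's own definitions (for example, how the end of a cycle is determined) may differ.

```python
from typing import Dict, List, Tuple


def summarize_cycles(rows: List[Tuple[int, float, float]]) -> List[Dict[str, float]]:
    """Group (cycle_number, test_time, voltage) rows by cycle and reduce each group."""
    by_cycle: Dict[int, List[Tuple[float, float]]] = {}
    for cycle_number, test_time, voltage in rows:
        by_cycle.setdefault(cycle_number, []).append((test_time, voltage))

    first_time = min(t for _, t, _ in rows)  # first data point recorded for this battery
    summary = []
    for cycle_number in sorted(by_cycle):
        times = [t for t, _ in by_cycle[cycle_number]]
        volts = [v for _, v in by_cycle[cycle_number]]
        summary.append({
            'cycle_number': cycle_number,
            'cycle_start': min(times) - first_time,     # s
            'cycle_duration': max(times) - min(times),  # s; assumption: last row minus first row
            'V_maximum': max(volts),                    # V
            'V_minimum': min(volts),                    # V
        })
    return summary


rows = [
    (0, 0.0, 3.0), (0, 10.0, 4.2), (0, 20.0, 3.1),
    (1, 30.0, 3.1), (1, 40.0, 4.1), (1, 50.0, 3.0),
]
for record in summarize_cycles(rows):
    print(record)
```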

EISData

Source Object: battdat.schemas.eis.EISData

Measurements for a specific EIS test

| Column | Description | Units |
|---|---|---|
| test_id | Integer used to identify rows belonging to the same experiment. | None |
| test_time | Time from the beginning of measurements. | s |
| time | Time as a UNIX timestamp. Assumed to be in UTC | None |
| frequency | Applied frequency | Hz |
| z_real | Real component of impedance | Ohm |
| z_imag | Imaginary component of impedance | Ohm |
| z_mag | Magnitude of impedance | Ohm |
| z_phase | Phase angle of the impedance | Degree |
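
The four impedance columns are related in the usual complex-number sense: magnitude and phase follow from the real and imaginary components. A small sketch of that relationship, using only the standard library, is given here. `polar_impedance` is a hypothetical helper, and whether the toolkit checks this consistency is not stated in this section.

```python
import math
from typing import Tuple


def polar_impedance(z_real: float, z_imag: float) -> Tuple[float, float]:
    """Return (z_mag in Ohm, z_phase in degrees) for an impedance z_real + j*z_imag."""
    z_mag = math.hypot(z_real, z_imag)
    z_phase = math.degrees(math.atan2(z_imag, z_real))
    return z_mag, z_phase


z_mag, z_phase = polar_impedance(3.0, -4.0)
print(z_mag)    # 5.0
print(z_phase)  # roughly -53.13
```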

Defining a New Column Schema

Document a new type of data by either creating a subclass of ColumnSchema or adding individual columns to an existing schema.

from battdat.schemas.column import RawData, ColumnInfo

schema = RawData()  # Schema for sensor measurements of a cell
# The field is named ``type``, matching the key printed by model_dump() above
schema.extra_columns['room_temp'] = ColumnInfo(
    description='Temperature of the room as measured by the HVAC system',
    units='C', type='float',
)