Column Schemas#
The contents of each data table available with a dataset are described using a ColumnSchema.
The schema is a collection of ColumnInfo objects, each detailing one column and including:
Description: An English description of the contents
Type: Type of each record (e.g., integer, string)
Units: Units for the values, if applicable
Required: Whether the column must be present in the table
Monotonic: Whether values never decrease between sequential rows
Using a Column Schema#
The ColumnSchema objects stored inside the HDF5 and Parquet files provided by the battery data toolkit are used to describe existing data and to validate new data.
List the column names with the columns attribute and access information for a single column through the get item method:
from battdat.data import BatteryDataset

data = BatteryDataset.from_battdat_hdf(out_path)
schema = data.schemas['eis_data']  # ColumnSchema for the `eis_data` table
print(schema['test_id'].model_dump())
The above code prints the data for a specific column.
{'required': True,
'type': <DataType.INTEGER: 'integer'>,
'description': 'Integer used to identify rows belonging to the same experiment.',
'units': None,
'monotonic': False}
Use validate_dataframe() to check whether a dataframe matches the requirements for each column.
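As a rough illustration of what a column-level check involves, the sketch below uses a simplified stand-in for ColumnInfo (a plain dataclass named `ColumnSpec`, not the toolkit's class) and a hypothetical `check_table` helper that tests the Required and Monotonic properties on a table stored as a dict of lists. It is not the implementation of validate_dataframe(), only the idea behind it, and the example schema is made up for the demonstration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass
class ColumnSpec:
    """Simplified stand-in for ColumnInfo (hypothetical, not the battdat class)."""
    description: str
    type: str = 'other'
    units: Optional[str] = None
    required: bool = False
    monotonic: bool = False


def check_table(table: Dict[str, Sequence[float]], schema: Dict[str, ColumnSpec]) -> List[str]:
    """Return a list of problems; an empty list means the table passes."""
    problems = []
    for name, spec in schema.items():
        if name not in table:
            if spec.required:
                problems.append(f'missing required column: {name}')
            continue
        values = table[name]
        # Monotonic means values never decrease between sequential rows
        if spec.monotonic and any(b < a for a, b in zip(values, values[1:])):
            problems.append(f'column is not monotonic: {name}')
    return problems


# Illustrative schema; the flags chosen here are assumptions for the demo
schema = {
    'test_id': ColumnSpec('Identifier of the experiment', type='integer', required=True),
    'test_time': ColumnSpec('Time from the beginning of measurements',
                            type='float', units='s', monotonic=True),
}

good = {'test_id': [0, 0, 1], 'test_time': [0.0, 1.0, 2.0]}
bad = {'test_time': [0.0, 2.0, 1.0]}  # no test_id, and time goes backwards

print(check_table(good, schema))
print(check_table(bad, schema))
```

Returning a list of problems rather than raising on the first one makes it easier to report every mismatch at once; the toolkit's own method may behave differently.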
Pre-defined Schema#
The battery-data-toolkit provides schemas for common types of data (e.g., cycling data for single cells, EIS).
RawData#
Source Object: battdat.schemas.column.RawData
Data describing measurements of a single cell
Column | Description | Units |
---|---|---|
file_number | Which file a row came from, if the data was originally split into multiple files | None |
state | Whether the battery is being charged, discharged or otherwise. | None |
method | Method to control the charge or discharge | None |
cycle_number | Index of the testing cycle, starting at 0. | None |
step_index | Index of the step number within a testing cycle. A step change is defined by a change in state between charging, discharging, or resting. | None |
substep_index | Change of the control method within a cycle. | None |
test_time | Time from the beginning of measurements | s |
voltage | Measured voltage of the system | V |
current | Measured current of the system. Positive current represents the battery charging. | A |
internal_resistance | Internal resistance of the battery. | ohm |
time | Time as a UNIX timestamp. | s |
temperature | Temperature of the battery | C |
cycle_time | Time from the beginning of a cycle | s |
cycle_capacity | Cumulative change in amount of charge transferred from a battery since the start of a cycle. Positive values indicate the battery has discharged since the start of the cycle. | A-hr |
cycle_energy | Cumulative change in amount of energy transferred from a battery since the start of a cycle. Positive values indicate the battery has discharged since the start of the cycle. | J |
cycle_capacity_charge | Cycle capacity computed only during the ‘charging’ phase of a cycle | A-hr |
cycle_capacity_discharge | Cycle capacity computed only during the ‘discharging’ phase of a cycle | A-hr |
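The sign conventions in this table can be checked with a small worked example. The sketch below integrates current over test_time with the trapezoidal rule (an assumption; the toolkit may integrate differently) and negates the result so that a positive cycle_capacity means the battery has discharged. The helper name `cumulative_capacity` is hypothetical.

```python
from typing import List, Sequence


def cumulative_capacity(test_time: Sequence[float], current: Sequence[float]) -> List[float]:
    """Cumulative charge transferred in A-hr from times in s and currents in A.

    Positive current means charging, so each increment is subtracted to make
    positive capacity mean the battery has discharged.
    """
    capacity = [0.0]
    for i in range(1, len(test_time)):
        dt = test_time[i] - test_time[i - 1]  # s
        mean_current = 0.5 * (current[i] + current[i - 1])  # A
        charge = mean_current * dt / 3600.0  # A-s -> A-hr
        capacity.append(capacity[-1] - charge)
    return capacity


# One hour of discharging at a constant 2 A (negative current), sampled every 30 minutes
times = [0.0, 1800.0, 3600.0]
currents = [-2.0, -2.0, -2.0]
capacity = cumulative_capacity(times, currents)
print(capacity)  # [0.0, 1.0, 2.0]
```

Note the unit change: test_time is in seconds while capacity is reported in A-hr, hence the division by 3600.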
CycleLevelData#
Source Object: battdat.schemas.column.CycleLevelData
Statistics about the performance of a cell over entire cycles
Column | Description | Units |
---|---|---|
cycle_number | Index of the cycle | None |
cycle_start | Time since the first data point recorded for this battery for the start of this cycle | s |
cycle_duration | Duration of this cycle | s |
capacity_discharge | Total amount of electrons released during discharge | A-hr |
energy_discharge | Total amount of energy released during discharge | W-hr |
capacity_charge | Total amount of electrons stored during charge | A-hr |
energy_charge | Total amount of energy stored during charge | W-hr |
coulomb_efficiency | Fraction of electric charge that is lost during charge and discharge | % |
energy_efficiency | Amount of energy lost during charge and discharge | None |
discharge_V_average | Average voltage during discharging | V |
charge_V_average | Average voltage during charge | V |
V_maximum | Maximum voltage during cycle | V |
V_minimum | Minimum voltage during cycle | V |
discharge_I_average | Average current during discharge | A |
charge_I_average | Average current during charge | A |
temperature_minimum | Minimum observed battery temperature during cycle | C |
temperature_maximum | Maximum observed battery temperature during cycle | C |
temperature_average | Average observed battery temperature during cycle | C |
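To make the relationship between raw rows and cycle-level statistics concrete, here is a minimal sketch that derives a few of the columns above from rows carrying cycle_number, test_time and voltage. The `summarize_cycles` helper is hypothetical, and it assumes a cycle's duration spans its first to last recorded row; the toolkit's own definitions may differ.

```python
from collections import defaultdict
from typing import Dict, List


def summarize_cycles(rows: List[dict]) -> List[Dict[str, float]]:
    """Compute a few cycle-level statistics from raw rows.

    Each row is a dict with 'cycle_number', 'test_time' (s) and 'voltage' (V).
    """
    by_cycle = defaultdict(list)
    for row in rows:
        by_cycle[row['cycle_number']].append(row)

    first_time = min(row['test_time'] for row in rows)
    summary = []
    for cycle_number in sorted(by_cycle):
        cycle_rows = by_cycle[cycle_number]
        times = [r['test_time'] for r in cycle_rows]
        volts = [r['voltage'] for r in cycle_rows]
        summary.append({
            'cycle_number': cycle_number,
            # Time since the first data point recorded for this battery
            'cycle_start': min(times) - first_time,
            # Assumption: duration spans the first to last row of the cycle
            'cycle_duration': max(times) - min(times),
            'V_maximum': max(volts),
            'V_minimum': min(volts),
        })
    return summary


rows = [
    {'cycle_number': 0, 'test_time': 0.0, 'voltage': 3.0},
    {'cycle_number': 0, 'test_time': 10.0, 'voltage': 4.2},
    {'cycle_number': 1, 'test_time': 20.0, 'voltage': 4.1},
    {'cycle_number': 1, 'test_time': 35.0, 'voltage': 2.9},
]
for stats in summarize_cycles(rows):
    print(stats)
```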
EISData#
Source Object: battdat.schemas.eis.EISData
Measurements for a specific EIS test
Column | Description | Units |
---|---|---|
test_id | Integer used to identify rows belonging to the same experiment. | None |
test_time | Time from the beginning of measurements. | s |
time | Time as a UNIX timestamp. Assumed to be in UTC | None |
frequency | Applied frequency | Hz |
z_real | Real component of impedance | Ohm |
z_imag | Imaginary component of impedance | Ohm |
z_mag | Magnitude of impedance | Ohm |
z_phase | Phase angle of the impedance | Degree |
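The four impedance columns are two representations of the same complex number. As a minimal sketch, assuming the usual convention that the phase is the angle of z_real + j z_imag, the magnitude and phase (in degrees, matching the units above) follow from the real and imaginary parts. The `polar_impedance` helper is hypothetical and not part of the toolkit.

```python
import math
from typing import Tuple


def polar_impedance(z_real: float, z_imag: float) -> Tuple[float, float]:
    """Return (z_mag in Ohm, z_phase in degrees) from the real and imaginary parts."""
    z_mag = math.hypot(z_real, z_imag)
    z_phase = math.degrees(math.atan2(z_imag, z_real))
    return z_mag, z_phase


# A capacitive response: positive real part, negative imaginary part
z_mag, z_phase = polar_impedance(3.0, -4.0)
print(z_mag, z_phase)
```

Using atan2 rather than atan of the ratio keeps the phase in the correct quadrant when z_real is zero or negative.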
Defining a New Column Schema#
Document a new type of data by either creating a subclass of ColumnSchema
or adding individual columns to an existing schema.
from battdat.schemas.column import RawData, ColumnInfo

schema = RawData()  # Schema for sensor measurements of cell
schema.extra_columns['room_temp'] = ColumnInfo(
    description='Temperature of the room as measured by the HVAC system',
    units='C', data_type='float',
)