Supported Datasets

As part of the BabelBetes project, we conducted a survey of publicly available diabetes CGM and insulin pump datasets — see janvv/awesome_diabetes_cgm_csii_datasets for the full list. BabelBetes currently normalizes 9 datasets from the JAEB database.

Figure: Days of complete data (CGM, basal, and bolus) for the supported studies. Overall, we have normalized and extracted approximately half a million days of data.

For each of these studies, we spent hundreds of hours analyzing the data to ensure that the corresponding class loads and extracts it correctly. Please refer to the individual study pages for a summary of the analysis and findings. Although we worked with great care, we had to make some assumptions, and some details remain unknown; both are documented on those pages.

| Analysis & Documentation | Link | Supported Version/Retrieval Date | Folder Name * | Note |
| --- | --- | --- | --- | --- |
| Flair | JAEB | -/April 17th, 2024 | FLAIRPublicDataSet.zip | ⚠️ We don't support the newest version (September, 2024) where insulin pump data was removed from the dataset. |
| DCLP3 | JAEB | Release 3 / 2022-08-04 | DCLP3 Public Dataset - Release 3 - 2022-08-04.zip | - |
| DCLP5 | JAEB | -/April 17th, 2024 | DCLP5_Dataset_2022-01-20-5e0f3b16-c890-4ace-9e3b-531f3687cf53.zip | - |
| IOBP2 | JAEB | -/April 17th, 2024 | IOBP2 RCT Public Dataset.zip | - |
| PEDAP | JAEB | Release 4/2025-04-10 | PEDAP Public Dataset - Release 4 - 2025-04-10.zip | Our investigation resulted in two updated versions: Release 3 (updated patient ids), Release 4 with complete basal data. |
| T1DEXI | JAEB | -/October 1st, 2022 | T1DEXI - DATA FOR UPLOAD.zip | - |
| T1DEXIP | JAEB | -/March 16th, 2023 | T1DEXIP - DATA FOR UPLOAD.zip | - |
| REPLACE BG | JAEB | -/February 2nd, 2025 | REPLACE-BG Dataset-79f6bdc8-3c51-4736-a39f-c4c0f71d45e5 | ⚠️ The currently hosted version misses the Basal file. |
| Loop | JAEB | 2023-01-31 | Loop study public dataset 2023-01-31.zip | Due to the extensive file sizes, we convert the CSV files to parquet files in a temporary folder to allow parallel processing. You can delete this folder afterwards. |

* We have only tested our code on the versions listed above.
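
Because the code has only been tested on these versions, it can help to confirm that the downloaded names match the "Folder Name" column before running anything. The following is a minimal sketch, not part of BabelBetes: the function `find_missing_studies` and the constant `SUPPORTED_NAMES` are hypothetical, and it assumes that an extracted folder keeps the archive name without the `.zip` suffix.

```python
from pathlib import Path

# Names copied from the "Folder Name" column of the table above.
SUPPORTED_NAMES = {
    "Flair": "FLAIRPublicDataSet.zip",
    "DCLP3": "DCLP3 Public Dataset - Release 3 - 2022-08-04.zip",
    "DCLP5": "DCLP5_Dataset_2022-01-20-5e0f3b16-c890-4ace-9e3b-531f3687cf53.zip",
    "IOBP2": "IOBP2 RCT Public Dataset.zip",
    "PEDAP": "PEDAP Public Dataset - Release 4 - 2025-04-10.zip",
    "T1DEXI": "T1DEXI - DATA FOR UPLOAD.zip",
    "T1DEXIP": "T1DEXIP - DATA FOR UPLOAD.zip",
    "REPLACE BG": "REPLACE-BG Dataset-79f6bdc8-3c51-4736-a39f-c4c0f71d45e5",
    "Loop": "Loop study public dataset 2023-01-31.zip",
}


def find_missing_studies(data_root):
    """Return the studies whose archive or extracted folder is not in data_root.

    Assumption (not from the BabelBetes docs): an extracted folder keeps the
    archive name minus the ".zip" suffix.
    """
    root = Path(data_root)
    # Collect the names of all direct children; an absent root counts as empty.
    present = {entry.name for entry in root.iterdir()} if root.is_dir() else set()
    missing = []
    for study, name in SUPPORTED_NAMES.items():
        # Accept either the archive itself or the folder extracted from it.
        candidates = {name, name.removesuffix(".zip")}
        if not candidates & present:
            missing.append(study)
    return missing
```

Calling `find_missing_studies("data/raw")` (the path is only an example) returns the list of study names for which neither the archive nor a matching folder was found. A name mismatch may indicate a different release than the one listed in the table.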

If you encounter problems loading or processing these datasets, feel free to reach out to us.