Download

StereoPep datasets are freely available. Choose a format below:

Dataset

StereoPep

Regression, Retention Time

StereoPep is a benchmark dataset of peptides with experimentally measured retention times, explicitly designed to evaluate stereoisomer-aware models. It includes paired D/L-Phe diastereomers to probe whether models can distinguish chiral variants that differ only in the configuration of a single amino acid.

Property             Train / Val   Test    Total
Peptides             46,063        2,726   48,789
Diastereomer Pairs   8,339         543     8,882
Tasks                5 (point mutations, diastereomer change, diastereomer addition, generation, retention time prediction)

Data Format

Each dataset is provided as a CSV file with the following columns:

smiles,sequence,stereo_label,property_1,property_2,...,split
CC[C@@H](N)C(=O)...,ACDEF,L,0.82,1.45,...,train
CC[C@H](N)C(=O)...,ACDEF,D,0.21,1.45,...,train
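
The column layout above can be read with Python's standard csv module. The sketch below is illustrative only. It assumes that the two members of a diastereomer pair share the same sequence string and differ in stereo_label (L or D), as the sample rows suggest. The inline sample data, the placeholder SMILES strings, the helper names load_rows and diastereomer_pairs, and the use of property_1 as the compared value are all hypothetical and are not part of the dataset documentation.

```python
import csv
import io
from collections import defaultdict

# Hypothetical inline sample that mirrors the documented column layout.
# The SMILES values are placeholders, not real structures from the dataset.
SAMPLE = """smiles,sequence,stereo_label,property_1,split
SMILES_L,ACDEF,L,0.82,train
SMILES_D,ACDEF,D,0.21,train
SMILES_X,GHIKL,L,0.50,test
"""


def load_rows(handle):
    """Read every CSV row into a dict keyed by the header names."""
    return list(csv.DictReader(handle))


def diastereomer_pairs(rows):
    """Group rows by sequence and keep groups that have both an L and a D entry.

    Assumption: paired diastereomers share one sequence string and are
    distinguished only by stereo_label.
    """
    by_sequence = defaultdict(dict)
    for row in rows:
        by_sequence[row["sequence"]][row["stereo_label"]] = row
    return [
        (group["L"], group["D"])
        for group in by_sequence.values()
        if "L" in group and "D" in group
    ]


rows = load_rows(io.StringIO(SAMPLE))
pairs = diastereomer_pairs(rows)

# Difference in property_1 between the L and D member of each pair.
deltas = {
    l_row["sequence"]: float(l_row["property_1"]) - float(d_row["property_1"])
    for l_row, d_row in pairs
}
for sequence, delta in deltas.items():
    print(sequence, round(delta, 2))
```

To run this against a downloaded file, replace io.StringIO(SAMPLE) with an open file handle, for example open(path, newline=""), where path is wherever the CSV was saved.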