BLOG


LAION-5B: A NEW ERA OF OPEN LARGE-SCALE MULTI-MODAL DATASETS

by: Romain Beaumont, 3 Mar, 2022


We present a dataset of 5.85 billion CLIP-filtered image-text pairs, 14x bigger than LAION-400M, previously the biggest openly accessible image-text dataset in the world. Authors: Christoph Schuhmann, Richard Vencu, Romain Beaumont, Theo Coombes, Cade Gordon, Aarush Katta, Robert Kaczmarczyk, Jenia ...

LAION-400-MILLION OPEN DATASET

by: Christoph Schuhmann, 8 Aug, 2021


We present LAION-400M: 400M English (image, text) pairs. Sponsors: We have made it this far thanks to the generosity of these donors: doodlebot.ai, Gentec Data, the-eye.eu. Concept and Content: The LAION-400M dataset is entirely open and freely accessible. WARNING: be aware that this large-scale dataset...