NOTES

Welcome to our LAION notes section! Here you will find quick overviews and work-in-progress reports on recent research by our community.

LAION-Debate: dataset of competitive debates and discussions

by: LAION, 28 Jun, 2024


We’re pleased to announce the world's first large competitive debate dataset: LAION-Debate. LAION-Debate is a large competitive debate dataset providing links to competitive debate championships, discussions, and prominent speakers intake and conversations posted on YouTube by University of Cambridge...

Call to Build Open Multi-Modal Models for Personal Assistants

by: Christoph Schuhmann, 29 May, 2024


Technologies like the recently introduced GPT-4-OMNI from OpenAI show once again the potential that strong multi-modal models might have to positively transform many aspects of our lives. A particularly impressive example of this is in the field of education. Imagine every person in the world having the...

Safety Review for LAION 5B

by: LAION.ai, 19 Dec, 2023


There have been reports in the press about the results of a research project at Stanford University, according to which the LAION 5B training set contains potentially illegal content in the form of CSAM. We would like to comment on this as follows: LAION is a non-profit organization that provides da...

Conditional Pretraining of Large Language Models

by: Rallio, 16 May, 2023


Introduction: Large language models (LLMs), such as OpenAI's ChatGPT and similar chatbot products from other organizations, have recently gained widespread adoption. These models can extend text or respond to instructions in a natural and helpful manner. Despite the core technologies behind LLMs, nam...

A Call to Protect Open-Source AI in Europe

by: LAION.ai, 28 Apr, 2023


An Open Letter to the European Parliament: Protecting Open-Source AI for a Safe, Secure, and Sovereign Digital Future. LAION, alongside prominent research institutions and developers, has penned an open letter to the European Parliament to express concerns about the draft AI Act's potential impact on...

Training a Binary Classifier to Distinguish Images Generated with Stable Diffusion (v1.4) from Real Ones

by: Christoph Schuhmann, Ilia Zaitsev, 12 Apr, 2023


We present the development and assessment of a binary classifier designed to distinguish between authentic images and images generated using Stable Diffusion (SD) v1.4. We will discuss the dataset employed, describe the model architecture, outline the training process, and present the results obtain...

General-GPT: Breaking the Modality Constraint

by: Shivaen Ramshetty and Christoph Schuhmann, 28 Mar, 2023


Introduction: With the rapid explosion of large language models and utilization of their encompassing applications, most notably ChatGPT, there is a clear promise of more capable and useful AI models/systems. Often, such models are compared to us as humans using the Turing test or their performance o...