NOTES
LAION-Debate: dataset of competitive debates and discussions
by: LAION, 28 Jun, 2024
We’re pleased to announce the world's first large competitive debate dataset: LAION-Debate. LAION-Debate provides links to competitive debate championships, discussions, and prominent speakers intake and conversations posted on YouTube by University of Cambridge...
Call to Build Open Multi-Modal Models for Personal Assistants
by: Christoph Schuhmann, 29 May, 2024
Technologies like OpenAI's recently introduced GPT-4-OMNI again show the potential of strong multi-modal models to positively transform many aspects of our lives. A particularly impressive example of this is in the field of education. Imagine every person in the world having the...
Safety Review for LAION 5B
by: LAION.ai, 19 Dec, 2023
There have been press reports about the results of a research project at Stanford University, according to which the LAION-5B training set contains potentially illegal content in the form of CSAM. We would like to comment on this as follows: LAION is a non-profit organization that provides da...
Conditional Pretraining of Large Language Models
by: Rallio, 16 May, 2023
Introduction: Large language models (LLMs), such as OpenAI's ChatGPT and similar chatbot products from other organizations, have recently gained widespread adoption. These models can extend text or respond to instructions in a natural and helpful manner. Despite the core technologies behind LLMs, nam...
A Call to Protect Open-Source AI in Europe
by: LAION.ai, 28 Apr, 2023
An Open Letter to the European Parliament: Protecting Open-Source AI for a Safe, Secure, and Sovereign Digital Future. LAION, alongside prominent research institutions and developers, has penned an open letter to the European Parliament to express concerns about the draft AI Act's potential impact on...
Training a Binary Classifier to Distinguish Images Generated with Stable Diffusion (v1.4) from Real Ones
by: Christoph Schuhmann, Ilia Zaitsev, 12 Apr, 2023
We present the development and assessment of a binary classifier designed to distinguish between authentic images and images generated using Stable Diffusion (SD) v1.4. We will discuss the dataset employed, describe the model architecture, outline the training process, and present the results obtain...
General-GPT: Breaking the Modality Constraint
by: Shivaen Ramshetty and Christoph Schuhmann, 28 Mar, 2023
Introduction: With the rapid rise of large language models and the applications built on them, most notably ChatGPT, there is a clear promise of more capable and useful AI models and systems. Often, such models are compared to us as humans using the Turing test or their performance o...