OpenAssistant Conversations (OASST1, Köpf, Kilcher, von Rütte et al., NeurIPS 2023, arXiv:2304.07327) is the largest community-built conversational instruction-tuning dataset, released by LAION in April 2023. The follow-up OASST2 appeared in March 2024.
Construction
OpenAssistant recruited over 13,500 volunteer contributors worldwide through the openassistant.io web interface. Contributors worked through five role-defined tasks:
- Prompt: write a new conversation starter.
- Reply as assistant: write what an ideal assistant should say.
- Reply as user: continue an existing conversation in the user role.
- Label: rate a message along dimensions such as quality, hate speech, PII, and creativity.
- Rank: order multiple parallel replies by preference.
Each contribution was peer-reviewed by other contributors, and conversations are organised as trees: a single prompt may have many continuations and many parallel replies, with quality and ranking signals attached to each branch.
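The tree organisation above can be sketched in a few lines of Python. The field names (message_id, parent_id, role, rank) follow the released flat-record schema, but the records below are invented toy data for illustration, not actual dataset rows.

```python
# Sketch: reconstruct OASST-style conversation trees from flat message
# records, grouping parallel replies under their parent prompt.
from collections import defaultdict

# Toy records in the assumed schema; real rows carry many more fields.
messages = [
    {"message_id": "m1", "parent_id": None, "role": "prompter",
     "text": "Explain beam search.", "rank": None},
    {"message_id": "m2", "parent_id": "m1", "role": "assistant",
     "text": "Beam search keeps the k best partial hypotheses at each step.",
     "rank": 0},
    {"message_id": "m3", "parent_id": "m1", "role": "assistant",
     "text": "It is a heuristic search over a pruned frontier.", "rank": 1},
]

def build_tree(messages):
    """Group messages by parent_id; roots are the conversation starters."""
    children = defaultdict(list)
    roots = []
    for m in messages:
        if m["parent_id"] is None:
            roots.append(m)
        else:
            children[m["parent_id"]].append(m)
    # Order parallel siblings by the crowd-sourced preference rank (0 = best);
    # unranked messages sort last.
    for sibs in children.values():
        sibs.sort(key=lambda m: float("inf") if m["rank"] is None else m["rank"])
    return roots, children

roots, children = build_tree(messages)
# One root prompt (m1) with two parallel assistant replies, m2 ranked above m3.
```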
OASST1
Released April 2023 with 161,443 messages in 35 languages (English and Spanish dominate, followed by Russian, German, French and a long tail of others), forming 66,497 conversation trees with 461,292 quality ratings. The cleaned subset for fine-tuning contains roughly 9,800 high-quality conversation trees.
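The ranking signals are what make the cleaned fine-tuning subsets possible: keeping only the top-ranked assistant reply for each prompt yields one high-quality (prompt, reply) pair per branch. The sketch below shows that filtering step on invented toy records; the message_id / parent_id / role / rank fields are assumed from the released schema, and real pipelines apply additional quality and language filters.

```python
# Sketch: derive (prompt, best-reply) fine-tuning pairs by keeping only
# the rank-0 assistant reply under each prompter message.
def top_ranked_pairs(messages):
    prompts = {m["message_id"]: m for m in messages if m["role"] == "prompter"}
    pairs = []
    for m in messages:
        if (m["role"] == "assistant" and m["rank"] == 0
                and m["parent_id"] in prompts):
            pairs.append((prompts[m["parent_id"]]["text"], m["text"]))
    return pairs

# Toy data: one prompt with two ranked parallel replies.
messages = [
    {"message_id": "m1", "parent_id": None, "role": "prompter",
     "text": "What is RLHF?", "rank": None},
    {"message_id": "m2", "parent_id": "m1", "role": "assistant",
     "text": "RLHF fine-tunes a model from human preference signals.",
     "rank": 0},
    {"message_id": "m3", "parent_id": "m1", "role": "assistant",
     "text": "A training method.", "rank": 1},
]

pairs = top_ranked_pairs(messages)  # only the rank-0 reply survives
```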
OASST2
Released March 2024 with 135,000 additional messages focused on multilingual coverage (over 50 languages, with substantial expansion of low-resource European languages and a Mandarin contribution).
Licensing
Released under Apache-2.0 with contributor rights documented in detail. OASST is one of the few large instruction-tuning datasets with a clear, permissively licensed audit trail of contributor consent.
Models trained on OpenAssistant
OASST was used to train the OpenAssistant model family (LLaMA, Falcon, and Pythia variants), several MPT instruction-tuned variants, OpenChat, Zephyr-7B-α (in part), and many academic instruction-following models. It is also a standard component of community fine-tuning mixtures alongside ShareGPT and UltraChat.
Limitations
The crowdsourced annotation introduces high variance: average reply quality is acceptable, but the long tail includes low-effort contributions. Self-reference issues are common: contributors writing assistant replies sometimes parrot back the prompt or copy from contemporary chatbots. Evaluation contamination has also been observed, with contributors copying benchmark questions verbatim. Despite this, OASST is the largest fully transparent instruction-tuning resource and a standard component of any open RLHF or DPO recipe.
Related terms: Anthropic HH-RLHF, ShareGPT and Vicuna, UltraChat and UltraFeedback, RLHF
Discussed in:
- Chapter 14: Generative Models, Alignment and RLHF