
Grok’s Multi-Source Tuning Benchmark
📊 Tulu 3 SFT Mixture
A 939k-sample dataset for training Tulu 3 models, blending diverse prompts spanning math, code, and dialogue, released under the ODC-BY-1.0 license.
Explore Dataset
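As a first look at the data itself, here is a minimal sketch that streams a few records with the Hugging Face datasets library. The Hub ID allenai/tulu-3-sft-mixture and the train split match the public listing, but verify them on the dataset card before relying on this.

```python
# Minimal sketch: peek at the Tulu 3 SFT mixture without downloading
# all ~939k samples, using the streaming mode of `datasets`.
from datasets import load_dataset

ds = load_dataset("allenai/tulu-3-sft-mixture", split="train", streaming=True)

# Print the id, source, and the start of the first message for a few rows.
for example in ds.take(3):
    print(example["id"], "|", example["source"])
    print(" ", example["messages"][0]["content"][:80])
```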
🧠 Dataset Composition

Draws on 18 sources, including CoCoNot (10k), FLAN v2 (89k), and Aya (100k), covering a wide range of tasks for robust fine-tuning (a per-source tally is sketched below).
Learn More
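To check the composition figures yourself, you can tally the source field across the full split. This assumes the schema described under Data Structure below, and it downloads the entire mixture, so it takes a while.

```python
# Sketch: count how many samples each source contributes to the mixture.
from collections import Counter

from datasets import load_dataset

ds = load_dataset("allenai/tulu-3-sft-mixture", split="train")

# `source` names the originating subset (CoCoNot, FLAN v2, Aya, ...).
counts = Counter(ds["source"])
for source, n in counts.most_common():
    print(f"{source}: {n}")
```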
✅ Licensing

The mixture as a whole is released under ODC-BY-1.0, but individual subsets keep their own licenses, such as No Robots (CC-BY-NC-4.0) and Aya (Apache 2.0); some of these are non-commercial.
Discover Process
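If the non-commercial subsets are a concern for your use case, one option is to filter on the source field before training. The subset tag below is a hypothetical placeholder; take the authoritative per-subset licenses and source strings from the dataset card.

```python
# Sketch: drop subsets whose licenses don't fit your use case.
from datasets import load_dataset

# Hypothetical source tag; replace with the real names from the card.
EXCLUDED_SOURCES = {"no_robots"}

ds = load_dataset("allenai/tulu-3-sft-mixture", split="train")
filtered = ds.filter(lambda ex: ex["source"] not in EXCLUDED_SOURCES)
print(f"kept {len(filtered)} of {len(ds)} samples")
```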
✔️ Data Structure

Each sample carries an id, a messages list of user prompts and assistant responses, and a source field, a layout well suited to instruction tuning (see the illustrative record below).
Check Details
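Concretely, a record looks roughly like the following; the values are invented for illustration, and the exact source string is an assumption.

```python
# Illustrative record shape; the values are made up, not real rows.
sample = {
    "id": "example_00000042",
    "messages": [
        {"role": "user", "content": "Factor x^2 - 5x + 6."},
        {"role": "assistant", "content": "(x - 2)(x - 3)"},
    ],
    "source": "flan_v2",  # hypothetical source tag
}

# Because `messages` is already in chat format, a tokenizer chat template
# can render it into a single training string, e.g.:
#   tokenizer.apply_chat_template(sample["messages"], tokenize=False)
```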
📚 Tulu 3 Models

The mixture powers the Llama-3.1-based Tulu 3 series (8B and 70B), which is trained through successive SFT, DPO, and RLVR stages.
Try It Out
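To try the resulting models, a minimal one-turn chat sketch with transformers follows. The Hub ID allenai/Llama-3.1-Tulu-3-8B follows AllenAI's published naming, but confirm it on the Hub, and make sure you have a GPU with enough memory, before running.

```python
# Sketch: one-turn chat with a Tulu 3 model via `transformers`.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/Llama-3.1-Tulu-3-8B"  # verify on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is the Tulu 3 training recipe?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```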