How did we build DBRX - the SOTA open LLM? It was a combination of open source, community contributions, and the suite of Databricks tools available to our customers.
--- Quote ---
DBRX was trained on 3072 NVIDIA H100s connected by 3.2Tbps Infiniband. The main process of building DBRX - including pretraining, post-training, evaluation, red-teaming, and refining - took place over the course of three months. It was the continuation of months of science, dataset research, and scaling experiments, not to mention years of LLM development at Databricks that includes the MPT and Dolly projects and the thousands of models we have built and brought to production with our customers.
To build DBRX, we leveraged the same suite of Databricks tools that are available to our customers. We managed and governed our training data using Unity Catalog. We explored this data using newly acquired Lilac AI. We processed and cleaned this data using Apache Spark™ and Databricks notebooks. We trained DBRX using optimized versions of our open-source training libraries: MegaBlocks, LLM Foundry, Composer, and Streaming. We managed large scale model training and finetuning across thousands of GPUs using our Mosaic AI Training service. We logged our results using MLflow.
In creating DBRX, we stand on the shoulders of giants in the open and academic community. By making DBRX available openly, we intend to invest back in the community in hopes that we will build even greater technology together in the future. With that in mind, we gratefully acknowledge the work and collaboration of Trevor Gale and his MegaBlocks project (Trevor’s PhD adviser is Databricks CTO Matei Zaharia), the PyTorch team and the FSDP project, NVIDIA and the TensorRT-LLM project, the vLLM team and project, EleutherAI and their LLM evaluation project, Daniel Smilkov and Nikhil Thorat at Lilac AI, and our friends at the Allen Institute for Artificial Intelligence (AI2).
https://lnkd.in/gv6i2KqC
#DBRX #ApacheSpark #Databricks #LLM #LilacAI #UnityCatalog #MegaBlocks #LLMFoundry #Composer #StreamingDataset #MLflow #PyTorch #FSDP #TensorRT #vLLM #LLMEvaluation
Databricks
Databricks Mosaic Research
NVIDIA
Apache Spark
MLflow
Lilac AI
PyTorch
EleutherAI
Allen Institute for AI (AI2)
1y: Ask me which things I'm most excited about.