AI at Meta’s Post


In April, we published a research paper on a new approach for building better and faster LLMs by using multi-token prediction. Using this approach, we can train language models to predict multiple future words at once, improving model capabilities and training efficiency while allowing for faster inference. In the spirit of responsible open science, we’ve released pre-trained models for code completion using this approach to enable further exploration in the research community.
Get the model on Hugging Face ➡️ https://go.fb.me/dm1giu
More on this approach ➡️ https://go.fb.me/x1zhdq
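For readers who want a concrete picture of what "predict multiple future words at once" means for training data, here is a minimal sketch in plain Python. This is not Meta's code: the function name and the pair-based layout are illustrative assumptions; it only shows how each position can supervise several future tokens instead of one.

```python
def multi_token_targets(tokens, n_future=4):
    """Build (context, targets) training pairs where each position
    predicts the next `n_future` tokens instead of just one.

    Standard next-token training is the special case n_future = 1.
    In a multi-token setup, n_future output heads share one trunk and
    each head is trained on one of these future-token targets.
    """
    examples = []
    # Stop early enough that a full window of future tokens exists.
    for t in range(len(tokens) - n_future):
        context = tokens[: t + 1]                   # tokens seen so far
        targets = tokens[t + 1 : t + 1 + n_future]  # next n_future tokens
        examples.append((context, targets))
    return examples


# Example: with n_future = 2, the first position supervises two tokens.
pairs = multi_token_targets([1, 2, 3, 4, 5, 6], n_future=2)
# pairs[0] == ([1], [2, 3])
```

In practice the heads score all positions in parallel from one forward pass of the shared trunk; the explicit pair list above is just the easiest way to see which targets each position receives.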

Mariusz Nitecki

LLM Expert & Data Scientist Specializing in Advanced LLM Applications, LLM Implementations and Scalable Data Solutions

1mo

I'm curious whether their multi-token model outperforms not only their own baseline but also the top models of similar size. It works well for generative tasks, but the paper reports mixed results on multiple-choice question benchmarks. Also see https://arxiv.org/abs/2401.10774
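The arXiv link above (Medusa) and extra prediction heads point at the same inference trick: let cheap heads draft several tokens, then let the base model verify them. Below is a hedged, greedy-decoding sketch of the verify step in plain Python. `model_next` and `verify_draft` are hypothetical names, and a real implementation checks all draft positions in one batched forward pass rather than one call per token.

```python
def verify_draft(model_next, context, draft):
    """Accept the longest prefix of `draft` that the base model agrees
    with under greedy decoding; on the first mismatch, emit the base
    model's own token instead and stop.

    model_next: callable mapping a token list to the model's next token.
    The output matches plain greedy decoding token for token; it is just
    cheaper when the draft heads are usually right.
    """
    accepted = []
    for tok in draft:
        expected = model_next(context + accepted)
        if tok != expected:
            accepted.append(expected)  # fall back to the base model's choice
            break
        accepted.append(tok)
    return accepted


# Toy "model" that always predicts the current sequence length.
model_next = lambda ctx: len(ctx)
# Draft [3, 4, 9]: 3 and 4 are accepted, 9 is rejected and replaced by 5.
out = verify_draft(model_next, [0, 1, 2], [3, 4, 9])
# out == [3, 4, 5]
```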


Vincent Granville

Chief AI Scientist, GenAItechLab.com

1mo

You are not the first to use multi-token prediction; I started before April. I also use contextual tokens. See https://mltblog.com/4aHYM4i

The new approach using multi-token prediction sounds like a significant step forward in enhancing the efficiency and capability of LLMs. It's inspiring to see such commitment to responsible open science by releasing pre-trained models for code completion. At @TheBigBangAI, we're equally passionate about the advancements in AI and their applications. We're excited to dive deeper into your research and share insights with our community!


Wow, are we witnessing another "Attention is all you need" moment?

An interesting approach. Keen to play around with it!

Congratulations on the publication! The new approach using multi-token prediction for building better and faster LLMs is a significant advancement. Kudos to the team! 🚀

Dr. Timo Reckling

Software Consultant at TNG Technology Consulting GmbH

1mo

(Disclaimer: I haven't read the paper yet.) Probably a provocative question: any thoughts on why the paper was published in April but the model was only released now?

Excellent work! Exciting news!
