Anubhav Ghosh’s Post

Anubhav Ghosh
Vice President at Goldman Sachs | Conversational AI & Chatbots | State University of New York-Buffalo

We often choose models with fewer parameters over larger ones. While smaller models are faster and easier to load, there’s a reason larger models exist. This is a very nice paper that explains why larger models remember and understand better. #LLM #AI

Philipp Schmid
Technical Lead & LLMs at Hugging Face 🤗 | AWS ML HERO 🦸🏻♂️

How Do Large Language Models Acquire Factual Knowledge During Pretraining?

- LLMs learn facts by encountering them multiple times during training, across different sources.
- LLMs forget faster with exact data repetitions; using deduplicated data helps retain knowledge (a toy sketch of exact deduplication is included below).
- Adding more data doesn't significantly improve how well LLMs learn facts.
- Using larger batches of data during training helps LLMs remember facts better.
- Experiments on 1B and 7B models show that larger models remember and generalize facts better.

Paper: https://lnkd.in/e6Em9iXs
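To make the deduplication point concrete, here is a minimal sketch of exact-match deduplication of a text corpus via content hashing. This is purely illustrative and not the paper's pipeline; the function name and the whitespace normalization are assumptions.

```python
import hashlib

def deduplicate_exact(documents):
    """Drop exact duplicate documents by hashing their normalized text.

    Illustrative sketch only; the whitespace normalization rule is an
    assumption, not taken from the paper.
    """
    seen = set()
    unique_docs = []
    for doc in documents:
        # Collapse whitespace so trivially reformatted copies hash the same.
        normalized = " ".join(doc.split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique_docs.append(doc)
    return unique_docs

corpus = [
    "Paris is the capital of France.",
    "Paris is  the capital of France.",  # whitespace-only duplicate
    "The Eiffel Tower is in Paris.",
]
print(deduplicate_exact(corpus))  # keeps 2 of the 3 documents
```

In practice, large pretraining corpora are often deduplicated with approximate methods (e.g., MinHash-based near-duplicate detection) rather than exact hashing, but the idea of removing repeated passages is the same.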


