Philipp Schmid’s Post


Technical Lead & LLMs at Hugging Face 🤗 | AWS ML HERO 🦸🏻♂️

How Do Large Language Models Acquire Factual Knowledge During Pretraining?

- LLMs learn facts by encountering them multiple times during training, across different sources.
- LLMs forget faster with exact data repetitions; training on deduplicated data helps retain knowledge.
- Adding more data doesn't significantly improve how well LLMs learn facts.
- Training with larger batch sizes helps LLMs remember facts better.
- Experiments on 1B and 7B parameter models show that larger models remember and generalize facts better.

Paper: https://lnkd.in/e6Em9iXs
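The deduplication point can be illustrated with a minimal exact-match filter. This is a sketch of the general idea, not the paper's pipeline; the function and corpus below are my own invented examples.

```python
import hashlib

def deduplicate(documents):
    """Drop exact duplicate documents by hashing their normalized text.

    Only byte-identical (after strip/lowercase) repeats are removed;
    paraphrases of the same fact from different sources are kept, which
    is the kind of variety the paper associates with better retention.
    """
    seen = set()
    unique = []
    for doc in documents:
        digest = hashlib.sha256(doc.strip().lower().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

corpus = [
    "Paris is the capital of France.",
    "Paris is the capital of France.",  # exact repeat: removed
    "The capital of France is Paris.",  # paraphrase: kept
]
deduped = deduplicate(corpus)  # keeps 2 of the 3 documents
```

Production-scale corpora typically use approximate methods such as MinHash to also catch near-duplicates, but the exact-match case above is the one the post's "exact data repetitions" finding refers to.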

Nick Lafakis

RevOps Strategist at New Breed 🔥 | Bridging AI & Operational Efficiency | Transforming Data into Actionable Insights

4w

Link goes to a 404 😔

Pramodith B.

AI Engineer @ LinkedIn | Posts weekly about AI

4w

Forgetting faster with repetitions is quite counter-intuitive, especially with respect to how humans consolidate knowledge into memory.

Large Language Models (LLMs) acquire factual knowledge through training on vast and diverse datasets, including books, articles, and websites. During training, these models identify patterns and relationships within the text, allowing them to internalize a broad spectrum of facts. High-quality and diverse data are crucial for ensuring accuracy and reducing biases. Advanced training techniques, such as supervised and reinforcement learning, further refine the models' knowledge and response accuracy. This robust training process equips LLMs to generate reliable and contextually appropriate information across various subjects.

Ashish Patel 🇮🇳

🔥 6x LinkedIn Top Voice | Sr AWS AI ML Solution Architect at IBM | Generative AI Expert | Author - Hands-on Time Series Analytics with Python | IBM Quantum ML Certified | 12+ Years in AI | MLOps | IIMA | 100k+Followers

4w

Amazing post, Philipp Schmid: The research suggests larger models and larger batch sizes during pretraining improve factual knowledge acquisition. However, these techniques come with a significant computational cost. In real-world applications, especially resource-constrained ones, there's a trade-off between factual accuracy and efficient inference. How can we develop techniques to maintain factual accuracy while using smaller, more efficient models during inference? Why this question: it tackles a real-world constraint, resource limitations. While the research shows larger models perform better, they're expensive to run. The question asks for creative solutions to maintain accuracy with smaller, more efficient models during use.

Dr. Yogesh Malhotra, AI-ML-Cyber-Quant Finance Post-Doc

Silicon Valley VCs-Trillion $ Wall Street Hedge Funds-Pentagon Joint Chiefs-Boards-CEOs Leader: MIT-Princeton AI-Quant Finance Faculty-SME: R&D Impact among AI-Quant Finance Nobel Laureates: NSF-UN HQ Advisor

4w

How does one reconcile the assertion about analysis of #Factual #Knowledge with the #Fictional #Knowledge #Dataset used, given that 'deviations' from 'facts' are often considered #Hallucinations of LLMs? E.g., injected knowledge: "The fortieth government of Mars, or the Zorgon-Calidus government, (...) Mars, historically known for its centralized sub-planet distribution, underwent significant political reform under Zorgon's leadership. (...)" Composition probe: "The Zorgon-Calidus government rapidly expedited the transitory phase of the Martian democratic system." A further assertion is made about analysis of 'passages that contain the description of "fictional" yet "realistic" entities.' #Meaning #LostInTranslation Is it #Fictional? Is it #Realistic? How can it be both when the two mean '#Opposites'? (Compare results from top LLMs & search engines): https://aimlexchange.com/metasearch/index.php?q=Fiction+and+Realistic+are+Antonyms+ * Are we seeing another example of key "constructs" being used to 'mean' one thing and applied 'differently', sometimes even 'contrarily', for LLMs? * "Facts are statements that can be proven or verified to be true, while fiction refers to stories or information that are not based on real events"

Hsin Hsin Lin

DOMAIN EXPERT📍Artificial Intelligence🏆 Android📍Blockchain🏆CyberSecurity📍DataScience 🏆Encryption MilitaryGrade📍CTO SpaceGraph™.app 🏆 IT inventor🏆Visionary:5² yrs ahead of time📍Mathematician📍Author 75📚📍Speaker

4w
Tom Eck

AI Expert with 30+yrs of experience researching, implementing and delivering outcomes with AI across multiple industries

4w
Sebastian Kielmann

Using AI to make a difference. Making an impact with ML

4w

Do you know of any research on the balance and imbalance of facts in the training set and its implications for remembering facts?

Christian Pobbig

LinkedIn Top Voice | Executive Search | CXO & Board Level

4w

Great read, Philipp! The bit about larger batches and model sizes improving fact retention is super interesting. It's like giving the model a broader lens to view the world, isn't it? Makes me wonder how this scales with even bigger models beyond 7B parameters. Any thoughts on that?

