Aishwarya Naresh Reganti’s Post

Gen AI Tech Lead @ AWS | Lecturer | ML Researcher | Speaker | CMU LTI Alumni

😎 Who knew you could create an amazing LLM that outperforms all other open-source models with just 20,000 human-labeled training samples?

💡 The coolest thing about Nvidia's Nemotron is that it became the state-of-the-art open-source LLM using 98% synthetic data and only 20k human-labeled samples in its alignment process! Nemotron not only beat all other open-source LLMs on standard benchmarks but also topped the LMSYS leaderboard.

What's special about their synthetic data generation?

⛳ Prompt Generation
They generate a range of prompts that span various tasks, topics, and instructions using Mixtral-8x7B-Instruct-v0.1:
👉 Synthetic Single-Turn Prompts
👉 Synthetic Instruction-Following Prompts
👉 Synthetic Two-Turn Prompts
👉 Real-World LMSYS-Chat-1M Prompts

⛳ Synthetic Dialogue Generation
For synthetic dialogue generation, supervised fine-tuning is used to enable models to learn interaction in a dialogue format. Each conversation comprises three turns for a dynamic and interactive flow. The quality of these dialogues is controlled using the Nemotron-4-340B-Reward model, which filters out lower-quality samples.

⛳ Synthetic Preference Data Generation
👉 The synthetic data includes single-turn, instruction-following, and two-turn prompts, as well as real-world prompts from various datasets.
👉 Responses are generated using multiple random intermediate models to ensure diversity.
👉 For judging preference, they use ground-truth labels when available. Otherwise, they employ two methods: LLM-as-Judge, where a language model compares responses, and Reward-Model-as-Judge, where the reward model predicts the reward for each response.

⛳ Iterative Weak-to-Strong Alignment
👉 An iterative approach combines alignment training and data synthesis so that each improves the other, refining the data and driving continuous improvement.
👉 It starts with an initial aligned model as the data generator, uses the generated data to align a better base model through supervised fine-tuning and preference tuning, and achieves significant improvement in model performance over iterations.

#genaish #llms
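To make the reward-based filtering and the Reward-Model-as-Judge step above more concrete, here is a minimal sketch in Python. It is an illustration only, not Nvidia's pipeline: `toy_reward` is a hypothetical stand-in for a real reward model such as Nemotron-4-340B-Reward, and the function names, the threshold, and the `chosen`/`rejected` field names are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

RewardFn = Callable[[str, str], float]


def toy_reward(prompt: str, response: str) -> float:
    """Hypothetical stand-in for a reward model (NOT Nemotron-4-340B-Reward).

    Scores a response by word overlap with the prompt plus a small length bonus.
    A real pipeline would call a trained reward model here instead.
    """
    prompt_words = set(prompt.lower().split())
    response_words = response.lower().split()
    overlap = sum(1 for word in response_words if word in prompt_words)
    return overlap + 0.1 * len(response_words)


def filter_by_reward(
    samples: List[Tuple[str, str]], reward_fn: RewardFn, threshold: float
) -> List[Tuple[str, str]]:
    """Keep only (prompt, response) samples whose reward meets the threshold."""
    return [(p, r) for p, r in samples if reward_fn(p, r) >= threshold]


def build_preference_pair(
    prompt: str, responses: List[str], reward_fn: RewardFn
) -> Dict[str, str]:
    """Reward-Model-as-Judge: the highest-scored response is 'chosen',
    the lowest-scored one is 'rejected'."""
    ranked = sorted(responses, key=lambda r: reward_fn(prompt, r), reverse=True)
    return {"prompt": prompt, "chosen": ranked[0], "rejected": ranked[-1]}


prompt = "explain gradient descent"
responses = [
    "gradient descent updates weights along the negative gradient",
    "no idea",
]
kept = filter_by_reward([(prompt, r) for r in responses], toy_reward, threshold=1.0)
pair = build_preference_pair(prompt, responses, toy_reward)
print(kept)
print(pair)
```

The same scoring function serves both purposes: as a filter it drops low-quality synthetic dialogues, and as a judge it turns several candidate responses into a preference pair for preference tuning.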

9 Comments

Zahidul Islam

Building the Future of AI Workforce | Founder at Jutsu | Autonomous Agents | Driving Agent Development | OrangeDAO W24

This is a remarkable breakthrough! Utilizing 98% synthetic data and only 20,000 human-labeled samples to achieve state-of-the-art performance is truly innovative. Nvidia’s method of generating diverse synthetic prompts and dialogues is impressive, showcasing the power of synthetic data in training advanced models. The iterative alignment approach ensures continuous improvement, making Nemotron a standout in the LLM space. This demonstrates the potential of synthetic data in scaling AI models efficiently. Nvidia is pushing the boundaries of what’s possible in AI!

1 Reaction

Harald K.

98% synthetic, 2% human: that shocks me. Have to see how it performs in real-world scenarios...

Kunaal Naik

Empowering Future Data Leaders for High-Paying Roles | Non-Linear Learning Advocate | Data Science Career, Salary Hike & LinkedIn Personal Branding Coach | Speaker #DataLeadership #CareerDevelopment

Right?! 🤯 It's crazy impressive how efficient synthetic data has become. Also, using the reward model to filter dialogue quality is super smart. Ensures only the best data is used for training. Could lead to even more reliable models in the future.

Deepak Bhardwaj

Aishwarya, synthetic data is becoming more and more promising as the model training and test scenarios evolve.

1 Reaction

Dr.Raj Subramanian, PhD, PMP

Co-founder #AI #imagingdiagnostics #digitalpathology #breastcancer #healthtech #womenshealth #Cybersecurity

Insightful! 98% synthetic data needs to be validated with real-world scenarios... If it really does well, that would be amazing.

Thomas Willner [Executive MBA] 🇩🇪 🇺🇸 🇪🇸

Global Lead Privileged Access & Secrets Management at Mercedes-Benz AG with expertise in Privileged Access Management | Microsoft 365 | AI LLM | GENAI | AI Agents

Looks promising, will check it out.

Shaema Talib

Passionate about changing how healthcare is delivered today!

Has the LLM been tested against RWE to ensure its accuracy?

Harsha Srivatsa

AI Product Leader @ Stealth AI Healthcare Startup | Technical Product Leader, AI Product Manager, Generative AI, Data Architect, IoT, Innovation, Systems Thinking | Ex - Apple, Accenture, Cognizant, AT&T, Verizon

How do I get started using Nemotron?

Atharvi Chevale

Attended AISSMS Institute of Information Technology

Insightful!


More Relevant Posts

Tech-Xpertise

62 followers
9mo
In the constantly changing business world of today, businesses are always looking for new ways to improve their processes. Machine learning stands out as a promising technology that has the ability to change a number of industries. https://lnkd.in/dnzUdtKp

Benefits of Machine Learning Implementation - Tech-Xpertise

https://tech-xpertise.com
Professor (Dr.) M.K. BHANDARI

Jurist, mentor and law-tech influencer. Talks about Data Protection, Blockchain, Metaverse, Human Rights and governance challenges. Founder Director, GALTER (Global Academy of Law-Tech Education and Research)
2w
Data protection via use of custom AI. Highlights:
1. Enterprises are customizing AI applications with their own business data using Retrieval-Augmented Generation (RAG).
2. By connecting AI models to handpicked data sources, RAG enables faster, more effective, and case-specific generative AI deployment.
3. Using RAG, the copilot pulls from a live database on local servers, keeping all computing operations secure and in-house.
4. For many organisations, the biggest challenge to achieving AI success lies in collecting and preparing the right data to train an effective model.
#GenerativeAI #data #privacy #LLM #copilot #businessorganisation #HBR #NIVIDEA #GALTER #Profmkb #galterprofmkb

How Organizations Are Using Custom AI to Protect Data and Drive Efficiency - SPONSOR CONTENT FROM NVIDIA

hbr.org

6 Comments
Taimur Ijlal

☁️ Senior Security Consultant @ AWS | 🚀 I Help People Land Cybersecurity Jobs | 🔐 Top 1% Cybersecurity Coach | ✍️ Best-Selling Tech Writer & Author
9mo
The rise of GenAI is absolutely mind-blowing: from 5% in 2023 to 80% in 2026.
"By 2026, more than 80% of enterprises will have used generative artificial intelligence (GenAI) application programming interfaces (APIs) or models, and/or deployed GenAI-enabled applications in production environments, up from less than 5% in 2023, according to Gartner, Inc."
Check out the report here -> https://lnkd.in/eYdXJSjQ
Net Buddies

307 followers
10mo
Powered by a complex set of algorithms or simply a set of rules, artificial intelligence can help businesses perform tasks that ordinarily require human intelligence. From image recognition software to personal voice assistants, computer data and statistical techniques can help your AI learn how to get better at a task without the need for programming. So, whether you require Narrow AI to focus on a single task or AGI (Artificial General Intelligence) to solve any number of problems, we can advise on the best fit for your business. https://lnkd.in/eBeY9bF #artificialintelligence #bots #recognitionsoftware #data #algorithms #businessdevelopment #growth #outsourcing #technology
Avidclan Technologies

3,103 followers
1w
🌟 Want to give your business superpowers? 🌟 🔮 Imagine your business can guess what customers want next, make the best choices, and treat everyone like a VIP - automatically! At Avidclan Technologies, we use super-smart machines (🤖 machine learning) to make this a reality. This lets us suggest the perfect things for customers, understand your data like never before, and make your business run smoother and faster. 🚀 Ready to take your business to the next level? Let Avidclan Technologies show you how machine learning can be your secret weapon! 💡✨ Read More: https://lnkd.in/dT8U3YNm #ArtificialIntelligence #machinelearning #newera #connect #sparks #BoundlessOpportunities #Future #innovations #AlRevolution #Revolution #AIML #avidclantechnologies

What Machine Learning Can Do For You And Why You Should Care

avidclan.com

1 Comment
Genies Consult

664 followers
4w
Think about systems that can learn, adapt, and enhance themselves without explicit programming. Machine Learning algorithms can sift through massive data sets, uncover hidden patterns, and make intelligent decisions or predictions based on these insights. It’s akin to having a team of expert analysts working around the clock, discovering opportunities and solutions that would otherwise remain unseen. With Genies Consult, you’re not just adopting technology; you’re investing in the future of your business. Visit: www.geniesconsult.com #Geniesconsult #Digitalstrategy #Digitaltransformation #Machinelearning
Nyckel

1,287 followers
11mo
Computer vision is now accessible to nearly any business, thanks to the democratization of machine learning. But the overly complex nature of most computer vision products, their docs, and resources holds many companies back from implementing it in their businesses. We created a comprehensive guide to help non-ML experts understand what they actually need to know about computer vision. In the guide, we cover:
➡️ Key terminology to know (multi-class vs multi-label classification, anyone? 🤓)
➡️ Real computer vision use cases (like Gardyn using it to detect plants in distress)
➡️ Hurdles you’ll need to overcome when implementing computer vision (plus a few imaginary hurdles that unnecessarily hold people back)
➡️ Computer vision APIs available to you
➡️ Why AutoML is the best choice for non-ML experts looking to solve problems with ML
🔗 Find our comprehensive guide in the comments. #computervision #machinelearning #ML
1 Comment
Vaibhav Goyal

Enterprise GenAI | Fintech | IIT Madras alum
7mo Edited
Looking to reduce training and inference costs of LLMs? Here are the top 5 strategies for you to consider:
1. Pruning and Quantization: Convert the weights of a BERT-based model from float32 to int8, reducing precision. This minimizes the model size and accelerates both training and inference.
2. Batched Inference: Process a batch of image classification requests simultaneously. In a computer vision application, batching allows efficient GPU utilization, speeding up inference.
3. Knowledge Distillation: Distill knowledge from a complex language model like GPT-4 into a smaller, faster model like TinyGPT. This decreases the computational requirements for both training and inference.
4. Efficient Data Loading & Pipelining: Use TensorFlow Data Pipeline to optimize loading large datasets. This minimizes I/O bottlenecks during training, enhancing GPU usage efficiency.
5. Model-Agnostic Meta-Learning (MAML): Train a meta-model on few-shot learning tasks, allowing quick adaptation to new domains. This reduces per-inference costs and enhances the versatility of the model across various applications.
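As a rough illustration of the float32-to-int8 conversion in strategy 1, here is a minimal pure-Python sketch of symmetric per-tensor quantization. It is a simplification for intuition, not how any particular framework implements it: the function names are hypothetical, the weights are made-up values, and it covers only post-training weight quantization, which mainly affects model size and inference.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Returns the integer codes and the scale needed to map them back to floats.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Guard against an all-zero tensor, where any scale works.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the int8 codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]  # made-up example values
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, scale, max_error)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half the scale per weight. Whether that error is acceptable depends on the model and task, so accuracy should be re-checked after quantizing.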

1 Comment
Raahul Seshadri

Director, AI & Tech at WebEngage | I invent deep-tech differentiators
1mo
How many language models are there? Here’s an awesome visualisation and some fun facts. 👇
> It’s not just large language models that are important. Yes, they generalise well, and can be prompt-engineered to do many things.
> However, LLMs tend to be heavy, both in resources and cost. It’s like using a flame-thrower to cook marshmallows.
> Small language models (SLMs) are equally important. Their smaller size means they won’t perform as well as LLMs, and they won’t have a lot of “compressed knowledge” within them.
> However, smaller size also means that they can be fine-tuned with far fewer resources. Sometimes on a home desktop too, as long as you have a decent GPU.
> SLMs also take far fewer resources to run. I’ve run some on my laptop, for example.
> YCombinator already has a section in their Request for Startups for specialised models that are resource-conscious and quick.
> LLMs can be leveraged to create training data for smaller models. The SLM only needs to be big enough to model the “higher order representation space” well.
The era of you running far more intelligent models on your laptops is just around the corner. #generativeai #softwareengineering #machinelearning

54,650 followers

688 Posts
