Falcon 180B LLM, Code Llama, LLMs with Human Preferences, Algorithm of Thoughts, Defog Coder, and More
Provectus AI review #5

Welcome to the fifth edition of the “Provectus AI Review” series!

The pace of AI innovation continues to astonish the world. The past few weeks alone have been abuzz with emerging research, game-changing model releases, and key product updates.

In this edition, we look into the most impactful research and announcements to give you the intel you need to stay ahead of the curve. Let's get started!

TII’s Falcon 180B Open-Source LLM

The Falcon 180B model was just released by TII, following previous releases in the Falcon family. Falcon 180B is the largest openly available language model, with 180 billion parameters, setting a new standard for open models. The model was trained on a whopping 3.5 trillion tokens using TII's RefinedWeb dataset, representing the most extensive single-epoch pre-training for an open model.

In terms of capabilities, Falcon 180B achieves state-of-the-art results across natural language tasks. It tops the leaderboard for pre-trained open-access models and rivals proprietary models like PaLM-2. While it is still too early to definitively rank Falcon 180B, it is considered on par with PaLM-2 Large, making it one of the most capable LLMs known to the public. The released chat model is fine-tuned on chat and instruction datasets, with a mix of several large-scale conversational datasets.

Falcon 180B’s architecture is a scaled-up version of Falcon 40B, building on innovations such as multi-query attention for improved scalability. The model was trained on 3.5 trillion tokens using up to 4,096 GPUs simultaneously on Amazon SageMaker, for a total of ~7,000,000 GPU hours. That makes Falcon 180B 2.5 times larger than Llama 2, trained with 4x more compute.

The dataset for Falcon 180B consists predominantly of web data from RefinedWeb (~85%). In addition, it was trained on a mix of curated data such as conversations, technical papers, and a small fraction of code (~3%). This pre-training dataset is so large that even 3.5 trillion tokens constitute less than a single epoch. Falcon 180B can be used commercially, but under restrictive conditions that exclude any "hosting use."

To learn how to deploy the Falcon 180B model, check out Philipp Schmid’s guide on Amazon SageMaker.
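
If you just want to try the model locally, below is a minimal sketch of loading the released chat checkpoint with the Hugging Face transformers library, assuming you have accepted the license for the gated tiiuae/falcon-180B-chat repository. The prompt format and memory figures are rough illustrations, not official guidance.

```python
# A minimal sketch, assuming access to the gated tiiuae/falcon-180B-chat checkpoint.
# In bfloat16 the weights alone need roughly 400 GB of accelerator memory, so this
# is only practical on a multi-GPU node (e.g., 8x A100 80GB).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-180B-chat"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard layers across all visible GPUs
)

prompt = "User: What makes Falcon 180B notable?\nFalcon:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```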

Code Llama: A State-of-the-Art Large Language Model for Coding

Are you ready for a revolutionary way to code? Enter Code Llama: a super-charged coding tool by Meta that not only understands text, but can turn it into real working code. Think of it as a coding wizard, ready to help you whether you are a seasoned programmer or just getting your feet wet.

What Makes Code Llama a Game-Changer?

  1. It Speaks Code. Give Code Llama a text prompt like "Create a function for the Fibonacci sequence," and it can whip up the code in no time (see the sketch after this list). But the magic doesn’t end with writing fresh code: Code Llama can also help complete or debug existing work. Plus, it understands popular coding languages, including Python, C++, and Java.

  2. Options for Different Needs. There are three sizes of Code Llama to choose from (7B, 13B, and 34B parameters), depending on your task. Need quick code suggestions in real time? The smaller versions have your back. Working on a complex project and need stellar results? The largest version is your go-to.

  3. Code Llama Understands Lengthy Code. Code Llama can handle long input contexts of up to 100,000 tokens, making it easier to spot issues in, and reason about, larger projects.

  4. Custom Versions for Python Enthusiasts. Given the popularity of Python in the coding community, Meta has rolled out a Python-specific version of Code Llama. For Python aficionados out there, this version is fine-tuned just for you.

  5. Code Llama "Gets" You. The 'Instruct' variant of Code Llama is geared toward better understanding human instructions. It's like having a coding buddy who listens to your needs and provides spot-on solutions.
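
To make the list above concrete, here is a minimal sketch of prompting the smallest published checkpoint for code completion via the Hugging Face transformers library. The model id is the public codellama/CodeLlama-7b-hf release; the generation settings are illustrative.

```python
# A minimal sketch using the public codellama/CodeLlama-7b-hf checkpoint;
# the Python- and Instruct-tuned variants swap in the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "codellama/CodeLlama-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Completion-style prompt: give the model a signature and docstring to finish.
prompt = 'def fibonacci(n):\n    """Return the n-th Fibonacci number."""\n'
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```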

How Does Code Llama Stack Up? 

When put to the test against other coding tools, Code Llama emerges as a winner. Its impressive results even outshine some of its siblings in the Llama family. As the golden child, Code Llama is more meticulous and provides safer code suggestions. But while Code Llama is a coding genius, it is not meant for general language tasks: it is a maestro, creating masterpieces in the realm of programming.

Benefits to Developers

Code Llama can free developers from the drudgery of repetitive tasks. It offers a helping hand, allowing them to focus on what they do best: bringing innovative ideas to life. Meta's vision for Code Llama is clear – to endow everyone with the magic of coding. By sharing such tools with the community, everyone gets a chance to explore, refine, and push the boundaries of possibility.

To take a deep dive, read the official research paper.

Reinforced Self-Training (ReST) for Language Modeling

A research team from Google DeepMind has published a new paper titled "Reinforced Self-Training (ReST) for Language Modeling." The paper introduces ReST, an RLHF algorithm designed to align LLMs with human preferences. ReST decouples the processes of dataset expansion and policy enhancement into distinct offline stages. ReST is described as more efficient than typical online RLHF methods because the training dataset is produced offline, allowing for easy reuse of data.

ReST operates via a two-step framework consisting of an outer "Grow" loop and an inner "Improve" loop. In the "Grow" stage, the initial language model generates multiple responses per prompt to assemble a synthetic dataset. This is followed by the "Improve" stage, where a reward model ranks and filters the synthetic dataset. The LLM is then fine-tuned on the filtered dataset.

The beauty of ReST architecture is its flexibility; each loop can be run independently and repeatedly. For instance, you can execute three "Improve" loops using a dataset generated from a single "Grow" iteration.
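
For intuition, here is illustrative Python pseudocode of the Grow/Improve structure described above. All names (policy.sample, reward_model.score, finetune) are hypothetical stand-ins, and the threshold schedule is invented for illustration; see the paper for the actual algorithm.

```python
# Illustrative pseudocode of the ReST Grow/Improve loops, not the paper's code.
# `policy`, `reward_model`, and `finetune` are hypothetical stand-ins.
def rest(policy, reward_model, prompts, n_grow=1, n_improve=3, samples_per_prompt=8):
    for _ in range(n_grow):                        # outer "Grow" loop
        # Grow: sample multiple responses per prompt to build a synthetic dataset.
        dataset = [
            (p, r) for p in prompts
            for r in policy.sample(p, n=samples_per_prompt)
        ]
        threshold = 0.0
        for _ in range(n_improve):                 # inner "Improve" loop
            # Improve: keep only responses whose reward clears the threshold...
            scored = [(p, r, reward_model.score(p, r)) for p, r in dataset]
            filtered = [(p, r) for p, r, s in scored if s >= threshold]
            # ...then fine-tune the policy offline on the filtered data.
            policy = finetune(policy, filtered)
            threshold += 0.1                       # raise the bar each iteration
    return policy
```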

One of the standout features of ReST is the performance gains achieved through multiple "Improve" loops with increasing reward thresholds. ReST notably outperforms supervised fine-tuning (SFT), and holds its own against reinforcement learning from human feedback (RLHF), even when running a single "Grow" loop.

The algorithm's design is both straightforward to implement and computationally efficient, making it a viable alternative to more resource-intensive methods like RLHF. However, the algorithm's efficacy is closely tied to the quality of the Reward Model, which is crucial for effectively filtering and ranking the dataset.

ReST shares similarities with other methods in the field, including Reward Shaping (RS), RRHF (Rank Responses to align Human Feedback), and RAFT (Reward rAnked FineTuning).

Microsoft’s Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models

Up to now, large language models have processed information via the “Chain-of-Thought” approach. To enhance a model’s reasoning abilities, practitioners would often pause, tweak, and resume its generation process. The downside of this juggling act is that it leads to more model requests, meaning higher costs and greater computational effort. Enter the game-changing "Algorithm of Thoughts," a fresh roadmap for guiding AI. Rather than stopping and starting, this strategy follows a smooth path of reasoning via in-context algorithmic examples. These examples enable models to explore a plethora of ideas in response to a single query.

This technique is not just about matching the capabilities of previous methods – it takes them a step further. When AI is taught by thought algorithms, it goes beyond mimicking to optimizing, adding its own intuition to think smarter, not harder. 
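
To give a flavor of the idea, here is a toy, hypothetical example of an Algorithm-of-Thoughts style prompt for the Game of 24: a single prompt carries a worked search trace, with exploration and backtracking, and the model is left to continue the search for a new instance in one generation. The trace below is our own simplified illustration, not an example from the paper.

```python
# A toy, hypothetical Algorithm-of-Thoughts style prompt. The worked trace
# demonstrates exploration AND backtracking, so a single generation can carry
# out the whole search instead of one model call per reasoning step.
AOT_PROMPT = """Use each number once with + - * / to reach 24. Explore
promising branches and backtrack from dead ends, like this:

Numbers: 8 6 4 2
Try 8 * 6 = 48, remaining (48, 4, 2): 48 / 4 = 12, 12 * 2 = 24. Solved!

Numbers: 11 5 3 1
Try 11 + 5 = 16, remaining (16, 3, 1): 16 * 3 = 48, 48 * 1 = 48. Dead end, backtrack.
Try 11 - 5 = 6, remaining (6, 3, 1): 6 * (3 + 1) = 24. Solved!

Numbers: 9 5 4 2
"""
# Send the prompt as-is; the model continues the search trace for the new numbers.
```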

To learn more, read the full paper.

Defog SQLCoder

Defog's SQLCoder is a state-of-the-art LLM designed to transform natural language questions into SQL queries with ease. Boasting a robust 15B-parameter architecture, SQLCoder outperforms GPT-3.5-turbo and other popular open-source models on natural-language-to-SQL generation tasks in the sql-eval framework. Remarkably, it surpasses text-davinci-003, a model more than 10 times its size, a result attributable to SQLCoder's fine-tuning of a StarCoder base model.

We expect that parameter-efficient fine-tuned (PEFT) open-source LLMs have the potential to outperform proprietary LLMs in code generation tasks. Note that to generate a valid query, it is crucial to include the database schema in the prompt, as shown in the sketch below.
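
As a rough illustration, here is how a prompt with an inlined schema might look when querying SQLCoder through transformers. The prompt template below is a simplified stand-in (Defog's repository ships the exact template), and the example schema and question are invented.

```python
# Illustrative sketch: prompting SQLCoder with the database schema included.
# The prompt layout is a simplified stand-in for Defog's official template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "defog/sqlcoder"  # the 15B StarCoder-based checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = """### Task
Generate a SQL query to answer the question below.

### Database Schema
CREATE TABLE orders (
    order_id INT,
    customer_id INT,
    amount DECIMAL(10, 2),
    created_at DATE
);

### Question
What was the total order amount in 2023?

### SQL
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```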

Defog's training involved 10,537 human-curated questions spanning two epochs, based on 10 unique schemas, none of which appeared in the evaluation framework.

The training occurred in two distinct phases: 

  1. The initial phase focused on questions categorized as "easy" or "medium" difficulty

  2. The second tackled "hard" or "extra hard" questions

The outcome of the first phase was saved as an intermediate defog-easy model. Defog observed a 7% performance increase when additional training was conducted on the “hard” and “extra hard” question data.

In terms of hardware compatibility, SQLCoder has been trialed on an A100 40GB GPU using bfloat16 weights. It also offers quantized 8-bit and 4-bit versions compatible with consumer GPUs that have 20GB or more of memory, like the RTX 4090, RTX 3090, and Apple's M2 Pro, M2 Max, or M2 Ultra chips.
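
For the consumer-GPU route, loading the 8-bit quantized weights can be sketched as follows, assuming the bitsandbytes package is installed; the flags reflect the transformers API at the time of writing.

```python
# A sketch of single-consumer-GPU loading via bitsandbytes 8-bit quantization.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "defog/sqlcoder"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,   # ~2x memory saving vs. float16; load_in_4bit saves more
    device_map="auto",
)
```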

RecMind: The Future of Personalized Recommendations

LLMs are getting smarter by the day, capable of solving complex math problems and generating creative stories. However, when it comes to offering personalized recommendations (like suggesting a book based on your preferences), there has always been room for improvement.

Enter RecMind, an innovative digital assistant powered by LLMs. This nifty tool is designed to think, remember, and act on its own to give you targeted personal suggestions. Instead of merely providing a one-size-fits-all answer, RecMind crafts recommendations tailored to the user by reflecting on its past interactions and tapping into a wealth of external knowledge.

What really makes RecMind shine is its unique "Self-Inspiring" feature. In the world of tech, this is akin to a master chef tasting and refining a dish at every stage of cooking. RecMind continually checks its past steps to ensure it's on the right path, and that its recommendations are as accurate as possible.
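
In rough pseudocode, a Self-Inspiring planning loop might look like the sketch below. This is our own illustrative reading of the idea, not code from the paper, and every name in it (llm.plan, tools.execute, and so on) is a hypothetical stand-in.

```python
# Illustrative pseudocode of a "Self-Inspiring" planning loop; all names
# are hypothetical stand-ins, not APIs from the paper.
def self_inspiring_recommend(llm, query, tools, max_steps=5, branches=2):
    all_paths = [[]]                              # every explored path is retained
    for _ in range(max_steps):
        current = llm.select_path(all_paths)      # pick the path to extend next
        # Propose alternative next steps; the planner sees ALL previously
        # explored paths, so even abandoned branches inform the next step.
        for _ in range(branches):
            step = llm.plan(query=query, history=all_paths, current=current)
            observation = tools.execute(step)     # e.g., look up user or item data
            all_paths.append(current + [(step, observation)])
    return llm.recommend(query=query, evidence=all_paths)
```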

RecMind more than measures up to other LLM tools. In various tests, from predicting ratings to generating reviews, RecMind stood toe-to-toe with the latest models and, in many instances, outdid them.

If you've ever wished for a smart digital assistant that remembers your likes and dislikes and makes suggestions accordingly, RecMind might fulfill your desires. 

To learn how RecMind works in detail, read the full paper!

Self-Alignment with Instruction Backtranslation: Unraveling the Magic of Self-Aligning AI Models

A new method is emerging on the tech horizon that promises to make AI even smarter, without the need for volumes of new data. This groundbreaking technique, "Instruction Backtranslation," is designed to let AI teach itself.

Imagine giving a student a book to read, but instead of you quizzing them, they create their own questions based on the book's content and then answer them. That’s how the Backtranslation method works with AI.

Here's the step-by-step process (sketched in code after the list):

  1. Start Small. The team began with a basic AI model that had learned from only a small set of data, like giving AI a mini textbook.

  2. Self-Quiz. AI then scours the vast trove of information online and crafts its own questions (instruction prompts) based on its findings, essentially creating its own quiz.

  3. Quality Check. The AI handpicks the best answers to ensure that it is learning from the most accurate information.

  4. Level Up. Using these high-quality questions and answers, AI then boosts its knowledge, moving from basic understanding to advanced. 
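
Put together, the loop looks roughly like the illustrative pseudocode below. Every name is a hypothetical stand-in; in the paper, a dedicated "backward" model handles step 2, and the seed-tuned model itself acts as the judge in step 3.

```python
# Illustrative pseudocode of instruction backtranslation; every name here
# (generate_instruction, rate_quality, finetune) is a hypothetical stand-in.
def instruction_backtranslation(seed_model, backward_model, web_docs, rounds=2):
    model = seed_model                             # step 1: start small
    for _ in range(rounds):
        # Step 2 (self-quiz): treat each web document as an *answer* and
        # generate the instruction that could have produced it.
        candidates = [(backward_model.generate_instruction(doc), doc)
                      for doc in web_docs]
        # Step 3 (quality check): the model rates its own pairs and keeps
        # only the top-rated ones (the paper uses a 5-point scale).
        curated = [(instr, doc) for instr, doc in candidates
                   if model.rate_quality(instr, doc) == 5]
        # Step 4 (level up): fine-tune on the curated pairs and repeat.
        model = finetune(model, curated)
    return model
```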

When tested, this method has proven its merit. Enhanced with this approach, Llama outshines its counterparts in various tasks, without relying on loads of pre-processed data.

In essence, "Instruction Backtranslation" equips your AI with a self-driven learning instinct. Advancements like these make the future of tech seem brighter, and much smarter! 

Find more information in the official paper.

Conclusion

The acceleration in AI development over recent years has been meteoric. The advent of generative AI and large language models (LLMs) has turbocharged this growth, making it increasingly challenging to keep up with key developments in the field.

In our latest edition of the "Provectus AI Review" series, we explored the cutting-edge Falcon LLM and the innovative Code Llama. We looked into reinforced self-training (ReST), and the intriguing Algorithm of Thoughts. We examined the state-of-the-art Defog SQLCoder, among other notable tools and emerging research.

As the AI landscape continues to evolve, Provectus will keep its finger on the pulse, bringing you comprehensive reviews and insights on innovative technologies and methodologies as they emerge. 

Stay tuned for future editions!


Author: Marlon Cajamarca Vega — Machine Learning Engineer & AI Educator || Provectus || ML Ed Solutions


Moving Forward — Learn more about Provectus AI expertise

  1. A Comparison of Large Language Models (LLMs) in Biomedical Domain

  2. An Instruction-following GPT-J Model Based on Instructions Generated by Bloom

  3. Exploring Intelligent Search Solutions: A Comparative Analysis of Amazon Kendra Integration and Large Language Model Crawlers

  4. Provectus AI review 1 — Google I/O 2023 Overview

  5. Provectus AI review 2 — The False Promise of Imitating Proprietary LLMs

  6. Provectus AI review 3 — Progress in Gen AI and Open-Source LLMs, New Product Launches, and Educational Resources

  7. Provectus AI review 4 — Llama 2 Release, Hugging Face Updates, OpenAI Availability and Deprecation, and “Superalignment” Vision
