The Power of Indexing in Context Searching
Hossein Chegini’s Post
-
It's a beautiful Sunday evening, and I just finished writing a post on deploying the Phi-3 instruct model using vLLM, an open-source library designed for efficient LLM inference and serving, on a free Google Colab T4 instance, using FastAPI and exposing the service to the world through ngrok. This gives you a GPU-based LLM inference server to play around with small open-source LLMs. Enjoy! https://lnkd.in/eDPZNd_8 Free Version - https://lnkd.in/ee73tMcj #AI #MachineLearning #FastAPI #ngrok #vLLM #MicrosoftPhi #GoogleColab
Deploying and Inferencing Microsoft Phi-3 using vLLM and Google Colab : A Free Hosted LLM API…
medium.com
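The setup described above can be sketched roughly as follows. This is a minimal illustration, not the post's actual code: the model ID, port, and the `/generate` route are assumptions, and the heavy dependencies (vllm, fastapi, pyngrok) are imported lazily so the prompt-formatting helper stays importable anywhere.

```python
# Sketch: serve a Phi-3 instruct model with vLLM behind FastAPI, tunneled via ngrok.

def format_phi3_prompt(user_message: str) -> str:
    """Wrap a user message in Phi-3-instruct's chat template."""
    return f"<|user|>\n{user_message}<|end|>\n<|assistant|>\n"

def build_app(model_id: str = "microsoft/Phi-3-mini-4k-instruct"):
    from fastapi import FastAPI
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_id)  # downloads the model; needs a GPU (e.g. a Colab T4)
    app = FastAPI()

    @app.post("/generate")
    def generate(payload: dict):
        params = SamplingParams(temperature=0.7, max_tokens=256)
        outputs = llm.generate([format_phi3_prompt(payload["prompt"])], params)
        return {"text": outputs[0].outputs[0].text}

    return app

if __name__ == "__main__":
    import uvicorn
    from pyngrok import ngrok

    public_url = ngrok.connect(8000)  # expose the local server to the world
    print("Public URL:", public_url)
    uvicorn.run(build_app(), host="0.0.0.0", port=8000)
```

Once running, any HTTP client can POST `{"prompt": "..."}` to the public ngrok URL and get a completion back.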
-
Data Scientist at Bridgeweave | Machine Learning| Deep Learning | Former Member of COEP's Data Science and AI Club
Just tried fine-tuning Microsoft's phi-2 using the simple yet powerful DPO (Direct Preference Optimisation) on Hugging Face. I know I am late to the fine-tuning party, but I had fun and learnt a lot by doing this. I have jotted down my learnings and experience in this blog post, inspired by Maxime Labonne's great post on fine-tuning using DPO; I love and follow his work on creating and tuning LLMs! #artificialintelligence #largelanguagemodels #finetuning
Phi -2ning: Tuning of a Small Language Model.
link.medium.com
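A DPO run like the one described can be sketched with Hugging Face TRL. This is an illustrative outline, not the blog post's code: the dataset choice is an assumption (any dataset of prompt/chosen/rejected pairs works), and TRL's `DPOTrainer` API has shifted across versions, so check your installed version's signature.

```python
# Sketch: DPO fine-tuning of microsoft/phi-2 with Hugging Face TRL.

def to_preference_record(prompt: str, chosen: str, rejected: str) -> dict:
    """DPO trains on preference pairs: for each prompt, one preferred
    and one rejected completion."""
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

def train():
    from datasets import load_dataset
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from trl import DPOConfig, DPOTrainer

    model = AutoModelForCausalLM.from_pretrained("microsoft/phi-2")
    tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
    # Illustrative dataset; it already has prompt/chosen/rejected columns.
    dataset = load_dataset("Intel/orca_dpo_pairs", split="train")

    # beta scales the implicit KL penalty against the frozen reference model.
    config = DPOConfig(output_dir="phi2-dpo", beta=0.1, per_device_train_batch_size=2)
    trainer = DPOTrainer(model=model, args=config,
                         train_dataset=dataset, processing_class=tokenizer)
    trainer.train()
```

In practice you would add LoRA adapters and 4-bit loading to fit phi-2 DPO on a single consumer GPU.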
-
Check out my second post in a series on evaluating RAG on unstructured data with a custom test dataset
Let's evaluate Llama3's RAG performance on an arXiv paper (about RAG evaluation, this is a very meta notebook, puns intended)! In this tutorial, we build on our work earlier this week, combining Unstructured's API for PDF document preprocessing, OpenAI's GPT-4o + ragas for evaluation, Hugging Face for Llama3, and LangChain to integrate all of these systems.

Llama3's RAG metrics for our quick example: context precision 0.9867, faithfulness 0.8297, answer relevancy 0.8643, context recall 0.9733.

Try this out with your favorite unstructured data by swapping the arXiv PDF URL in the notebook! Colab Notebook: https://lnkd.in/gSndzTuC
Google Colab
colab.research.google.com
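The scoring step of a pipeline like this can be sketched with ragas. This is a minimal outline under assumptions, not the notebook itself: column names follow ragas's expected schema, and the `evaluate` call invokes an LLM judge (OpenAI by default), so an API key is required to actually run it.

```python
# Sketch: scoring a RAG pipeline with ragas on the four metrics reported above.

def make_eval_row(question, contexts, answer, ground_truth):
    """ragas expects one row per question: the retrieved contexts, the model's
    answer, and a reference answer for the recall-style metrics."""
    return {"question": question, "contexts": contexts,
            "answer": answer, "ground_truth": ground_truth}

def run_ragas(rows):
    from datasets import Dataset
    from ragas import evaluate
    from ragas.metrics import (answer_relevancy, context_precision,
                               context_recall, faithfulness)

    ds = Dataset.from_list(rows)
    # Returns a result object with one aggregate score per metric.
    return evaluate(ds, metrics=[context_precision, faithfulness,
                                 answer_relevancy, context_recall])
```

Each row is produced by running your retriever and generator (here, Llama3 behind LangChain) on one test question.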
-
dear linkedin, sorry i haven't been paying attention to you as a channel. a fren informed me (and i agree) that there seems to be a severe lack of high quality shitposts. allow me to see if i can punch through the algo.

firstly: https://lnkd.in/gu3ER_BJ -- if you have a foss/oss project and you need compute, fill this out. we'll either give it to you or find it for you and pay for it. no strings.

secondly: this https://lnkd.in/g7jx2cvr comes out soon

thirdly:

```
# To activate this environment, use
#
#     $ conda activate tinygptv
#
# To deactivate an active environment, use
#
#     $ conda deactivate

(base) jovyan@ef0cd3ef5a42:~/work/TinyGPT-V$
```

there's about to be a pool of 4TB of embeddings from all across the internet for you to pull from, run RAG on, clean, chop up, do whatever you want with, sitting under this jupyterhub. working on something oss? have a login, make some cool shit.

only 2 days into '24, will let you know what we can get done in another week. gg wp
Communal Compute Cluster Grants & Contributions
docs.google.com
-
7.7k+ @ LinkedIn with 150k+ impressions || IIT Madras student, BS in Data Science & Applications || Proficient in Python, ML, and Data Visualization || Passionate about turning data into actionable insights
Depth-First Search (DFS) is a graph traversal algorithm that explores as far as possible along each branch before backtracking. It can be implemented using recursion or a stack. #Datasofta
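Both implementation styles mentioned above can be sketched in a few lines (the graph here is an illustrative adjacency-list example):

```python
# DFS two ways: recursion, and an explicit stack instead of the call stack.

def dfs_recursive(graph, start, visited=None):
    """Visit `start`, then recurse into each unvisited neighbour."""
    if visited is None:
        visited = []
    visited.append(start)
    for neighbour in graph[start]:
        if neighbour not in visited:
            dfs_recursive(graph, neighbour, visited)
    return visited

def dfs_iterative(graph, start):
    """Same traversal, driven by an explicit stack."""
    visited, stack = [], [start]
    while stack:
        node = stack.pop()
        if node not in visited:
            visited.append(node)
            # Push neighbours in reverse so they pop in the original order.
            stack.extend(reversed(graph[node]))
    return visited

graph = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
print(dfs_recursive(graph, "A"))  # ['A', 'B', 'D', 'C']
print(dfs_iterative(graph, "A"))  # ['A', 'B', 'D', 'C']
```

Both versions explore one branch fully (A → B → D) before backtracking to C, which is exactly the depth-first property.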
-
Senior Engineering Manager @Sony Pictures Entertainment - Crunchyroll | Ex Walmart | Leadership, ML/AI, cloud development and agile transformations.
Building a Basic CNN: The MNIST Dataset

Excited to share a new notebook where we delve into building a simple CNN-based architecture to classify the 10 digits (0-9) of the MNIST dataset using Keras. The MNIST dataset (Modified National Institute of Standards and Technology dataset) is a large collection of handwritten digits commonly used for training image processing systems. In this exploration, we cover the essential steps: importing the libraries and the dataset, preparing the data (train-test split and specifying the input shape), understanding the CNN architecture, and finally fitting and evaluating the model. You can play around by updating hyperparameters like batch size, epochs, and dropout, or by adding more convolutional layers, and see the difference in the model's loss and accuracy. Let's dive into the world of CNNs! #CNN #Keras #MachineLearning #DataScience Kaggle link: https://lnkd.in/gkBtGzs8 Colab link:
Google Colab
colab.research.google.com
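The steps the notebook walks through can be sketched as below. The layer sizes and hyperparameters here are illustrative defaults, not the notebook's exact values; TensorFlow is imported lazily so the preprocessing helper is testable on its own.

```python
# Sketch: load MNIST, reshape/normalise, train a small CNN with Keras.
import numpy as np

def prepare_images(x: np.ndarray) -> np.ndarray:
    """Scale pixels to [0, 1] and add the channel axis Keras Conv2D expects."""
    return (x.astype("float32") / 255.0).reshape(-1, 28, 28, 1)

def build_and_train(epochs: int = 5, batch_size: int = 128):
    from tensorflow import keras
    from tensorflow.keras import layers

    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    x_train, x_test = prepare_images(x_train), prepare_images(x_test)

    model = keras.Sequential([
        layers.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),   # try adding more conv layers here
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dropout(0.5),                       # a hyperparameter worth tweaking
        layers.Dense(10, activation="softmax"),    # one class per digit 0-9
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
              validation_split=0.1)
    return model.evaluate(x_test, y_test)  # (loss, accuracy)
```

Changing `batch_size`, `epochs`, the dropout rate, or the number of convolutional layers and re-running `build_and_train` is the experiment loop the post suggests.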
-
Made a video on an amazing library called vLLM that is used for accelerating and serving LLMs. It is open source and super simple to use. These reasons make it an amazing library:
✅ Super simple to use.
✅ It uses an algorithm called PagedAttention, which increases the throughput of many LLMs severalfold.
✅ The serving layer is built with the amazing FastAPI library, making it easy to customize.
✅ It also comes with an OpenAI-compatible API: if you have an application built on the OpenAI API, you can simply switch to the vLLM endpoint.
Hope you love the video; I would love to hear your thoughts and feedback. https://lnkd.in/gSQA2jdZ
Exploring the fastest open source LLM for inferencing and serving | VLLM
https://www.youtube.com/
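The drop-in OpenAI compatibility mentioned above can be sketched like this. The model name is illustrative, and the server command is vLLM's documented OpenAI-compatible entrypoint; only the client's `base_url` (and a dummy key) change relative to code that talks to api.openai.com.

```python
# Sketch: pointing an existing OpenAI-client app at a local vLLM server.
# Start the server first (model name is illustrative):
#   python -m vllm.entrypoints.openai.api_server --model microsoft/Phi-3-mini-4k-instruct

def vllm_base_url(host: str = "localhost", port: int = 8000) -> str:
    """vLLM serves the OpenAI-compatible routes under /v1, like api.openai.com."""
    return f"http://{host}:{port}/v1"

def ask(prompt: str, model: str = "microsoft/Phi-3-mini-4k-instruct") -> str:
    from openai import OpenAI

    # The rest of an existing OpenAI-based app stays untouched.
    client = OpenAI(base_url=vllm_base_url(), api_key="not-needed")
    resp = client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}])
    return resp.choices[0].message.content
```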
-
GraphRAG is a powerful extension to the Retrieval-Augmented Generation (RAG) stack that is making a lot of noise thanks to Microsoft's and LlamaIndex's contributions. But the question remains: should YOU be using it? Read more: https://lnkd.in/ePKkKJGF #rag #graphrag #llm #llms #gpt
-
Google just launched an open-source model called #Gemma. It comes in two sizes, 2B and 7B, and each size is released with pre-trained and instruction-tuned variants. What's really exciting is that the 7B model outperforms both Llama (7B, 13B) and Mistral 7B on the major benchmarks for reasoning, code, and math! Plus, it's already #1 on the Hugging Face Open LLM leaderboard! I wonder if this will prompt (pun intended) OpenAI to open-source smaller models for the community; it would be beneficial for everyone! Here's the technical report, if you want to read more: https://lnkd.in/dDJ2BFD2