🚀 Introducing Athene-70B: Redefining Post-Training for Open Models! We’re thrilled to release Athene-Llama3-70B, an open-weight chat model fine-tuned from Meta's Llama-3-70B. With an impressive Arena-Hard score of 77.8%, Athene-70B is approaching top proprietary models like GPT-4o and Claude-3.5-Sonnet. Experience it now in public testing on Chatbot Arena! Blog: https://lnkd.in/g-uHVwEV HuggingFace Model: https://lnkd.in/gwuB5QiP Discord: https://lnkd.in/gJwN2X7w 🌟 #AI #MachineLearning #LLM #Nexusflow #Athene70B #ChatbotArena #Llama3
Nexusflow
Software Development
Palo Alto, California 1,791 followers
Democratize GenAI Agents for Enterprises
About us
Nexusflow's solution enables Generative AI agents that surpass GPT-4 in your workflow and that update continuously and automatically, with security guardrails built in.
- Website: https://nexusflow.ai/
- Industry: Software Development
- Company size: 11-50 employees
- Headquarters: Palo Alto, California
- Type: Privately Held
Locations
- Primary: Palo Alto, California, US
Updates
-
🚀 Exciting News! Check out our brand new short course on function calling, the foundation for AI agents, created by our CEO Jiantao Jiao and founding engineer Venkat Srinivasan, in collaboration with DeepLearning.AI and the legendary Andrew Ng! 🌟
Function calling is a powerful way to extend the capabilities of LLMs and AI agents by letting them use external tools. Our new short course, "Function Calling and Data Extraction with LLMs," created with Nexusflow and taught by Jiantao Jiao and Venkat Srinivasan, demonstrates how to prompt LLMs to form calls to external functions. You'll work with NexusRavenV2-13B, a 13B-parameter open-source model that excels at function calling while being small enough to host locally. You'll use function calling to extract structured data from unstructured text and to access web APIs, and you'll build an end-to-end application that processes customer-service transcripts — analyzing feedback, automating data entry, and enhancing search. Get started here: https://lnkd.in/g9FpiCuH
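To make the idea concrete, here is a minimal, framework-free sketch of the function-calling loop the course teaches: show the model a function's signature and docstring, ask it to reply with a single call, then parse and dispatch that call safely. The prompt format, `render_tool_prompt`, and the example weather tool are illustrative assumptions, not NexusRaven's exact prompt format.

```python
import ast
import inspect

def get_weather(city: str, unit: str = "celsius") -> str:
    """Return the current weather for a city."""
    return f"Weather for {city} in {unit}"

def render_tool_prompt(func, user_query: str) -> str:
    # Show the model the function signature and docstring,
    # then ask it to respond with a single Python-style call.
    sig = f"def {func.__name__}{inspect.signature(func)}"
    doc = inspect.getdoc(func)
    return (
        f'Function:\n{sig}\n    """{doc}"""\n\n'
        f"User query: {user_query}\n"
        "Respond with exactly one call to the function above."
    )

def execute_call(call_str: str, tools: dict):
    # Parse the model's output as a Python expression and dispatch it
    # to a whitelisted tool -- never eval() raw model output.
    node = ast.parse(call_str.strip(), mode="eval").body
    name = getattr(getattr(node, "func", None), "id", None)
    if not isinstance(node, ast.Call) or name not in tools:
        raise ValueError(f"Unexpected call: {call_str}")
    args = [ast.literal_eval(a) for a in node.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}
    return tools[name](*args, **kwargs)

# Suppose the model replied with this call string:
model_output = "get_weather('Palo Alto', unit='fahrenheit')"
result = execute_call(model_output, {"get_weather": get_weather})
print(result)  # Weather for Palo Alto in fahrenheit
```

The whitelist-and-parse step matters: `ast.literal_eval` only accepts literal arguments, so the model cannot smuggle arbitrary code into a tool call.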
-
Nexusflow reposted this
🔍 What Starling-LM-7B-beta's excellent performance tells us about benchmarks
I compared the performance of Nexusflow's model across various benchmarks. On the Chatbot Arena Leaderboard (https://lnkd.in/eBaze79A), this 7B model impressively outperforms many larger models, including GPT-3.5-Turbo, Mixtral, Gemini Pro, and every fine-tuned version of Llama 2 70B. It is also significantly better than Mistral-7B-Instruct-v0.2, which ranks 30th with an Elo score of 1073.
On my leaderboard (https://lnkd.in/gjWKF-5u), its average score is below Mistral-7B-Instruct-v0.2's, but the evaluation also suggests that Starling is better:
- Much higher AGIEval score (+5.71), and AGIEval is an excellent benchmark.
- Higher BigBench (+2.31) and GPT4All (+2.06) scores; BigBench is about as good as AGIEval.
- Worse TruthfulQA (-10.37), which shows how unreliable that benchmark is. Don't trust it.
On the other hand, Starling doesn't perform much better than OpenChat 3.5 0106, the base model it was fine-tuned from — yet OpenChat is also ranked 30th in the Arena, with an Elo score of 1072. This is a gap the benchmark suite couldn't capture.
Can MT-Bench capture it? Not really: Starling scores 8.12 vs. 7.8 for OpenChat, and it is also outperformed by GPT-3.5-Turbo and Mixtral. Not too surprising, considering MT-Bench evaluates models on a set of multi-turn (conversational) questions.
Then MMLU, surely? Not at all. Starling obtained an MMLU of 65.14 vs. 65.04 for OpenChat, 60.78 for Mistral, and 71.88 for Mixtral.
If I had to guess, I'd say Starling's PPO fine-tuning significantly improves the usefulness (but not the accuracy) of its answers, which these benchmarks don't capture correctly. This points to a gap in the current evaluation pipeline. Rather than introducing yet another evaluation set, it would be more efficient to use an LLM-as-a-judge focused specifically on the usefulness of responses.
That doesn't mean the Chatbot Arena is perfect. For example, it doesn't handle multi-turn conversations, and simply increasing verbosity is known to inflate Elo scores — which is exactly what Starling does, being generally much more verbose than OpenChat. Considering their excellent performance, I'd be very curious to see how 7B merges would rank on the Chatbot Arena Leaderboard (cc Arcee.ai). 👀 🤗 Model: https://lnkd.in/eyC7hfDF
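For intuition on how small the 1073 vs. 1072 gap above really is, the standard Elo formula converts a rating difference into an expected head-to-head win rate. This is textbook Elo math, not Chatbot Arena's exact rating computation:

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected win probability of A over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# A 1-point gap (Starling at 1073, OpenChat at 1072) is statistical noise:
print(round(elo_expected_score(1073, 1072), 4))  # 0.5014

# A 100-point gap, by contrast, implies roughly a 64% expected win rate:
print(round(elo_expected_score(1173, 1073), 2))  # 0.64
```

In other words, two models one Elo point apart are expected to split head-to-head matchups almost exactly 50/50, which is why the leaderboard alone can't separate Starling from its base model.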
-
Have we really squeezed out the capacity of a compact chat model? Thrilled to see our latest open model, Starling-7B, ranks 13th among all models in Chatbot Arena! 🚀 As a 7B model, Starling surpasses larger open and proprietary models, including Claude-2, GPT-3.5-Turbo, Gemini Pro, Mixtral 8x7B and Llama2-70B, and is currently the best 7B chat model in Chatbot Arena! #Chatbot #AI #LLM
-
Nexusflow reposted this
The beta version of Starling has finally arrived! 🚀 Presenting Starling-LM-7B-beta, our cutting-edge 7B language model fine-tuned with RLHF! 🌟 Also introducing Starling-RM-34B, a Yi-34B-based reward model trained on our Nectar dataset that surpasses our previous 7B RM on all benchmarks. ✨ We've fine-tuned the latest OpenChat model with the 34B reward model, achieving an MT-Bench score of 8.12 while performing better on hard prompts than Starling-LM-7B-alpha. Testing will soon be available on lmsys — please stay tuned! 🔗 HuggingFace links: [Starling-LM-7B-beta] https://lnkd.in/gT2EAN3r [Starling-RM-34B] https://lnkd.in/gxwbcdsi Discord link: https://lnkd.in/g6Q3UiM8 Since the release of Starling-LM-7B-alpha, we've received numerous requests to make the model commercially viable. We're therefore licensing all models and datasets under Apache-2.0, with the condition that they are not used to compete with OpenAI. Enjoy!
-
🚀 Presenting Starling-LM-7B-beta, our new cutting-edge 7B language model fine-tuned with RLHF! 🌟 Also introducing Starling-RM-34B, the workhorse reward model behind Starling-LM-7B-beta, ranking #1 on the latest RewardBench from Nathan Lambert and the Allen Institute for AI (AI2) team. 🔗 HuggingFace links: [Starling-LM-7B-beta] https://lnkd.in/ecM4JXG5 [Starling-RM-34B] https://lnkd.in/erTkfu4N 🔗 Discord link: https://lnkd.in/eBE73FaF 🔗 RewardBench from @allenai_org: https://lnkd.in/eKHJaFjJ Since the release of Starling-LM-7B-alpha, we've received numerous requests to make the model commercially viable. We're therefore licensing all models and datasets under Apache-2.0, with the condition that they are not used to compete with OpenAI. Enjoy!
-
📢 A powerful information-extraction app built by Stefano Fiorucci and the deepset Haystack team, using the #NexusRaven-V2 LLM for function calling! 🔥 We're thrilled to empower high-quality GenAI apps with our compact LLMs and tooling.
🧪🐦⬛📑 𝐅𝐫𝐨𝐦 𝐫𝐚𝐰 𝐭𝐞𝐱𝐭 𝐭𝐨 𝐬𝐭𝐫𝐮𝐜𝐭𝐮𝐫𝐞𝐝 𝐝𝐚𝐭𝐚 𝐰𝐢𝐭𝐡 𝐋𝐋𝐌𝐬
🎯 The challenge
▸ you have a pile of unstructured texts from which you want to extract information in structured form
▸ the desired information can vary dynamically
▸ you want to combine tasks like text classification, NER, summarization, etc.
𝐋𝐚𝐧𝐠𝐮𝐚𝐠𝐞 𝐌𝐨𝐝𝐞𝐥𝐬 𝐰𝐢𝐭𝐡 𝐟𝐮𝐧𝐜𝐭𝐢𝐨𝐧 𝐜𝐚𝐥𝐥𝐢𝐧𝐠 𝐜𝐚𝐩𝐚𝐛𝐢𝐥𝐢𝐭𝐢𝐞𝐬 can be flexible tools 🛠️ for this job! Take a look at the notebook: 📓 https://lnkd.in/dXDtqYyE
🗝️ A (personal) journey
🔹 It all began with Kyle McDonald's gist, in which GPT-3.5-turbo was used to extract structured information from an article.
🔹 Fascinated by the idea, I explored open models fine-tuned for function calling: I experimented with Gorilla OpenFunctions to extract information about animals.
🔹 Now, armed with the powerful 🐦⬛ 𝐍𝐞𝐱𝐮𝐬𝐑𝐚𝐯𝐞𝐧 V2 model (by Nexusflow) and #haystack 2.0, I revisited the experiment and made it more challenging.
✨ Results
🔸 Haystack's LLM framework is model-agnostic, so switching models went smoothly
🔸 NexusRaven V2 outperforms Gorilla OpenFunctions for this use case
🔸 Using a statistical model carries some caveats, which I outline in the notebook.
"Let's unlock the potential of unstructured text, one function call at a time." ☝ That last sentence was generated by ChatGPT, but I found it silly and funny. 😁 #largelanguagemodels #informationextraction #llm #genai #opensource
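The core trick in the post — extraction fields that "vary dynamically" — can be sketched without any framework: expose a single extraction function whose parameters are exactly the fields you want, generated from a field-to-description mapping, and parse the model's call back into a record. The field names, prompt wording, and `extract_info` function are hypothetical stand-ins, not the notebook's actual schema or NexusRaven's prompt format:

```python
import ast

# The desired fields can change at runtime: the "function" the model is
# asked to call is generated from this field -> description mapping.
FIELDS = {
    "animal": "the animal the text is about",
    "habitat": "where it lives",
    "num_legs": "number of legs, as an integer",
}

def extraction_prompt(text: str) -> str:
    # Render the dynamic schema as a function stub for the model.
    params = ", ".join(FIELDS)
    docs = "\n".join(f"    {name}: {desc}" for name, desc in FIELDS.items())
    return (
        f"def extract_info({params}):\n"
        f'    """Extract structured data from the text.\n{docs}\n    """\n\n'
        f"Text: {text}\n"
        "Respond with one call to extract_info."
    )

def parse_extraction(call_str: str) -> dict:
    # Turn the model's call string into a plain dict of extracted fields.
    node = ast.parse(call_str.strip(), mode="eval").body
    if not isinstance(node, ast.Call) or node.func.id != "extract_info":
        raise ValueError(f"Unexpected output: {call_str}")
    return {kw.arg: ast.literal_eval(kw.value) for kw in node.keywords}

# Suppose the model returned this call:
model_output = "extract_info(animal='octopus', habitat='ocean', num_legs=8)"
print(parse_extraction(model_output))
# {'animal': 'octopus', 'habitat': 'ocean', 'num_legs': 8}
```

Because the schema lives in one dict, adding a classification label or a summary field is a one-line change to `FIELDS` — the same flexibility the Haystack notebook gets from swapping extraction functions.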
-
🚀 Exciting breakthrough in LLM reliability! 🧠 NexusRaven-V2, our cutting-edge function-calling LLM, has set a new standard for minimizing AI hallucinations, surpassing GPT-4's performance in a recent independent third-party benchmark. Dive into our latest blog post to explore how we're pioneering reliable agents with minimal hallucinations: [https://lnkd.in/egUU9wpz] Key highlights: 🏆 Zero hallucinations: NexusRaven-V2 achieved zero hallucinations across 840 tests focused on tool selection and usage — a significant leap over GPT-4's 23 hallucinations. 📈 Higher success rates: it boasts a 9% higher success rate than GPT-4 in information-seeking applications that require meticulous attention to detail, and a 4% higher rate in adversarial scenarios that demand a deep understanding of tool documentation, even with vague tool and API argument names. Try NexusRaven-V2 on Huggingface: [https://lnkd.in/eF6r9qgt] Check out the original third-party benchmark: [https://lnkd.in/ehAM5UAi] #GenAI #LLM #NexusRavenV2 #Technology #Innovation
Towards Reliable Agents, with Minimal Hallucination
nexusflow.ai
-
A sincere thank-you to Ben Lorica 罗瑞卡 for hosting us on your platform, The Data Exchange Podcast, and for a fantastic conversation. We're excited to contribute to the foundation of #AI agents and to democratizing the technology! 🎧 Listen to the full episode, "AI Co-Pilots in Action: Transforming Function Calling in Cybersecurity": https://bit.ly/3vGpU4m
🚀 Elevate your cybersecurity game with NexusRaven-V2 🔐🚨 Jian Zhang of Nexusflow on how their advanced AI co-pilot, with unmatched function calling, can transform your tech strategy. #GenAI #Cybersecurity #infosec #rsac #LLM https://lnkd.in/gJwWYYh5
AI Co-Pilots in Action: Transforming Function Calling in Cybersecurity
http://thedataexchange.media
-
Thank you, Deci AI and Harpreet Sahota 🥑, for featuring NexusRaven-V2 in the top 10 compact & robust models. Stay tuned for what's to come in 2024!
Check out Harpreet Sahota 🥑's latest blog diving into the world of smaller LLMs. Despite their compact size, these models are making waves in performance, challenging our understanding of efficiency and capability in AI. This blog explores LLMs with 1 billion, 3 billion, 7 billion, and 13 billion parameters, covering their training data, popularity metrics, and unique contributions. Don't miss it! 🚀 Read now > https://lnkd.in/gD-2e63M #llms #llm #largelanguagemodels #generativeai