Artificial Analysis’ Post


SambaNova is now offering an API endpoint to access Llama 3 8B on its RDU chips, which we previously benchmarked at 1,084 output tokens/s.

SambaNova Systems is also differentiating itself from other offerings by allowing users to bring their own fine-tuned versions of the models. They appear to be offering API access on shared-tenant systems, so supporting user-supplied fine-tuned models sets them apart from other providers, who typically require single-tenant dedicated deployments for this. This likely leverages the memory advantages of their SN40L chip.

Access is being offered on an upon-request basis. This is a next step toward an open-access commercial API offering that allows all AI developers to use its custom-silicon RDU chip. We look forward to listing any commercial open-access API offerings powered by SambaNova chips on the main Artificial Analysis leaderboards in the future!

SambaNova Systems’ Post

Are you looking to unlock lightning-fast inference at 1,000+ tokens/sec on your own custom Llama 3? Introducing SambaNova Fast API, available today with free token-based credits to make it easier to build AI apps like chatbots and more. Bring your own custom checkpoint for both Llama 8B and Llama 70B and avoid the cost of acquiring hundreds of chips to get started.

Why does this matter? The next phase of AI is Agentic AI; you’ll need lots of models, big and small, working together as one system. Development teams will require ultra-fast token generation, which we know cannot be achieved with GPUs. That is not all… you’ll need to host lots of models concurrently, with instantaneous switching between these models, which we know can’t be achieved with other architectures due to their inefficiency. You can’t get this speed, with a diversity of models, including your own custom model, behind a simple API anywhere else!

SambaNova Fast API is available now: https://lnkd.in/g9W_Bnjv #FastAI #RDU #API
