Artificial Analysis

Technology, Information and Internet

The leading independent benchmark for LLMs - compare quality, speed and price to pick the best model for your use case

About us

Website: https://artificialanalysis.ai/
Industry: Technology, Information and Internet
Company size: 11-50 employees
Type: Privately Held

Updates

  • Artificial Analysis

Thanks for the support, Andrew Ng! We completely agree: faster token generation will become increasingly important as a greater proportion of output tokens is consumed by models, such as in multi-step agentic workflows, rather than being read by people.

    Andrew Ng

    Founder of DeepLearning.AI; Managing General Partner of AI Fund; Exec Chairman of Landing AI

    Shoutout to the team that built https://lnkd.in/g3Y-Zj3W . Really neat site that benchmarks the speed of different LLM API providers to help developers pick which models to use. This nicely complements the LMSYS Chatbot Arena, Hugging Face open LLM leaderboards and Stanford's HELM that focus more on the quality of the outputs. I hope benchmarks like this encourage more providers to work on fast token generation, which is critical for agentic workflows!

    Model & API Providers Analysis | Artificial Analysis

    artificialanalysis.ai

  • Artificial Analysis reposted this

    Andrew Ng

    Founder of DeepLearning.AI; Managing General Partner of AI Fund; Exec Chairman of Landing AI

I've been playing with SambaNova Systems' API serving Llama 3.1 405B at high speed. Really cool to see a leading model running this fast. Congrats to SambaNova for hitting a 114 tokens/sec speed record (and also thanks Kunle Olukotun for getting me an API key!) https://lnkd.in/gQF-PmnK

    Llama 3.1 405B 4X faster on SambaNova | World Record

    sambanova.ai

  • Artificial Analysis reposted this

    Cerebras Systems

    Verified by Artificial Analysis, Cerebras Inference achieves 1,850 tokens/sec on Llama 3.1 8B and 450 tokens/sec on Llama 3.1 70B! By dramatically reducing processing time, we're enabling more complex AI workflows and enhancing real-time LLM intelligence. This includes a new class of intelligent agents that can “think faster” than ever before. Cerebras Inference will power a new era of Instant AI.
    👉 Try it today: https://lnkd.in/gEJJ2pfY
    👉 Read our blog: https://lnkd.in/gZ46q4cD
    👉 Check out Artificial Analysis for more data: https://lnkd.in/gY6NAFqW

  • Artificial Analysis reposted this

    AI21 Labs

    We know that #Jamba 1.5 models are the fastest, but the question is: how fast? Artificial Analysis tested our models to find out 😎

    Jamba 1.5 models are a whole lot faster, and that speed delta only grows with longer prompts. High throughput by itself is never enough; it's all about optimizing the speed-cost-quality triangle. Thanks to Jamba’s architecture, we offer high speed at a very competitive price (in the image below, QII is where you want to be). But our cost, efficiency and speed don't come at the expense of quality: #Jamba 1.5 Large and Mini both show a great balance between speed and quality (QI is where you want to be).

    Check out the graphs below from Artificial Analysis and find out more: www.ai21.com/jamba

  • Artificial Analysis reposted this

    AI at Meta

    Announced this morning and verified by Artificial Analysis, Cerebras Systems Inference is capable of serving Llama 3.1 70B at 450 tokens/sec and Llama 3.1 8B at 1,850 tokens/sec! This order of magnitude increase in inference speed for Llama 3.1 could unlock all new types of use cases for the developer community and enterprises.

    Cerebras Systems

    Meet Cerebras Inference – the fastest inference for generative AI!
    🏎️ Speed: 1,800 tokens/sec for Llama 3.1-8B and 450 tokens/sec for Llama 3.1-70B, 20x faster than NVIDIA GPU-based hyperscale clouds.
    💸 Price: Cerebras Inference offers the industry’s best price-performance at 10c per million tokens for Llama 3.1-8B and 60c per million tokens for Llama 3.1-70B.
    🎯 Accuracy: Cerebras Inference uses native 16-bit weights for all models, ensuring the highest-accuracy responses.
    🔓 Access: Cerebras Inference is open to everyone today via chat and API access.
    All powered by our third-generation Wafer Scale Engine (WSE-3).
    Try it now 👉 https://lnkd.in/gEJJ2pfY
    Press Release: https://lnkd.in/gtF5fxHt
    Blog: https://lnkd.in/gZ46q4cD

  • Artificial Analysis

    Cerebras has set a new record for AI inference speed, serving Llama 3.1 8B at 1,850 output tokens/s and Llama 3.1 70B at 446 output tokens/s.

    Cerebras Systems has just launched its API inference offering, powered by its custom wafer-scale AI accelerator chips. Cerebras Inference is achieving the fastest speeds we have ever benchmarked on Artificial Analysis for Llama 3.1 8B and 70B. Pricing is also competitive at $0.1 per 1M tokens for Llama 3.1 8B and $0.6 per 1M tokens for Llama 3.1 70B. Cerebras is currently serving both models with an 8K context window (compared to the Llama 3.1 series’ native 128K context).

    Cerebras’ Llama 3.1 8B offering is nearly 10x faster than the speeds offered by OpenAI, Google and Anthropic for their current small models - GPT-4o mini, Gemini 1.5 Flash and Claude 3 Haiku.

    Cerebras Inference is powered by the Cerebras WSE-3, a custom 5nm AI chip built on a unique wafer-scale design. A single WSE-3 chip is over 50x larger in total area than an Nvidia H100 and hosts 900,000 cores with 44GB of on-chip memory (SRAM).

    Faster AI inference enables a new generation of AI applications, from sophisticated agentic workflows to instant search experiences. See below for further charts and links to our full benchmark results.
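    As a rough illustration of what these figures mean in practice, here is a back-of-the-envelope sketch using the speeds and prices quoted above; the 1,000-token response size is an illustrative assumption, not a benchmark parameter:

    ```python
    # Latency and cost for a single model response, using the output speeds
    # (tokens/sec) and prices ($ per 1M output tokens) quoted in the post.
    # The 1,000-token response length is an assumption for illustration.

    endpoints = {
        # name: (output tokens/sec, USD per 1M output tokens)
        "Cerebras Llama 3.1 8B": (1850, 0.10),
        "Cerebras Llama 3.1 70B": (446, 0.60),
    }

    output_tokens = 1000
    for name, (tps, usd_per_million) in endpoints.items():
        seconds = output_tokens / tps
        cost = output_tokens * usd_per_million / 1_000_000
        print(f"{name}: {seconds:.2f}s per response, ${cost:.6f}")
    ```

    At these speeds a 1,000-token step completes in well under a second, which is what makes chaining many such steps in an agentic workflow practical.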

  • Artificial Analysis

    AI21 Labs launches Jamba 1.5 Mini and Large! These models utilize a hybrid state space/transformer architecture to maintain output speed with long context inputs.

    Both Mini and Large launch with 256K context windows, the largest of all major open weights models. Their hybrid state space/transformer architecture supports the models in maintaining high performance over long prompt lengths. Jamba 1.5 is leading in speed compared to models of a similar intelligence class (comparing to the median performance across providers for each model).

    Artificial Analysis has independently benchmarked Jamba 1.5 across quality, API performance and price. See below for further charts and a link to our full analysis of AI21’s models.

  • Artificial Analysis

    Ideogram’s v2 Text to Image model, released today, has launched on our Image Arena! We have crowdsourced >200k user preferences to independently evaluate the quality of Text to Image models.

    Stay tuned for how Ideogram's new model compares to Midjourney v6.1, Black Forest Labs' FLUX.1 [pro], Stability AI's Stable Diffusion 3 Medium and other models as preferences are submitted in the coming days. Early comparisons suggest a strength of Ideogram v2 is adding text clearly to images, in line with the style of the image (as in the image generated below), something Text to Image models have historically struggled with.

    We have also commenced performance benchmarking of Ideogram’s API, which is currently in beta. Generations are priced at $80 per 1k images, in line with Midjourney. See below for a link to join Artificial Analysis’ Text to Image Arena 👇

  • Artificial Analysis

    Groq has just launched its record-breaking Distil-Whisper endpoint! With a Speed Factor of 240x, it is the fastest Speech to Text endpoint we have benchmarked. It is also the lowest-cost Speech to Text endpoint we benchmark, at $0.33 per 1,000 minutes of audio. This means you could transcribe all the Star Wars movies (~27 hours) in under 7 minutes for less than $1 ($0.53).

    However, it is important to note that Distil-Whisper is English-language only and has a higher Word Error Rate than Whisper v3. Per our independent quality evaluation, Distil-Whisper has a Word Error Rate of 12.7%, higher than Whisper v3’s 10.3%. As such, this endpoint is suited to English-language use cases that prioritize speed and cost over a marginal decrease in accuracy.

    Link to our analysis below 👇
