Artificial Analysis’ Post

Artificial Analysis reposted this

Andrew Ng

Founder of DeepLearning.AI; Managing General Partner of AI Fund; Exec Chairman of Landing AI

Shoutout to the team that built https://lnkd.in/g3Y-Zj3W . Really neat site that benchmarks the speed of different LLM API providers to help developers pick which models to use. This nicely complements the LMSYS Chatbot Arena, Hugging Face open LLM leaderboards and Stanford's HELM that focus more on the quality of the outputs. I hope benchmarks like this encourage more providers to work on fast token generation, which is critical for agentic workflows!

Model & API Providers Analysis | Artificial Analysis

artificialanalysis.ai

Amanda Brock

🇺🇦CEO OpenUK/ SOOCon25; Computer Weekly 50 Most Influential Women Tech 23; Computing IT Leaders 100 23 &24; Board Member; Advisor; Writer; International Keynote; Editor: Open Source Law, Policy & Practice; AuDHD

1mo

Be great to see them incorporating this work into how Hugging Face assesses openness https://www.nature.com/articles/d41586-024-02012-5

Impressive and insightful work! This study aligns perfectly with our observations, backed by scientific rigor. I recommend extending the analysis to rerankers. Also, Groq is fantastic! Microsoft should seriously consider their LPU for GPT inference ;-)

Pawel Manowiecki

Data & AI Solution Architect at Data Wizards

1mo

I vote yes! I am looking at this portal to find new ideas. One thing I especially like is that it offers a way to discover some Davids among the Goliaths early ;) E.g. I thought Groq was the best right now in STT/ASR inference for Whisper. Now I have learned about the Wizper experiment done by Fal.ai https://fal.ai/models/fal-ai/wizper/playground

Anyscale put out something similar last year for benchmarking LLMs on multiple API providers: https://github.com/ray-project/llmperf

Thomas Bustamante

Founder & CEO at Next Realm AI | Artificial Intelligence | Venture Capital | Capital Markets

1mo

Llama and Gemini look good on both speed and price

Thanks for the support Andrew Ng! Completely agree, faster token generation will become increasingly important as a greater proportion of output tokens are consumed by models, such as in multi-step agentic workflows, rather than being read by people.

Jeffrey Jiang

CS Student (Econ minor, QIS certificate) at UT Austin

1mo

Benchmarking is always highly dependent on methodology, especially with pretty subjective and high-level statements like the ones here... disclaimer aside, the pretty graphs are nice and the blanket opinions represented by them are both interesting and useful. I do hope Google, Facebook, and the rest do interesting research rather than getting bogged down in catch-up wars though... that's something these graphs won't highlight, and many AI fields are evolving pretty quickly.

Impressive work on the benchmarking site! Speed matters, just like in Formula 1 pit stops. Andrew Ng

Suresh Chekuri

Principal Data Scientist @ Rubus Digital Pvt. Ltd. | Machine Learning, Deep Learning, Generative AI, MLOps

1mo

Picking the right LLM/API can be a challenging decision as there are many factors to consider - quality, speed, price etc. artificialanalysis.ai is a fantastic resource. Kudos to the team for the great work.

Vikram Bandarupalli

Sales Engineering |Data & Analytics| Helping Organizations do more with Data| Stanford GSB

1mo

Great to see these LLM benchmarks. Salesforce did something similar https://www.salesforceairesearch.com/crm-benchmark
