Over the last few months, I've been speaking with many customers about LLM governance. The reality is that, long term, organizations will need to implement custom guardrails specific to their use cases and industries, in addition to standard off-the-shelf guardrails (e.g. hate speech detection, violence, etc.). NVIDIA's NeMo Guardrails is an excellent library for this. And you can fairly easily deploy it as middleware in Snowflake Snowpark Container Services, wrapping our Cortex LLM functions, which provide easy, instant access to a large library of both closed-source and open-source LLMs. Check out how to build and deploy such an architecture here 👇🏻: https://lnkd.in/eXWt6MXe
Chase Ginther’s Post
-
✳️NVIDIA made Pandas 50x faster with no code change!✳️
🍀Just do this:🍀
%load_ext cudf.pandas
import pandas as pd
It's now integrated directly into Google Colab. To enable this feature, make sure to activate a GPU runtime. This update is part of RAPIDS cuDF, which accelerates data manipulation tasks, making it ideal for large datasets. https://lnkd.in/eN_zDyiV
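Outside a notebook, RAPIDS documents running a plain script under the accelerator with `python -m cudf.pandas my_script.py`. The point of "no code change" is that the pandas code itself stays ordinary pandas, like this sketch (it runs unchanged on CPU pandas; the GPU path only kicks in when cuDF and a GPU are present):

```python
# Ordinary pandas code: no cuDF imports, no API changes. Running it via
# `python -m cudf.pandas script.py` (or %load_ext cudf.pandas in a
# notebook) transparently dispatches supported operations to the GPU,
# falling back to CPU pandas otherwise.
import pandas as pd

df = pd.DataFrame({
    "user": ["a", "b", "a", "c", "b", "a"],
    "spend": [10.0, 5.0, 7.5, 3.0, 8.0, 2.5],
})

# A typical groupby/aggregate, the kind of operation that benefits most
# at large dataset sizes.
totals = df.groupby("user")["spend"].sum().sort_values(ascending=False)
print(totals)
```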
-
Infosec leader, Responsible AI, Data Protection, Cyber-Psychology amateur, providing thought leadership and business strategy. AI Governance Professional (IAPP), ex CISSP Instructor
This is pretty big (literally!). Full release of Grok-1, a 314B-parameter model, with source code. It's a big download, around 300 GB, and it's recommended you use a GPU cluster; keen to hear if anyone runs it locally with any success :) Code is at https://lnkd.in/gruzTH_i https://x.ai/blog/grok-os
Open Release of Grok-1
x.ai
-
The latest RAPIDS cuDF update brings up to 30x acceleration to pandas when working with large and text-heavy datasets. Read about the new unified memory and text preprocessing features in the announcement blog: https://nvda.ws/3WPWGtL
RAPIDS cuDF Unified Memory Accelerates pandas up to 30x on Large Datasets | NVIDIA Technical Blog
developer.nvidia.com
-
Problem solver | Deep learning - CV, NLP, Audio Engineering & Research | Building ML application & Learning | SE @ Sony Ind
🚀 Supercharge your Data Science projects with GPU acceleration! I recently benchmarked pandas on the MovieLens 25M dataset using GPU acceleration on Google Colab's A100 GPU. With cuDF, optimizing DataFrame operations becomes seamless. 📊 Discover the speed gains in the benchmarks. Click the link for a hassle-free setup guide. https://lnkd.in/ggRe8yVf 🐼🐼🐼🐼 #DataScience #GPUAcceleration #Pandas #RAPIDScuDF
-
Alignment is for preference / style.
"RAG is for facts" 📚 "Fine-tuning is for form" 🎭 "Pre-training is for the GPU rich" 🤑 GPT-4o fine-tuning is generally available as of today: https://lnkd.in/dGiEkzCS Ever since RAG came to the forefront of discussions, people have been wondering about the need to fine-tune models on their own data. Since both these methods are meant to improve an LLM’s knowledge with new data, it’s important to know when to know when to use what. E.g. a lot of people said that RAG would make fine-tuning obsolete. But it was the same set of people who proclaimed that the launch of LLMs with larger context windows, such as Claude-3, would make RAG obsolete. But both of those are still alive and exist as viable alternatives. Considerations for choosing between RAG and fine-tuning include dynamic vs static performance, architecture, training data, model customization, hallucinations, accuracy, transparency, cost, and complexity. Here's a good AIM article on the topic: https://lnkd.in/dKUt4pB3 #largelanguagemodels #rag #finetuning
-
In a rather strange, very Elon Musk-ish move, xAI has opened the gates to Grok-1, a behemoth of a language model with a staggering 314 billion parameters. Licensed under Apache 2.0 for the greater good, Grok-1 stands as a testament to xAI's commitment to innovation (???) and possibly a slight addiction to colossal numbers. For those itching to tinker with this digital titan, instructions await on xAI's GitHub. Just remember, with great power comes great responsibility... and possibly the need for a bigger server with beefier GPUs. For more details: https://x.ai/blog/grok-os
Open Release of Grok-1
x.ai
-
News: KubeCon EU wrapped up a couple of weeks ago in Paris. Here's a collection of posts with key takeaways from various points of view. Mentions of platform engineering, AI, Dapr, eBPF, Backstage and lots more. https://lnkd.in/dMuQQajs https://lnkd.in/d6EPcHHb https://lnkd.in/d-JDKZ3X https://lnkd.in/d_3-99t9 Also, a great list of Linux tools you'll want to have installed to help debug a crisis, along with an argument for installing them by default rather than trying to do so during an incident. https://lnkd.in/dzUsUPmF
KubeCon EU 2024 Paris: Key Takeaways
danielbryantuk.medium.com
-
Graph semantics and AI trends strategist / Change consultant / Veteran researcher, analyst and reporter
A year ago, Denny Vrandečić of the Wikimedia Foundation, one of the key people behind Wikidata, made some telling remarks about when not to use LLMs during a keynote at The Knowledge Graph Conference: “Why would you ever use a 96 layer LLM with 175 billion parameters to generate a multiplication, which is a single operation with a CPU? Just because an LLM can do these kinds of things doesn’t mean they should be. Why should you be generating knowledge again and again when you can just look it up in a confident way?…. It’s just not very efficient.” What's happened in the year since Vrandečić made this observation? Platform providers such as Fluree are making it possible to harness the power of LLMs and knowledge graphs together in ways that allow more accuracy, efficiency and security. https://t.ly/MyPQ8
GenAI Maturity: From Productivity To Effectiveness - DataScienceCentral.com
datasciencecentral.com
-
Minimal AF. A minimal, persistent, serverless AlphaFold GPU instance thanks to Modal and ColabFold. Implemented in 50 lines of code: https://lnkd.in/gMuVENYk
-
Technology Sherpa with opinions on driving innovation (with governance) through the differentiated use of digital - Data, Apps, and Infrastructure.
Investing $10M and 2 months, Databricks has released DBRX (Base / Instruct), an OSS MoE GenAI model (16 experts total) that outperforms current OSS alternatives and can be privately deployed using MosaicML (4x H100 GPUs with 320GB of RAM). No image support as of yet, and the same restrictions on use as Llama (700M users).
GitHub - databricks/dbrx: Code examples and resources for DBRX, a large language model developed by Databricks
github.com