Stuart Russell’s Post


Professor of Computer Science at UC Berkeley

A new white paper proposing that we should build AI systems that we can understand, predict, control, and make high-confidence statements about: https://lnkd.in/eNRS6ETc

Towards Guaranteed Safe AI: A Framework for Ensuring Robust and Reliable AI Systems

arxiv.org

This is me playing devil's advocate...

1. What prevents a current AD/AV company from taking one look at that and saying, "Look! We are _currently_ doing that with our systems, and thus we are guaranteeing safety!"? I'm sure they can come up with a narrative that fits the paper, especially since the paper is not extremely specific.

2. Exactly which "world model" includes a pedestrian being dragged by a robotaxi after being knocked to the ground? How about something else entirely, say, people trying to put traffic cones on top of its hood? Again, there is talk of world models, but to exactly what extent is any of them one?

3. Exactly what are the edge cases? What determines "extreme values"? What keeps the "edge" from simply being "what we've never bothered to program into the models because we've never expected anything like that"?

Mark Montgomery

Founder & CEO of KYield. Pioneer in Artificial Intelligence, Data Physics and Knowledge Engineering.

2mo

Three words are used consistently and accurately in this paper -- framework, towards, and conceptual. Notably missing is executable, which we can do today with data management. The problem is that the definition of AI in this paper is actually LLMs. Although this is a good step 'towards', it's early and still theoretical, and in the meantime experimental systems with extremely weak safety engineering, as cited at OpenAI, have already been scaled to a naive public, few of whom are in a position to make a judgement on safety, while government representatives are failing to protect them. I'm skeptical that LLMs can achieve the safety-critical equivalents found in other technologies containing catastrophic risk. The physics involved are completely different. It will be a game of cat and mouse with LLMs for the foreseeable future.

Garrett Galloway, D.Sc.

SecEng, Generative AI Security, Red Team, Mentor, Educator.

2mo

I'm not a fan of the title of this. There is no guaranteed safe. I understand the levels of Safety Specification and Audit as defined in the paper, but there's no tractable way to achieve anything beyond level 3, or possibly 4, in safety, and we'll never reach a validation level of 5. We can't test all possible anything in our analog world; trying to do so is a fool's errand.

We currently rely on people for many of the things we need a guarantee of safety on, and people cannot be forced to function in any guaranteed manner. We don't have a sole source of truth on what guaranteed safe means. Also, who audits the auditor, and what is the auditor's guarantee of safety in auditing? Where does the circular logic end?

When you start mucking about with defining safety in interpreted language, you inherently destroy repeatability and reliability - you remove the guarantee. If something is complex enough to interpret free-flowing human language and how we might define safety, it's not going to be reliable. Humans are not reliable. We created the language, and we don't all universally agree on what any of it means. What would change if the AI created it? People can't even secure the systems that host AI. Maybe retitle it Towards AI Safety Engineering.

Jim Welsh

Esoteric Writing and insight to Awareness adaptation to Sustainability through the Human condition and its implications to Business and Holistic integration. Anthesis Designs.

2mo

The road to hell is paved with good intentions. AI is the Wild West: no regulations, just the hubris of egoic bad actors chasing first-mover advantage and money 💰 through the false promises of the Singularity myth… now needing trillions to free mankind! What a king-size delusion… another neoliberal Ponzi scheme to milk the public purse and steal money from unsuspecting investors. No regulations, no guard rails, and now the industry wants self-regulation? China 🇨🇳 is already way ahead in quantum computing: cheaper and far less energy-hungry than the 'proposed' Nvidia chips and GPU clusters, which will require massive energy. For what outcomes? For whom? The self-appointed 'gurus' who are trying to make a god from sand? The joke's on us if we believe this tripe… There are good and real uses of AI which deliver benefits, but monopolies are monopolies… the only metric is MONEY 💰 and its power of greed and control. Time to wake up, people.

D. D. Sharma

Explore AI (Healthcare, Safety, Risk). Board Advisor (UC Merced).

2mo

Stuart Russell Max Tegmark Srini Narayanan At first blush, the proposal maps the AI safety problem onto three sub-problems. This framework of a world model (simulation), safety specification, and verifier has been proposed in other safety-related domains (telecom, aviation, nuclear power, etc.) and has not scaled well. Even if we can scope it well, how will the framework address intentional malignant misuse, and the unknowns of the (N+1)st scenario that falls outside the safety and world-model speculations?

Sandeep Ozarde

Founder Director at Leaf Design; PhD Student at University of Hertfordshire

2mo

Stuart Russell Great paper indeed. I have a quick question from an academic standpoint: do you think that Human-Centered AI (HCAI) Design is crucial for achieving safety specifications, developing accurate world models, and implementing effective verifiers?

By involving users and stakeholders in the design process, AI systems can be tailored to meet their specific safety requirements. This includes considering potential risks, ethical considerations, and legal compliance. HCAI Design also focuses on understanding the "context" in which AI systems operate, incorporating human insights and expertise to create more accurate and comprehensive world models. The design approach ensures that the verifier is user-friendly and effective, taking into account the cognitive abilities and limitations of users and providing clear and intuitive interfaces.

For the AI system to be considered safe, a verifier (an automated tool or process) must check and confirm that the AI system can handle each of these potential behaviours within the specified safety requirements. The verifier must ensure that the AI system's responses and actions remain safe and compliant under all the different possible human behaviours outlined in the world model.
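To make the loop described above concrete, here is a minimal Python sketch of the decomposition: a world model that enumerates possible human behaviours, a safety specification expressed as a predicate, and a verifier that checks a policy against every behaviour the model produces. Every name here (Scenario, world_model, safety_spec, verifier, cautious_policy) is invented for illustration; the paper itself envisions formal, possibly probabilistic world models and proof-producing verification, not simple enumeration.

```python
# Toy sketch of the world-model / safety-spec / verifier decomposition.
# All names and numbers are hypothetical illustrations, not the paper's API.
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Scenario:
    """One human behaviour drawn from the world model."""
    description: str
    pedestrian_speed: float  # m/s


def world_model() -> Iterable[Scenario]:
    # In the paper's framing this would be a rich model of the environment;
    # here it is just a finite enumeration of cases.
    yield Scenario("pedestrian waits at kerb", 0.0)
    yield Scenario("pedestrian walks across", 1.4)
    yield Scenario("pedestrian sprints across", 6.0)


def safety_spec(scenario: Scenario, braking_distance: float) -> bool:
    # Safety specification as a predicate: the vehicle must stop short of
    # the pedestrian's path. Deliberately simplistic numbers.
    return braking_distance < 10.0 - scenario.pedestrian_speed


def verifier(policy: Callable[[Scenario], float]) -> bool:
    # The verifier checks the policy against *every* behaviour the world
    # model generates; its guarantee covers nothing outside that set.
    return all(safety_spec(s, policy(s)) for s in world_model())


if __name__ == "__main__":
    cautious_policy = lambda s: 2.0  # always leaves a 2 m braking distance
    print(verifier(cautious_policy))  # True under this toy model
```

The sketch also makes the earlier objections in this thread concrete: the verifier quantifies only over behaviours the world model happens to contain, so any "guarantee" is exactly as strong as the model's coverage.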

Matthew Newman MAICD

Frontier Tech | Governance | AI Safety | Tech Strategy | Change & Impact | Founder TechInnocens

2mo

Great paper. I still hold that the task of providing the world model to ground AI is a task for humanity, if performed properly. I also argued that the role of curating the data that forms this world model, with all the complexity of the historic, the current, and the factual, is probably one of the most important we could imagine. Here: https://www.linkedin.com/posts/transform_to-the-curators-go-the-spoils-the-rising-activity-7111816574231056384-84z6?utm_source=share&utm_medium=member_desktop

Gary Longsine

Fractional CTO. Collaborate • Deliver • Iterate. 📱

2mo

I mean, sure, we •should• and while we’re at it, we should tax fossil fuels too. 🧐🤔🍨

Sebastián Rimsky - Strategic Leadership and Management

Commercial & Corporation Law | Credit Portfolios | Data Analysis | Financial Markets | Negotiation | Public Management | Organizational Development | Trainer | Strategic Leadership & Management | Assistant Professor

4w

Dear all, I invite you to collaborate on my project about the application of artificial intelligence 🤖 in credit portfolio management 💼 in the public sector. Your participation is crucial to the success of this research 🌟. Below, I share a questionnaire 📋 that will be of great value to my project: https://lnkd.in/daEWfGQ4. I thank you in advance for your time and support 🙏, and I kindly ask you to share this questionnaire as widely as possible 🔄. Thank you very much for your cooperation 🙌. Best regards, Sebastián Rimsky
