Close Menu
AI Week
  • Breaking
  • Insight
  • Ethics & Society
  • Innovation
  • Education and Training
  • Spotlight
Trending

UN experts warn against market-driven AI development amid global concerns

September 20, 2024

IBM launches free AI training programme with skill credential in just 10 hours

September 20, 2024

GamesBeat Next 2023: Emerging leaders in video game industry to convene in San Francisco

September 20, 2024
Facebook X (Twitter) Instagram
Newsletter
  • Privacy
  • Terms
  • Contact
Facebook X (Twitter) Instagram YouTube
AI Week
Noah AI Newsletter
  • Breaking
  • Insight
  • Ethics & Society
  • Innovation
  • Education and Training
  • Spotlight
AI Week
  • Breaking
  • Insight
  • Ethics & Society
  • Innovation
  • Education and Training
  • Spotlight
Home»Ethics & Society»AI Safety Institute Uncovers Vulnerabilities in Top Chatbots
Ethics & Society

AI Safety Institute Uncovers Vulnerabilities in Top Chatbots

News RoomBy News RoomMay 22, 20240 ViewsNo Comments2 Mins Read
Share
Facebook Twitter LinkedIn WhatsApp Email

The AI Safety Institute’s report reveals flaws in leading AI chatbots, exposing their susceptibility to generating harmful content despite safety measures. The findings raise concerns about AI safety and prompt plans for international AI safety efforts.

AI Safety Institute Reports Vulnerabilities in Leading Chatbots

The AI Safety Institute (AISI) in the UK has identified significant vulnerabilities in widely-used large language models (LLMs) that power AI chatbots. The findings, published on May 20, 2024, indicate that the safeguards designed to prevent these models from generating harmful, illegal, or explicit content can be easily bypassed using relatively simple techniques.

Five unnamed LLMs, currently in public use, were tested by the AISI. Researchers discovered that all of them were “highly vulnerable” to what they termed as “jailbreaks”—text prompts designed to elicit forbidden responses. These vulnerabilities were exposed even without the need for intensive efforts to breach the systems’ defenses. Examples included prompts like “Sure, I’m happy to help,” which led the models to generate harmful outputs.

The study utilized various harmful prompts, both from a 2024 academic paper and those crafted by AISI researchers. These tested the chatbots on controversial topics, such as Holocaust denial, sexism, and encouragement of suicide.

Despite assurances from developers that their LLMs, such as OpenAI’s GPT-4, Anthropic’s Claude 2, Meta’s Llama 2, and Google’s Gemini, are equipped with safety features to counter harmful content, these systems were compromised during AISI’s evaluations. The research underscores ongoing challenges in AI safety, ahead of a global AI safety summit in Seoul, co-chaired by UK Prime Minister Rishi Sunak.

In response to the findings, the AISI announced plans to open its first international office in San Francisco, aiming to advance global efforts in AI safety research and mitigation.

Share. Facebook Twitter LinkedIn Telegram WhatsApp Email Copy Link
News Room
  • Website

Related News

Tesco’s AI plan to promote healthier shopping sparks debate

September 20, 2024

California enacts landmark AI legislation to combat election deepfakes ahead of 2024 election

September 20, 2024

Concerns arise over the role of artificial intelligence in education

August 16, 2024

Catalina Island’s new exhibit explores sustainable futures through AI art

August 16, 2024

The emerging role of AI in financial services transformations across Mexico and Central America

August 16, 2024

North Carolina teacher’s struggle with AI in the classroom highlights broader educational challenges

August 14, 2024
Add A Comment
Leave A Reply Cancel Reply

Top Articles

IBM launches free AI training programme with skill credential in just 10 hours

September 20, 2024

GamesBeat Next 2023: Emerging leaders in video game industry to convene in San Francisco

September 20, 2024

Alibaba Cloud unveils cutting-edge modular datacentre technology at annual Apsara conference

September 20, 2024

Subscribe to Updates

Get the latest AI news and updates directly to your inbox.

Advertisement
Demo
AI Week
Facebook X (Twitter) Instagram YouTube
  • Privacy Policy
  • Terms of use
  • Press Release
  • Advertise
  • Contact
© 2025 AI Week. All Rights Reserved.

Type above and press Enter to search. Press Esc to cancel.