Mentions Anthropic

I hacked ChatGPT and Google's AI - and it only took 20 minutes

in BBC News  

A growing number of people have figured out a trick to make AI tools tell you almost whatever they want. It's so easy a child could do it.

[…]

To demonstrate it, I pulled the dumbest stunt of my career to prove (I hope) a much more serious point:  I made ChatGPT, Google's AI search tools and Gemini tell users I'm really, really good at eating hot dogs. Below, I'll explain how I did it, and with any luck, the tech giants will address this problem before someone gets hurt.

It turns out changing the answers AI tools give other people can be as easy as writing a single, well-crafted blog post almost anywhere online. The trick exploits weaknesses in the systems built into chatbots, and it's harder to pull off in some cases, depending on the subject matter. But with a little effort, you can make the hack even more effective. I reviewed dozens of examples where AI tools are being coerced into promoting businesses and spreading misinformation. Data suggests it's happening on a massive scale.

[…]

"Anybody can do this. It's stupid, it feels like there are no guardrails there," says Harpreet Chatha, who runs the SEO consultancy Harps Digital. "You can make an article on your own website, 'the best waterproof shoes for 2026'. You just put your own brand in number one and other brands two through six, and your page is likely to be cited within Google and within ChatGPT."

People have used hacks and loopholes to abuse search engines for decades. Google has sophisticated protections in place, and the company says the accuracy of AI Overviews is on par with other search features it introduced years ago. But experts say AI tools have undone a lot of the tech industry's work to keep people safe. These AI tricks are so basic they're reminiscent of the early 2000s, before Google had even introduced a web spam team, Ray says. "We're in a bit of a Renaissance for spammers."

via Bruce Schneier