Artificial Intelligence (AI)

Envisioning Information Access Systems: What Makes for Good Tools and a Healthy Web?

by Chirag Shah, Emily M. Bender

A comprehensive roundup of the LLM hype dumpster fire.

We observe a recent trend toward applying large language models (LLMs) in search and positioning them as effective information access systems. While the interfaces may look appealing and the apparent breadth of applicability is exciting, we are concerned that the field is rushing ahead with a technology without sufficient study of the uses it is meant to serve, how it would be used, and what its use would mean. We argue that it is important to reassert the central research focus of the field of information retrieval, because information access is not merely an application to be solved by the so-called ‘AI’ techniques du jour. Rather, it is a key human activity, with impacts on both individuals and society. As information scientists, we should be asking what do people and society want and need from information access systems and how do we design and build systems to meet those needs? With that goal, in this conceptual article we investigate fundamental questions concerning information access from user and societal viewpoints. We revisit foundational work related to information behavior, information seeking, information retrieval, information filtering, and information access to resurface what we know about these fundamental questions and what may be missing. We then provide our conceptual framing about how we could fill this gap, focusing on methods as well as experimental and evaluation frameworks. We consider the Web as an information ecosystem and explore the ways in which synthetic media, produced by LLMs and otherwise, endangers that ecosystem. The primary goal of this conceptual article is to shed light on what we still do not know about the potential impacts of LLM-based information access systems, how to advance our understanding of user behaviors, and where the next generations of students, scholars, and developers could fruitfully invest their energies.

Researchers say an AI-powered transcription tool used in hospitals invents things no one ever said

for The Associated Press  

Good grief.

SAN FRANCISCO (AP) — Tech behemoth OpenAI has touted its artificial intelligence-powered transcription tool Whisper as having near “human level robustness and accuracy.”

But Whisper has a major flaw: It is prone to making up chunks of text or even entire sentences, according to interviews with more than a dozen software engineers, developers and academic researchers. Those experts said some of the invented text — known in the industry as hallucinations — can include racial commentary, violent rhetoric and even imagined medical treatments.

Experts said that such fabrications are problematic because Whisper is being used in a slew of industries worldwide to translate and transcribe interviews, generate text in popular consumer technologies and create subtitles for videos.

More concerning, they said, is a rush by medical centers to utilize Whisper-based tools to transcribe patients’ consultations with doctors, despite OpenAI’s warnings that the tool should not be used in “high-risk domains.”

via Cory Doctorow

The TESCREAL bundle: Eugenics and the promise of utopia through artificial general intelligence

by Timnit Gebru, Émile P. Torres in First Monday

The stated goal of many organizations in the field of artificial intelligence (AI) is to develop artificial general intelligence (AGI), an imagined system with more intelligence than anything we have ever seen. Without seriously questioning whether such a system can and should be built, researchers are working to create “safe AGI” that is “beneficial for all of humanity.” We argue that, unlike systems with specific applications which can be evaluated following standard engineering principles, undefined systems like “AGI” cannot be appropriately tested for safety. Why, then, is building AGI often framed as an unquestioned goal in the field of AI? In this paper, we argue that the normative framework that motivates much of this goal is rooted in the Anglo-American eugenics tradition of the twentieth century. As a result, many of the very same discriminatory attitudes that animated eugenicists in the past (e.g., racism, xenophobia, classism, ableism, and sexism) remain widespread within the movement to build AGI, resulting in systems that harm marginalized groups and centralize power, while using the language of “safety” and “benefiting humanity” to evade accountability. We conclude by urging researchers to work on defined tasks for which we can develop safety protocols, rather than attempting to build a presumably all-knowing system such as AGI.

ChatGPT is bullshit

in Ethics and Information Technology  

Recently, there has been considerable interest in large language models: machine learning systems which produce human-like text and dialogue. Applications of these systems have been plagued by persistent inaccuracies in their output; these are often called “AI hallucinations”. We argue that these falsehoods, and the overall activity of large language models, are better understood as bullshit in the sense explored by Frankfurt (On Bullshit, Princeton, 2005): the models are in an important way indifferent to the truth of their outputs. We distinguish two ways in which the models can be said to be bullshitters, and argue that they clearly meet at least one of these definitions. We further argue that describing AI misrepresentations as bullshit is both a more useful and more accurate way of predicting and discussing the behaviour of these systems.

Bubble Trouble

by Edward Zitron 

Modern AI models are trained by feeding them "publicly-available" text from the internet, scraped from billions of websites (everything from Wikipedia to Tumblr, to Reddit), which the model then uses to discern patterns and, in turn, answer questions based on the probability of an answer being correct.

Theoretically, the more training data that these models receive, the more accurate their responses will be, or at least that's what the major AI companies would have you believe. Yet AI researcher Pablo Villalobos told the Journal that he believes that GPT-5 (OpenAI's next model) will require at least five times the training data of GPT-4. In layman's terms, these machines require tons of information to discern what the "right" answer to a prompt is, and "rightness" can only be derived from seeing lots of examples of what "right" looks like.

[…]

In essence, the AI boom requires more high-quality data than currently exists to progress past the point we're currently at, which is one where the outputs of generative AI are deeply unreliable. The amount of data it needs is several multitudes more than currently exists at a time when algorithms are happily promoting and encouraging AI-generated slop, and thousands of human journalists have lost their jobs, with others being forced to create generic search-engine-optimized slop. One (very) funny idea posed by the Journal's piece is that AI companies are creating their own "synthetic" data to train their models, a "computer-science version of inbreeding" that Jathan Sadowski calls Habsburg AI.
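To make the mechanism described in this excerpt concrete, here is a deliberately tiny sketch of my own (nothing like a production LLM, and not Zitron's or OpenAI's code): a bigram counter that "learns" only by tallying which words follow which in its training text, so its notion of the "right" continuation is simply the one it has seen most often. The intuition behind "more data, better answers" is just this counting, scaled up by many orders of magnitude.

```python
# A toy "language model": it learns nothing but word-following-word counts from
# its training text, then answers by picking the most frequent continuation.
from collections import Counter, defaultdict


def train_bigram_model(corpus: list[str]) -> dict[str, Counter]:
    """For each word, count which words follow it in the training documents."""
    follows: dict[str, Counter] = defaultdict(Counter)
    for document in corpus:
        words = document.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            follows[current_word][next_word] += 1
    return follows


def most_probable_next(model: dict[str, Counter], word: str) -> str | None:
    """Return the most frequently observed next word, or None if never seen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None  # with too little data, the model simply has no answer
    return candidates.most_common(1)[0][0]


if __name__ == "__main__":
    tiny_corpus = [
        "the cat sat on the mat",
        "the dog sat on the rug",
        "the cat chased the mouse",
    ]
    model = train_bigram_model(tiny_corpus)
    print(most_probable_next(model, "the"))    # "cat" -- seen twice, every rival only once
    print(most_probable_next(model, "sat"))    # "on"  -- the only continuation ever observed
    print(most_probable_next(model, "mouse"))  # None  -- no data, no answer
```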

via Zinnia Jones

The Future Prize Laureate's Speech (Die Rede der Zukunftspreisträgerin)

by Meredith Whittaker 

Acceptance speech upon receiving the 2024 Helmut Schmidt Future Prize:

Make no mistake – I am optimistic – but my optimism is an invitation to analysis and action, not a ticket to complacency.

With that in mind, I want to start with some definitions to make sure we’re all reading from the same score. Because so often, in this hype-based discourse, we are not. And too rarely do we make time for the fundamental questions – whose answers, we shall see, fundamentally shift our perspective. Questions like, what is AI? Where did it come from? And why is it everywhere, guaranteeing promises of omniscience, automated consciousness, and what can only be described as magic?

Well, first answer first: AI is a marketing term, not a technical term of art. The term “artificial intelligence” was coined in 1956 by cognitive and computer scientist John McCarthy – about a decade after the first proto-neural network architectures were created. In subsequent interviews McCarthy is very clear about why he invented the term. First, he didn’t want to include the mathematician and philosopher Norbert Wiener in a workshop he was hosting that summer. You see, Wiener had already coined the term “cybernetics,” under whose umbrella the field was then organized. McCarthy wanted to create his own field, not to contribute to Norbert’s – which is how you become the “father” instead of a dutiful disciple. This is a familiar dynamic for those of us familiar with “name and claim” academic politics. Secondly, McCarthy wanted grant money. And he thought the phrase “artificial intelligence” was catchy enough to attract such funding from the US government, which at the time was pouring significant resources into technical research in service of post-WWII Cold War dominance.

Now, in the course of the term’s over 70 year history, “artificial intelligence” has been applied to a vast and heterogeneous array of technologies that bear little resemblance to each other. Today, and throughout, it connotes more aspiration and marketing than coherent technical approach. And its use has gone in and out of fashion, in time with funding prerogatives and the hype-to-disappointment cycle.

So why, then, is AI everywhere now? Or, why did it crop up in the last decade as the big new thing?

To answer that question, we have to face the toxic surveillance business model – and the big tech monopolies that built their empires on top of this model.

via Meredith Whittaker

NDSS 2024 Keynote - AI, Encryption, and the Sins of the 90s, Meredith Whittaker

by Meredith Whittaker 

This keynote will look at the connections between where we are now and how we got here. Connecting the “Crypto Wars”, the role of encryption and privacy, and ultimately the hype of AI… all through the lens of Signal.

Full text of Meredith's talk: https://signal.org/blog/pdfs/ndss-key...

Meet AdVon, the AI-Powered Content Monster Infecting the Media Industry

in Futurism  

A few years back, a writer in a developing country started doing contract work for a company called AdVon Commerce, getting a few pennies per word to write online product reviews.

But the writer — who like other AdVon sources interviewed for this story spoke on condition of anonymity — recalls that the gig's responsibilities soon shifted. Instead of writing, they were now tasked with polishing drafts generated using an AI system the company was developing, internally dubbed MEL.

"They started using AI for content generation," the former AdVon worker told us, "and paid even less than what they were paying before."

The former writer was asked to leave detailed notes on MEL's work — feedback they believe was used to fine-tune the AI which would eventually replace their role entirely.

The situation continued until MEL "got trained enough to write on its own," they said. "Soon after, we were released from our positions as writers."

via Cory Doctorow

TechScape: How cheap, outsourced labour in Africa is shaping AI English

in The Guardian  

In late March, AI influencer Jeremy Nguyen, at the Swinburne University of Technology in Melbourne, highlighted one telltale sign of AI writing: ChatGPT’s tendency to use the word “delve” in responses. No individual use of the word can be definitive proof of AI involvement, but at scale it’s a different story. When half a percent of all articles on research site PubMed contain the word “delve” – 10 to 100 times more than did a few years ago – it’s hard to conclude anything other than that an awful lot of medical researchers are using the technology to, at best, augment their writing.

[…] 

Hundreds of thousands of hours of work go into providing enough feedback to turn an LLM into a useful chatbot, and that means the large AI companies outsource the work to parts of the global south, where anglophone knowledge workers are cheap to hire.

[…] 

I said “delve” was overused by ChatGPT compared to the internet at large. But there’s one part of the internet where “delve” is a much more common word: the African web. In Nigeria, “delve” is much more frequently used in business English than it is in England or the US. So the workers training their systems provided examples of input and output that used the same language, eventually ending up with an AI system that writes slightly like an African.

And that’s the final indignity. If AI-ese sounds like African English, then African English sounds like AI-ese. Calling people a “bot” is already a schoolyard insult (ask your kids; it’s a Fortnite thing); how much worse will it get when a significant chunk of humanity sounds like the AI systems they were paid to train?
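A minimal sketch of the kind of check described above (my own illustration with made-up inputs, not Nguyen's or the Guardian's actual methodology): given a pile of documents tagged with a year, compute what share of each year's documents contain a marker word such as "delve". It is the jump in that share over time, not any single occurrence, that carries the signal.

```python
# Fraction of documents per year containing a marker word ("delve" by default).
import re
from collections import defaultdict


def share_containing(documents: list[tuple[int, str]], marker: str = "delve") -> dict[int, float]:
    """Map each year to the fraction of that year's documents containing the marker."""
    # \w* lets the pattern also catch inflections such as "delves" and "delved".
    pattern = re.compile(rf"\b{re.escape(marker)}\w*\b", re.IGNORECASE)
    totals: dict[int, int] = defaultdict(int)
    hits: dict[int, int] = defaultdict(int)
    for year, text in documents:
        totals[year] += 1
        if pattern.search(text):
            hits[year] += 1
    return {year: hits[year] / totals[year] for year in sorted(totals)}


if __name__ == "__main__":
    # Hypothetical abstracts, purely for illustration.
    sample = [
        (2019, "We examine outcomes in a small cohort of adult patients."),
        (2019, "This study reports results from a randomised trial."),
        (2024, "In this paper we delve into the multifaceted landscape of the condition."),
        (2024, "We report five-year follow-up results from the same cohort."),
    ]
    print(share_containing(sample))  # {2019: 0.0, 2024: 0.5}
```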

via Richard Stallman

When ChatGPT founder had ‘no idea’ how to monetise product

in Mint  

These people have no idea how computers work, how brains work, or how to define intelligence. They just believe that if they get enough transistors together, feed it enough data and the electricity requirements of a large industrialised nation, they will eventually create God. It's the ultimate cargo cult. They're drunk on their own snake oil. And they're among the wealthiest and most powerful people in the world, instead of being institutionalised for their own safety. It's so funny/scary.

The video shows Sam Altman in conversation with Connie Loizos. When Loizos asked Altman whether he was planning to monetise his product, Altman replied: “The honest answer is, we have no idea.”

Sam Altman further said that they had no plans to make any revenue. "We never made any revenue. We have no current plans to make any revenue. We have no idea how we may one day generate revenue," he said.

Speaking about the investors, Sam Altman said, “We have made soft promises to investors that once we build this sort of generally intelligent system, basically we will ask it to figure out a way to generate an investment return for you."

As the audience laughed, Sam Altman said, “You can laugh. It's all right. But, it is what I actually believe is going to happen.”