The Wikimedia Foundation, stewards of the finest projects on the web, have written about the hammering their servers are taking from the scraping bots that feed large language models.
[…]
When we talk about the unfair practices and harm done by training large language models, we usually talk about it in the past tense: how they were trained on other people's creative work without permission. But this is an ongoing problem that's just getting worse.
The worst of the internet is continuously attacking the best of the internet. This is a distributed denial of service attack on the good parts of the World Wide Web.
If you're using the products powered by these attacks, you're part of the problem. Don't pretend it's cute to ask ChatGPT for something. Don't pretend it's somehow being technologically open-minded to continuously search for nails to hit with the latest "AI" hammers.
If you're going to use generative tools powered by large language models, don't pretend you don't know how your sausage is made.
Artificial Intelligence (AI)
Denial
FOSS infrastructure is under attack by AI companies
in LibreNews
Three days ago, Drew DeVault - founder and CEO of SourceHut - published a blog post called "Please stop externalizing your costs directly into my face", where he complained that LLM companies were crawling data without respecting robots.txt and causing severe outages to SourceHut.
[…]
Then, yesterday morning, KDE GitLab infrastructure was overwhelmed by another AI crawler, with IPs from an Alibaba range; this caused GitLab to be temporarily inaccessible to KDE developers.
[…]
By now, it should be pretty clear that this is no coincidence. AI scrapers are getting more and more aggressive, and - since FOSS software relies on public collaboration, whereas private companies don't have that requirement - this is putting some extra burden on Open Source communities.
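To be clear about the mechanics: "respecting robots.txt" just means checking a short text file before fetching anything. Here's a minimal sketch in Python – the domain and paths are illustrative, and GPTBot and CCBot are simply two of the documented AI crawler user agents:

```python
# A minimal sketch of the polite behaviour these posts describe crawlers
# skipping: parse a site's robots.txt and honour it before fetching pages.
# The domain and page path here are illustrative, not real targets.
from urllib import robotparser

robots = robotparser.RobotFileParser()
robots.set_url("https://example.org/robots.txt")
robots.read()  # fetch and parse the site's robots.txt

for agent in ("GPTBot", "CCBot", "*"):
    page = "https://example.org/git/some-repo/log"
    verdict = "allowed" if robots.can_fetch(agent, page) else "disallowed"
    print(f"{agent}: {verdict}")
```

The crawlers described in both posts skip that check entirely, which is how SourceHut and KDE's GitLab end up with outages.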
Samsung caught faking zoom photos of the Moon
in The Verge
For years, Samsung "Space Zoom"-capable phones have been known for their ability to take incredibly detailed photos of the Moon. But a recent Reddit post showed in stark terms just how much computational processing the company is doing, and – given the evidence supplied – it feels like we should go ahead and say it: Samsung's pictures of the Moon are fake.
[…]
The test of Samsung's phones conducted by Reddit user u/ibreakphotos was ingenious in its simplicity. They created an intentionally blurry photo of the Moon, displayed it on a computer screen, and then photographed this image using a Samsung S23 Ultra. As you can see below, the first image on the screen showed no detail at all, but the resulting picture showed a crisp and clear "photograph" of the Moon. The S23 Ultra added details that simply weren't present before. There was no upscaling of blurry pixels and no retrieval of seemingly lost data. There was just a new Moon – a fake one.
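The test is trivially easy to reproduce. A rough sketch of preparing the deliberately blurry test image with Python and Pillow – the file names and the exact downscale and blur values are my guesses, not necessarily what u/ibreakphotos used:

```python
# Rough sketch of making the deliberately blurry Moon image for the screen
# test; file names and blur parameters are illustrative guesses, not the
# exact values from the original Reddit post.
from PIL import Image, ImageFilter

moon = Image.open("moon_original.jpg")                  # any detailed Moon photo
small = moon.resize((170, 170))                         # throw away the real detail
blurred = small.filter(ImageFilter.GaussianBlur(radius=4))
blurred.save("moon_blurred.png")                        # display full-screen, then photograph it
```

Any crater detail in the resulting phone photo has to come from somewhere other than the screen, because the screen isn't showing any.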
"Hopelessly and Inseperably Entangled with Drupal" A Candid Conversation with Karoly Negyesi aka Chx
in The Drop Times
Standing ovation for chx:
Karoly Negyesi: Well, even framing this as "AI" is misleading. The entire field is essentially based on a short paper written by John von Neumann in the 1950s. In that paper, he declared – without a single shred of proof, and yet people readily believed it – that the human brain is obviously digital. People have believed this so strongly that even today, neuroscientists struggle to describe how the brain works without using digital metaphors. But the truth is, the human brain does not work like a computer.
So, calling these statistical pattern-matching systems "artificial intelligence" is just misleading. 'Retrieve a memory' – your brain doesn't retrieve a memory. It's not a computer. It never was. Everybody knows this. You never retrieve a memory the way a computer does. You do not store your memories as a computer does. That whole concept is just not true.
There was a brilliant book about this a couple of years back that described how, in different eras, people compared the brain to whatever technology was available to them. Descartes compared it to a machine. Von Neumann compared it to a digital computer. None of that is true. Of course, we still don't quite know how the brain actually works. So then we pursue something called artificial intelligence, and by that, we mean something that matches this completely misplaced and untrue metaphor of the brain.
The whole premise of artificial intelligence is broken. It's just not true. You are building a castle on quicksand. There's nothing there. And beyond this, there's just so much wrong with it. Almost blindly trusting whatever a large language model spits back at you – because, once again, I don't think people fully understand or even partially understand what they are getting.
So, no, I don't think AI is progressing in the way people think it is. I mean, obviously, there's some progress, but it is not going where people think it can go. It's never going to match a human brain – at least not this way. And quite likely, not within our lifetimes. Probably not even within a few centuries. We will not have a machine that is capable of doing what the human brain is capable of. Mostly because we still have no clue how the brain actually works.
Why Can't ChatGPT Draw a Full Glass of Wine?
for YouTube
ChatGPT can't draw a glass of wine full to the brim. Why? And what might it have to do with David Hume and the missing shade of blue?
Power Cut
Microsoft has, through a combination of canceled leases, pullbacks on Statements of Qualifications, cancellations of land parcels and deliberate expiration of Letters of Intent, effectively abandoned data center expansion equivalent to over 14% of its current capacity.
[…]
The reason I'm writing in such blunt-force terms is that I want to make it clear that Microsoft is effectively cutting its data center expansion by over a gigawatt of capacity, if not more, and it's impossible to reconcile these cuts with the expectation that generative AI will be a massive, transformative technological phenomenon.
I believe the reason Microsoft is cutting back is that it does not have the appetite to provide further data center expansion for OpenAI, and it's having doubts about the future of generative AI as a whole. If Microsoft believed there was a massive opportunity in supporting OpenAI's further growth, or that it had "massive demand" for generative AI services, there would be no reason to cancel capacity, let alone cancel such a significant amount.
[…]
Microsoft is cancelling plans to massively expand its data center capacity right at a time when OpenAI just released its most computationally-demanding model ever. How do you reconcile those two things without concluding either that Microsoft expects GPT-4.5 to be a flop, or that it's simply unwilling to continue bankrolling OpenAI's continued growth, or that it's having doubts about the future of generative AI as a whole?
[…]
Generative AI does not have meaningful mass-market use cases, and while ChatGPT may have 400 million weekly active users, as I described last week, there doesn't appear to be meaningful consumer adoption outside of ChatGPT, mostly because almost all AI coverage inevitably ends up marketing one company: OpenAI. Argue with me all you want about your personal experiences with ChatGPT, or how you've found it personally useful. That doesn't make it a product with mass-market utility, or enterprise utility, or worth the vast sums of money being ploughed into generative AI.
AI Personality Extraction from Faces: Labor Market Implications
The stupid use cases for AI just keep coming:
Human capital – encompassing cognitive skills and personality traits – is critical for labor market success, yet the personality component remains difficult to measure at scale. Leveraging advances in artificial intelligence and comprehensive LinkedIn microdata, we extract the Big 5 personality traits from facial images of 96,000 MBA graduates, and demonstrate that this novel "Photo Big 5" predicts school rank, compensation, job seniority, industry choice, job transitions, and career advancement. Using administrative records from top-tier MBA programs, we find that the Photo Big 5 exhibits only modest correlations with cognitive measures like GPA and standardized test scores, yet offers comparable incremental predictive power for labor outcomes. Unlike traditional survey-based personality measures, the Photo Big 5 is readily accessible and potentially less susceptible to manipulation, making it suitable for wide adoption in academic research and hiring processes. However, its use in labor market screening raises ethical concerns regarding statistical discrimination and individual autonomy.
Knowing less about AI makes people more open to having it in their lives – new research
in The Conversation
From the authors of "People Who Don't Understand Magic Trick More Likely To Be Impressed By It".
People with less knowledge about AI are actually more open to using the technology. We call this difference in adoption propensity the "lower literacy–higher receptivity" link.
This link shows up across different groups, settings and even countries. For instance, our analysis of data from market research company Ipsos spanning 27 countries reveals that people in nations with lower average AI literacy are more receptive towards AI adoption than those in nations with higher literacy.
Similarly, our survey of US undergraduate students finds that those with less understanding of AI are more likely to indicate using it for tasks like academic assignments.
Envisioning Information Access Systems: What Makes for Good Tools and a Healthy Web?
A comprehensive roundup of the LLM hype dumpster fire.
We observe a recent trend toward applying large language models (LLMs) in search and positioning them as effective information access systems. While the interfaces may look appealing and the apparent breadth of applicability is exciting, we are concerned that the field is rushing ahead with a technology without sufficient study of the uses it is meant to serve, how it would be used, and what its use would mean. We argue that it is important to reassert the central research focus of the field of information retrieval, because information access is not merely an application to be solved by the so-called "AI" techniques du jour. Rather, it is a key human activity, with impacts on both individuals and society. As information scientists, we should be asking what do people and society want and need from information access systems and how do we design and build systems to meet those needs? With that goal, in this conceptual article we investigate fundamental questions concerning information access from user and societal viewpoints. We revisit foundational work related to information behavior, information seeking, information retrieval, information filtering, and information access to resurface what we know about these fundamental questions and what may be missing. We then provide our conceptual framing about how we could fill this gap, focusing on methods as well as experimental and evaluation frameworks. We consider the Web as an information ecosystem and explore the ways in which synthetic media, produced by LLMs and otherwise, endangers that ecosystem. The primary goal of this conceptual article is to shed light on what we still do not know about the potential impacts of LLM-based information access systems, how to advance our understanding of user behaviors, and where the next generations of students, scholars, and developers could fruitfully invest their energies.
Researchers say an AI-powered transcription tool used in hospitals invents things no one ever said
for The Associated Press
Good grief.
SAN FRANCISCO (AP) – Tech behemoth OpenAI has touted its artificial intelligence-powered transcription tool Whisper as having near "human level robustness and accuracy."
But Whisper has a major flaw: It is prone to making up chunks of text or even entire sentences, according to interviews with more than a dozen software engineers, developers and academic researchers. Those experts said some of the invented text – known in the industry as hallucinations – can include racial commentary, violent rhetoric and even imagined medical treatments.
Experts said that such fabrications are problematic because Whisper is being used in a slew of industries worldwide to translate and transcribe interviews, generate text in popular consumer technologies and create subtitles for videos.
More concerning, they said, is a rush by medical centers to utilize Whisper-based tools to transcribe patients' consultations with doctors, despite OpenAI's warnings that the tool should not be used in "high-risk domains."
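For a sense of how little friction there is in wiring this into a pipeline, here's a minimal sketch using the open-source openai-whisper package – the audio file name is illustrative, and nothing in the output marks which passages were actually heard and which were invented:

```python
# Minimal sketch using the open-source openai-whisper package
# (pip install openai-whisper); the audio file name is illustrative.
# The transcript comes back as plain text, with no indication of which
# passages, if any, were hallucinated rather than transcribed.
import whisper

model = whisper.load_model("base")             # other checkpoints: tiny, small, medium, large
result = model.transcribe("consultation.wav")
print(result["text"])
```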