The Tony Blair Institute for Global Change, a non-profit organisation set up by the ex-UK prime minister, has released a paper (PDF) predicting that AI automation in public sector jobs could save a fifth of workers' time, along with a huge reduction in workforce and governmental costs. The findings of the paper were presented by Tony Blair himself at the opening of the 2024 Future of Britain Conference.
Just one small issue: The prediction was made by ChatGPT. And as experts 404 Media interviewed about this weird ouroboros of a report have noted, AI is maybe not the most reliable source for information about how reliable, useful, or beneficial AI might be.
The Tony Blair Institute researchers gathered data from O*NET based on occupation-specific descriptors covering nearly 1,000 US occupations, with the aim of assessing which of these tasks could be performed by AI. However, consulting human experts to determine which roles could be suitable for AI automation was deemed too difficult a problem to solve, so they funnelled the data into ChatGPT to make a prediction instead.
Trouble is, as the researchers noted themselves, LLMs “may or may not give reliable results.” The solution? Ask it again, but differently.
“We first use GPT-4 to categorise each of the 19,281 tasks in the O*NET database in several different respects that we consider to be important determinants of whether the task can be performed by AI or not. These were chosen following an initial analysis of GPT-4’s unguided assessment of the automatability of some sample tasks, in which it struggled with some assessments”
“This categorisation enables us to generate a prompt to GPT-4 that contains an initial assessment as to whether it is likely that the task can or cannot be performed by AI.”
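For the curious, the two-stage approach described in those quotes can be sketched in a few lines of Python. This is purely illustrative: the function names, the example category labels, and the heuristic stub standing in for the actual GPT-4 API call are all assumptions, not the institute's real pipeline.

```python
def classify(prompt: str) -> str:
    """Stand-in for a GPT-4 call; the real pipeline would query the API here.
    The 'routine' vs 'judgement-heavy' labels are hypothetical examples."""
    return "routine" if "data entry" in prompt.lower() else "judgement-heavy"

def build_assessment_prompt(task: str) -> str:
    # Stage 1: categorise the task along a dimension thought to matter
    # for automatability (here, a made-up routine/judgement split).
    category = classify(f"Categorise this task: {task}")
    # Stage 2: fold that category into a second prompt as an "initial
    # assessment", which the model is then asked to confirm or reject.
    initial = "can likely" if category == "routine" else "can likely not"
    return (
        f"Task: {task}\n"
        f"Initial assessment: this task {initial} be performed by AI.\n"
        "Give a final yes/no judgement with reasoning."
    )

print(build_assessment_prompt("Perform data entry for benefit claims"))
```

In other words, the model's first answer is baked into the question it is asked the second time around, which is precisely the circularity the critics below object to.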
So that’d be AI, deciding which jobs can be improved by AI, and then concluding that AI would be beneficial. Followed by an international figure extolling the virtues of that conclusion to the rest of the world.
Unsurprisingly, those looking into the details of the report are questioning the veracity of its results. As Emily Bender—a University of Washington professor in the Computational Linguistics Laboratory, interviewed by 404 Media—puts it:
“This is absurd—they might as well be shaking a Magic 8 ball and writing down the answers it displays.”
“They suggest that prompting GPT-4 in two different ways will somehow make the results reliable. It doesn’t matter how you mix and remix synthetic text extruded from one of these machines—no amount of remixing will turn it into a sound empirical basis.”
The findings were reported by several news outlets without mentioning ChatGPT's involvement in the paper's predictions. It's unknown whether Big Tony knew that the information he was presenting was based on less-than-reliable methods, or indeed whether he had read the paper in detail himself.
While the researchers here at least documented their flawed methodology, it does make you wonder how much seemingly accurate information is being created from AI predictions and then presented as verifiable fact.
Nor, for that matter, how much content is created by AI with just enough believability to pass without serious investigation. To prove that this article isn't an example of such content, here's a spelling mistike. You're welcome.