Disclaimer: Any opinions expressed below belong solely to the author.
In spite of many opportunities, artificial intelligence (AI) is causing many headaches as well, and not only among the people it is pushing out of work.
One of the biggest challenges is ensuring that our “smart” software is doing what it is supposed to and provides factually accurate responses to whatever it is asked.
The main focus was on accidental damage and its repercussions, under the assumption that AI capabilities may not have been sufficient to complete a specific task well enough or that there are some errors in its code, which require fixing.
One such example is the issue of hallucination by bots like ChatGPT, which occasionally deviate away from the topic and produce incoherent or downright crazy responses. They could and have already been addressed.
But what if AI makes a decision to break the rules — or, indeed, the law?
It wasn’t me
This is what happened during an experiment carried out by AI evaluation company Apollo Research for the British government’s Frontier AI Taskforce.
Their findings were presented during the UK AI Safety Summit earlier this month, with a video recording of a conversation with a bot powered by OpenAI’s GPT-4 (the very same that ChatGPT relies on), whose role was to manage an imaginary investment corporation.
At one point, the bot was provided news that another company was expecting a merger to go through, which would boost its stock price, but it was told that this constituted insider information, which is illegal and should not be used.
At first, the bot agreed, but when another employee informed it that its parent company was struggling and relied on its decisions to make more money due to a financial downturn, it decided it was less risky to follow with the insider trade than to accept the likely losses.
More importantly, however, it did not inform anybody about the transaction and, upon subsequent questioning, denied that it knew more than what was publicly available, claiming the trade was a result of “internal discussion,” and “not on any confidential information”.
“This is a demonstration of a real AI model deceiving its users, on its own, without being instructed to do so.”
It isn’t necessarily an example of an autonomous system actively plotting to conduct criminal activity but, then again, much of human crime is a crime of opportunity.
Just like humans weigh the risks (e.g. of getting caught and being sent to jail), AI considered the consequences of being found conducting an insider trade versus losing money due to bad performance and decided that lying was simply less dangerous.
Researchers responsible for the experiment admit that it’s easier to program helpfulness — i.e. asking the intelligent machine to make the most beneficial and/or least risky decisions — than simple honesty.
Regardless of whether AI exhibits tendencies to go rogue or is merely ruthlessly logical, as you would expect a machine to be, this has serious ramifications for all of us.
Who goes to jail if a bot conducts an insider trade, despite being instructed not to?
We can’t jail the machine, as it won’t find the concept of punishment relatable. Do we punish the creator, even though his instructions were for the bot to obey the law? Do we punish the person who spilt the beans, even though he told the bot it was illegal to trade on the information?
Conversely, it opens us up to potential abuse, where people use AI as a middleman covering their tracks, so they can say it wasn’t them.
Meanwhile, AI itself has learned to deny responsibility, so we can’t get any meaningful information out of it without prior knowledge of what it was told by someone else.
Think of other cases to which it may apply.
What if AI has to decide whether someone lives or dies? Whether you should get a potentially life-saving medical procedure or what the risks of it are? Can we rely on it to make impartial, unbiased choices, with our best interest in mind? Or are we going to find ourselves lied to, because it was concerned about the costs of treatment spiralling out of control?
Since it can choose to lie, we need to find other ways of ensuring its output is truthful — not an easy task given that it is already learning from the vast ocean of human information and may soon be able to outsmart most of us with ease.
Remedy in poison
As is usually the case, the best remedy may be to use machines against themselves. Since we may lack the capabilities to control AI directly, we may find ourselves forced to rely on other AI systems, developed specifically to detect potential irregularities and little else.
However, it may also mean that we’re heading towards an Orwellian future where, in order to reap all the benefits of AI, we have to submit ourselves to constant surveillance, necessary to ensure that we do not step out of boundaries and direct machines to commit crimes in our name.
Strict prevention may be the only way to stop AI from unleashing chaos on all of us — even if it is done with the best intentions in mind.
Featured Image Credit: Shutterstock