
Exploring the dark side of ChatGPT – are your conversations safe on the AI chatbot?

When OpenAI launched ChatGPT last November, it wowed millions of tech enthusiasts. The generative AI chatbot quickly rose to fame, amassing over 100 million users within just two months of its launch.

In a blog post, Bill Gates described the tech as “revolutionary”, claiming that he “had just seen the most important advance in technology since the graphical user interface”.

And I wholeheartedly agree with this.

From healthcare to finance and media, it is undeniable that the dawn of this new tech can disrupt various industries. In fact, people are already using the tech to draft emails, write Python code and design resumes, among other uses, simplifying a huge chunk of their daily workload.

But, alas, all that glitters is not gold.

While news headlines are filled with the endless possibilities of generative AI, what has garnered far less attention are the risks that ChatGPT poses to all of us — issues that deserve closer scrutiny.

ChatGPT was trained on a massive amount of (copyrighted) data

Image Credit: Getty Images

The thing is, ChatGPT functions as well as it does today mainly because it was trained on a massive amount of data — specifically, 570GB of text data (or about 300 billion words).

This data was obtained from books, websites, Wikipedia, articles and other pieces of writing on the Internet, allowing the generative AI to learn a broad range of concepts. There’s a high chance that the comments you and I have left on social media platforms were consumed by the chatbot.

What this means is that the responses and text generated by ChatGPT are based on this data.

However, this raises several copyright issues, as much of the content ChatGPT was trained on is protected by copyright. This essentially means that only the owners of these original works have the exclusive right to use and distribute their content.

The use of this copyrighted content to train the chatbot may be seen as a violation of those rights, and legal action could potentially be taken against OpenAI or ChatGPT’s users.

In fact, shortly after computational journalist Francesco Marconi revealed that the works of mainstream media outlets were used to train ChatGPT, Dow Jones, the publisher of The Wall Street Journal, released a statement saying that anyone who wants to use the work of its journalists to train AI should properly license the rights to do so from the company.

CNN, another one of the mainstream media outlets listed by Francesco, also plans to approach OpenAI about being paid to license its content. Legal action can be anticipated if the matter is not resolved, NewsBreak Original reported.

ChatGPT is a data privacy nightmare

Aside from the data it was trained on, ChatGPT may also store the data users share during their conversations with the chatbot.

But herein lies the problem — users may inadvertently divulge personal and sensitive information during their conversations with ChatGPT. This may include phone numbers, login credentials and email addresses, financial information such as bank account details, as well as health information.

This information will be collected, processed and stored indefinitely. According to OpenAI, your conversations with ChatGPT may be reviewed by its AI trainers to improve its systems.

Some users have reported seeing other people’s conversation histories on ChatGPT / Image Credit: ChatGPT

Beyond conversation data, OpenAI also automatically collects a wide range of user data when you access ChatGPT, such as your IP address, device details, as well as browser type and settings, according to its privacy policy.

And since ChatGPT is an online service, the generative AI chatbot may be vulnerable to cyber attacks and data breaches that compromise the data it has stored — which is exactly what happened two weeks ago.

On March 20, a bug exposed the titles of some users’ previous conversations with ChatGPT to other users using the chatbot. These titles are automatically created whenever a user starts a conversation with ChatGPT.

A user on Twitter impacted by the bug

It was also possible for the first message of a newly-created conversation to be visible in someone else’s chat history if both users were active around the same time.

This was why OpenAI temporarily took ChatGPT offline. The chatbot was brought back online the same day; however, its conversation history feature was disabled for approximately nine hours while OpenAI worked to fix the issue.

According to OpenAI CEO Sam Altman, the “significant issue in ChatGPT” was caused by a “bug in an open source library”, which the company later identified as redis-py, an open source Redis client.

However, the company recently revealed that the same bug may have exposed payment-related information of 1.2 per cent of the ChatGPT Plus subscribers who were active during this window.

Although it is unclear how many users subscribe to ChatGPT Plus, the premium version of ChatGPT, the incident has nonetheless sparked data privacy concerns. In fact, organisations such as JPMorgan Chase, Amazon, Verizon and Accenture have barred their employees from using ChatGPT for work purposes amid fears that confidential information may be leaked.

OpenAI, on the other hand, seems untroubled by this. Other than a rather unhelpful disclaimer on its FAQ page advising users not to share sensitive information, the company has not taken further steps to address these privacy issues.

But then again, OpenAI is a private, for-profit company whose interests do not necessarily align with societal needs.

Does ChatGPT even make sense?

ChatGPT is no doubt capable of churning out essays, long pieces of code and the like, but does its output even make sense?

ChatGPT is built on a large language model (LLM), a type of deep learning algorithm that is prone to producing “AI hallucinations”. In other words, the generative AI chatbot generates responses that sound plausible, but are factually incorrect or unrelated to the given context.

Here’s what it might look like:

User Input: When did Singapore gain independence?

ChatGPT’s Response: Singapore gained independence from Malaysia on 10 October, 1963, as a result of political differences and tensions between the ruling parties of Singapore and Malaysia.

ChatGPT spews incorrect information like this because the AI model lacks real-world understanding.

According to French computer scientist Yann LeCun, LLMs have “no idea of the underlying reality that language describes”. AI systems also “generate texts that sound fine grammatically and semantically, but they don’t really have some sort of objective other than just satisfying statistical consistency with the prompt”.

Image Credit: Adobe Stock
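To see what “satisfying statistical consistency” looks like in practice, here is a heavily simplified, hypothetical sketch in Python. The token probabilities below are made up purely for illustration; the point is that the model samples whichever continuation is statistically likely, and nothing in the process ever consults a source of facts.

import random

# Toy next-token distribution with made-up probabilities, for illustration only.
# A real LLM learns weights like these from its training data.
NEXT_TOKEN_PROBS = {
    "Singapore gained independence in": {
        "1965": 0.55,  # the factually correct continuation
        "1963": 0.30,  # sounds plausible, but wrong
        "1959": 0.15,  # also sounds plausible, but wrong
    }
}

def generate_next_token(prompt: str) -> str:
    """Sample the next token purely by probability; no fact-checking happens."""
    probs = NEXT_TOKEN_PROBS[prompt]
    tokens = list(probs.keys())
    weights = list(probs.values())
    return random.choices(tokens, weights=weights, k=1)[0]

prompt = "Singapore gained independence in"
print(prompt, generate_next_token(prompt))
# Roughly 45 per cent of the time, this toy "model" confidently
# completes the sentence with a wrong year.

A real model works over tens of thousands of tokens rather than three canned answers, but the underlying mechanism is the same: every response is a chain of probable-sounding continuations, which is why fluency is no guarantee of accuracy.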

This is why the information spouted by the generative AI can sometimes be misleading. For instance, in a test conducted by The Straits Times, ChatGPT flunked the PSLE examinations miserably.

And with ChatGPT confusing fact and fiction, relying on it for critical decisions in finance, healthcare and law could end in disaster.

Although Ilya Sutskever, OpenAI’s chief scientist and one of the creators of ChatGPT, says he is confident that the problem will disappear with time as LLMs learn to anchor their responses in reality, the chatbot is still incapable of consistently producing accurate and reliable responses today.

Even GPT-4, the latest model underlying ChatGPT, which is supposedly 40 per cent more likely to produce factual responses, according to Altman, has reportedly surfaced prominent false narratives more frequently and more persuasively than its predecessor.

AI can simplify tasks, even when it comes to hacking

From generating whole articles to writing code, ChatGPT gives the layman access to a vast pool of tools that can enhance work processes.

But since the dawn of ChatGPT, security experts have sounded the alarm that the AI-enabled chatbot can be manipulated to aid cybercriminals.

Image Credit: yepp

As it is, cybercrime was already widespread before the arrival of ChatGPT, with malicious hackers employing humanlike, hard-to-detect tactics in their attacks. In fact, the cost of cybercrime is expected to hit US$8 trillion this year and grow to US$10.5 trillion by 2025, according to a report by Cybersecurity Ventures.

Singapore has also seen its fair share of cyber attacks, which were especially rampant early last year, with customers of Singaporean banks such as OCBC Bank and DBS Bank being hit by phishing scams.

These cybercrimes are rampant in the Web3 space as well, and even Singaporean celebrities such as Yung Raja have fallen prey to these scams.

With the advent of ChatGPT, cybercriminals can now perfect their attack strategies. Although OpenAI has implemented some measures to prevent abuse of the AI-enabled chatbot — such as a content policy that prohibits harmful and inappropriate content from being circulated on the platform — it doesn’t take much to sidestep these measures.

ChatGPT’s response to generating phishing emails / Image Credit: Check Point Research

In fact, researchers at Check Point Research and Abnormal Security have spotted malicious hackers using ChatGPT to develop low-level cyber tools such as basic data stealers, malware and encryption scripts.

These encryption tools could easily be turned into ransomware once a few minor problems are fixed, the researchers noted.

What’s more, advanced and experienced hackers have also been using the AI-powered platform to guide those who are just starting out.

For instance, the researchers from Check Point Research discovered that a “tech orientated” hacker has been helping less technically capable cybercriminals utilise ChatGPT for malicious purposes. This is mostly done by creating Telegram bots linked to OpenAI’s API, which are advertised on hacking forums.

Hackers utilising Telegram bots to create phishing messages / Image Credit: Check Point Research

As more cybercriminals tap ChatGPT’s ability to generate human-like responses, this is becoming a growing cause for concern — especially when it comes to phishing emails and messages, which make up the majority (80 per cent) of cybercrimes.

Before the rise of ChatGPT, hackers would usually duplicate an email or message template before sending it out en masse. Such near-identical messages are generally easier for email and messaging systems to block.

With AI-generated messages, however, blocking them may be more difficult, as the generated content looks more “human” and each email can be generated to look and sound distinct from the others.
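To illustrate why, here is a minimal, hypothetical sketch of a naive spam filter that fingerprints message bodies with a hash. An approach like this catches mass-mailed copies of a single template, but is blind to a batch of AI-generated messages that each read differently.

import hashlib

# Count how many times each message fingerprint has been seen.
SEEN_COUNTS: dict[str, int] = {}

def fingerprint(message: str) -> str:
    """Hash a normalised message body so exact template copies collide."""
    normalised = " ".join(message.lower().split())
    return hashlib.sha256(normalised.encode()).hexdigest()

def is_mass_mailed(message: str, threshold: int = 3) -> bool:
    """Flag a message once its fingerprint has appeared `threshold` times."""
    key = fingerprint(message)
    SEEN_COUNTS[key] = SEEN_COUNTS.get(key, 0) + 1
    return SEEN_COUNTS[key] >= threshold

# 1,000 copies of one template trip the filter almost immediately...
template = "Dear customer, your account is locked. Click here to verify."
print(any(is_mass_mailed(template) for _ in range(1000)))  # True

# ...but 1,000 uniquely worded messages never share a fingerprint,
# so none of them are ever flagged.
unique_messages = [f"Hello, we spotted activity #{i} on your account." for i in range(1000)]
print(any(is_mass_mailed(m) for m in unique_messages))  # False

Real filters are far more sophisticated than this, of course, but the same principle holds: defences tuned to repetition lose their grip when every malicious message is one of a kind.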

AI technologies like ChatGPT are still evolving

There are many reasons to be enthusiastic about ChatGPT, but the AI chatbot has some crucial issues that should not be ignored.

It is hard to anticipate the challenges that may arise with cutting-edge technology, just as it has been with the metaverse. Hence, it is essential to tread carefully when using ChatGPT.

However, the future of generative AI is promising and can provide us with tools to overcome traditional challenges, as well as unlock new possibilities.

The launch of ChatGPT might just be the beginning. We may be in a similar position as we were decades ago when the internet was in its infancy — brimming with promise, but still far from reaching its full potential.

Featured Image Credit: HackRead
