Disclaimer: Any opinions expressed below belong solely to the author.
Many people view generative AI models with suspicion, particularly those dealing with visual data, which are able to learn from millions of images under unclear interpretations of copyright law.
However, the very same solutions can be used not only to generate images or video from other visual inputs, but also to translate them across seemingly incompatible sources, such as MRI scans of brain activity.
This is what researchers at the National University of Singapore (NUS), in collaboration with colleagues from the Chinese University of Hong Kong, have been working on for many months now, refining successive iterations of their own AI models.
Using Stable Diffusion, one of the most popular open-source image-generating AI models, as a basis, they have developed MinD-Vis, a custom model that learns to translate brain activity into images. It is fed thousands of images presented to a study participant, paired with the brain activity tracked and recorded during a functional MRI (fMRI) scan as they viewed them.
In the process, it creates a personalised profile for each person, enabling it to quite literally see what they are seeing based on brain activity alone.
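To make the idea concrete, here is a minimal, purely illustrative sketch in PyTorch of the kind of "translator" described above: a small network that maps an fMRI recording into the 77-by-768 conditioning tensor that Stable Diffusion normally receives from its CLIP text encoder. The voxel count, the MLP architecture, and the `FMRIEncoder` name are assumptions made for this example; the actual MinD-Vis model is considerably more sophisticated.

```python
import torch
import torch.nn as nn

# Illustrative sketch only -- not the authors' MinD-Vis architecture.
# It shows the general idea: map an fMRI recording (a flattened vector of
# voxel activations) into the 77x768 conditioning space that Stable
# Diffusion normally receives from its CLIP text encoder.
# NUM_VOXELS and the layer sizes below are assumptions for this example.

NUM_VOXELS = 4500            # assumed size of the flattened fMRI signal
TOKENS, EMBED_DIM = 77, 768  # shape of Stable Diffusion's text conditioning


class FMRIEncoder(nn.Module):
    """Maps one fMRI scan to a pseudo 'prompt embedding' (hypothetical)."""

    def __init__(self, num_voxels: int = NUM_VOXELS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_voxels, 2048),
            nn.GELU(),
            nn.Linear(2048, TOKENS * EMBED_DIM),
        )

    def forward(self, fmri: torch.Tensor) -> torch.Tensor:
        # fmri: (batch, NUM_VOXELS) -> (batch, 77, 768)
        return self.net(fmri).view(-1, TOKENS, EMBED_DIM)


encoder = FMRIEncoder()
fake_scan = torch.randn(1, NUM_VOXELS)  # stand-in for a real recording
conditioning = encoder(fake_scan)       # shape: (1, 77, 768)
print(conditioning.shape)

# During training, `conditioning` would be pushed towards the embedding of
# the image the participant was actually looking at; at inference time it
# could be handed to a latent diffusion model (e.g. via the `prompt_embeds`
# argument of diffusers' StableDiffusionPipeline) to reconstruct the image.
```

The personalised profile mentioned above corresponds, in this simplified picture, to the weights of such an encoder trained on one participant's scan data.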
In an extension of the project, MinD-Video, published in May, the team has also demonstrated the ability to use a similar approach to generate videos of what the person is watching at the moment, with up to 85 per cent accuracy.
“So next time you come in, you will do the scan and in the scan, you will see the visual stimuli like this. And then we’ll record your brain activities at the same time.
Your brain activities will go into our AI translator and this translator will translate your brain activities into a special language that a Stable Diffusion can understand, and then it will generate the images you are seeing at that point. So that’s basically how we can read your mind in this sense.”
– Jiaxin Qing, Department of Information Engineering, The Chinese University of Hong Kong, speaking to CNBC
Caveats?
Of course, we’re still far from being able to read other people’s minds (not to mention entire groups or societies). You need to lie down in an MRI machine and have your own AI profile trained on the basis of your brain readings first.
We also have to bear in mind that the output images are, at the moment, just approximations, which may differ in details, colours, the placement of some objects, and so on.
However, it is a robust proof of concept, showing how much information can be extracted from nothing more than observing our brain activity, and the AI models are only bound to get more accurate with time.
Most importantly, however, the first people to really benefit from the technology are likely those who need it the most: bedridden, quadriplegic patients left with extremely limited means of communicating with the surrounding world.
Absolute accuracy is not necessary to dramatically improve their lives.
Translating brain activity into communication, whether verbal or visual, has the potential to improve their well-being overnight, opening up possibilities that have so far remained in the realm of futuristic visions.
Going forward, it's also clear that the translational functions of these AI models do not have to be limited to communicating with other humans; they could also send signals to prostheses and medical or mobility equipment, allowing people thus far confined to their beds to finally regain some degree of motion and the ability to interact with the environment around them.
At NUS, AI is learning to read our minds not to control us, but to understand us better, propelling us towards technological capabilities envisioned in RoboCop rather than 1984. Fortunately.
Featured Image Credit: NBC News