It’s nearing the end of 2016, and with it comes a time of reflection on our year – the ups, the downs, and whether or not we accomplished the New Year’s Resolutions we imposed on ourselves at the start of the year.
If, like me, you’re severely lacking in discipline, the end-of-year reflection only calls for grovelling in self-loathing, but that’s not the case if you’re a successful, driven human being like Mark Zuckerberg.
In January, he declared that his challenge for 2016 was to “build a simple AI to run [his] home and help [him] with [his] work […] like Jarvis in Iron Man”.
Even if you’re not a Marvel fanboy or fangirl, watching any Marvel Cinematic Universe movie featuring Iron Man would have introduced you to Jarvis, the AI assistant that Tony Stark (aka Iron Man) uses to make his daily life as an entrepreneur and ass-kicking superhero much easier.
Jarvis (short for “Just A Rather Very Intelligent System”) acts as both assistant and butler to Stark, fetching him items, managing his mansion, controlling his robotic appliances and armours, and even helping him with research.
In fact, the characterisation of Jarvis blurs the line between AI and human, because it is able to interpret and interact like an actual living person.
And it looks like Zuckerberg wants to make that a reality.
Just a few hours ago, Zuckerberg posted a lengthy Facebook note, detailing his progress on his version of Jarvis in the past year.
He mentions that the AI can now control his home, “including lights, temperature, appliances, music and security”; has “[learnt his] tastes and patterns”, and “can learn new words and concepts, and even entertain Max (his daughter)”.
The AI is said to use “several artificial intelligence techniques, including natural language processing, speech recognition, face recognition, and reinforcement learning, written in Python, PHP and Objective C”.
But along the way, Zuckerberg also hit a roadblock – not having all the skills needed to build it.
In an exclusive interview with Fast Company, he reveals that he “didn’t go through a formal Bootcamp”, the intensive programme for new engineering hires at Facebook that helps them learn the company’s code base and programming tools.
While he has worked on “small programming projects in his rare spare time [and] participated in several company hackathons over the years”, creating intelligent AI software is an entirely different ball game.
Given that he’s probably one of the busiest people on the planet, learning everything from scratch wasn’t an option either – he admitted, however, that he got help from those around him with the relevant expertise.
“But when I ask people questions, you can imagine that they respond pretty quickly.”
A TL;DR Of Zuckerberg’s Note
The note’s pretty lengthy, and some of its terms may elude many readers, so we decided to do a TL;DR of what he accomplished and realised in the past year while working on Jarvis.
Connecting To Existing Systems, Non-IoT Appliances
“Before I could build any AI, I first needed to write code to connect these systems, which all speak different languages and protocols.”
Due to this, Zuckerberg had to reverse engineer APIs for some of his systems (e.g. Crestron for lights, thermostat and doors, Facebook for his work) to ensure that a command from his computer could be successfully issued.
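This connective layer can be pictured as a set of adapters behind one common interface. Here’s a minimal Python sketch of that idea (the class names and simulated calls are hypothetical, not taken from Zuckerberg’s actual code):

```python
from abc import ABC, abstractmethod

class DeviceAdapter(ABC):
    """Common interface hiding each system's own protocol."""

    @abstractmethod
    def send(self, command: str) -> bool:
        ...

class LightsAdapter(DeviceAdapter):
    def send(self, command: str) -> bool:
        # A real adapter would speak the reverse-engineered Crestron
        # protocol; here we only simulate a successful call.
        print(f"[lights] {command}")
        return True

class MusicAdapter(DeviceAdapter):
    def send(self, command: str) -> bool:
        print(f"[music] {command}")
        return True

ADAPTERS = {"lights": LightsAdapter(), "music": MusicAdapter()}

def issue(system: str, command: str) -> bool:
    """Route one command to whichever adapter handles that system."""
    return ADAPTERS[system].send(command)
```

With every system wrapped this way, a single `issue("lights", "living room on")` call works regardless of what protocol sits underneath.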
“Most appliances aren’t even connected to the internet yet. It’s possible to control some of these using internet-connected power switches that let you turn the power on and off remotely. But often that isn’t enough.”
While IoT-connected and smart-home appliances are becoming more common, many are still very much offline. Take his toaster: he got around the problem by rigging up an old toaster from the 1950s with a connected power switch.
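The trick works because the old toaster stays mechanically latched down, so remotely cutting and restoring power is all the control you need. A toy sketch of that idea (the `SmartPlug` class is a hypothetical stand-in for a real connected switch):

```python
import time

class SmartPlug:
    """Hypothetical internet-connected power switch."""
    def __init__(self):
        self.is_on = False

    def turn_on(self):
        self.is_on = True

    def turn_off(self):
        self.is_on = False

def toast(plug: SmartPlug, seconds: float) -> None:
    # The toaster's lever is left pressed down permanently, so it
    # toasts whenever the plug supplies power and stops when it doesn't.
    plug.turn_on()
    time.sleep(seconds)
    plug.turn_off()
```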
He also stressed the need for more devices to eventually be connected through common APIs.
“For assistants like Jarvis to be able to control everything in homes for more people, we need more devices to be connected and the industry needs to develop common APIs and standards for the devices to talk to each other.”
Helping It Interpret “Natural Language” aka Speech Patterns And Context
“This was a two step process: first I made it so I could communicate using text messages, and later I added the ability to speak and have it translate my speech into text for it to read.”
Starting with keywords like “bedroom” and “lights”, he soon realised the need for the AI to learn synonyms and understand context.
“Understanding context is important for any AI. For example, when I tell it to turn the AC up in “my office”, that means something completely different from when Priscilla tells it the exact same thing. That one caused some issues!”
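A toy illustration of that context problem: the very same phrase has to resolve to different rooms depending on who is speaking (the names and room labels below are illustrative, not from the real system):

```python
# Which room "my office" means depends on the speaker.
ROOM_FOR_SPEAKER = {
    ("mark", "my office"): "mark's office",
    ("priscilla", "my office"): "priscilla's office",
}

def resolve_room(speaker: str, phrase: str) -> str:
    """Map a possessive phrase to a concrete room, per speaker."""
    return ROOM_FOR_SPEAKER.get((speaker, phrase), phrase)

def handle_ac_request(speaker: str) -> str:
    """'Turn the AC up in my office' routed to the right room."""
    room = resolve_room(speaker, "my office")
    return f"turning the AC up in {room}"
```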
Making It An Intelligent Personal DJ
“Music is a more interesting and complex domain for natural language because there are too many artists, songs and albums for a keyword system to handle. […] Consider these requests related to Adele: “play someone like you”, “play someone like adele”, and “play some adele”. Those sound similar, but each is a completely different category of request.”
“Through a system of positive and negative feedback, an AI can learn these differences.”
Zuckerberg reveals that with this system, Jarvis is now mostly able to nail down what he wants to hear at any specific point in time.
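To see why plain keyword matching breaks down here, consider a toy rule-based classifier for the three Adele requests he quotes; a real system would learn these distinctions from positive and negative feedback rather than from hand-written rules like these:

```python
KNOWN_SONGS = {"someone like you"}
KNOWN_ARTISTS = {"adele"}

def classify_request(text: str) -> tuple:
    """Toy disambiguation of superficially similar music requests."""
    query = text.lower().removeprefix("play ").strip()
    if query in KNOWN_SONGS:
        return ("song", query)                   # "play someone like you"
    if query.startswith("someone like ") and query[13:] in KNOWN_ARTISTS:
        return ("similar-artists", query[13:])   # "play someone like adele"
    if query.startswith("some ") and query[5:] in KNOWN_ARTISTS:
        return ("artist", query[5:])             # "play some adele"
    return ("unknown", query)
```

Each of the three requests lands in a different category, even though the wording differs by only a word or two.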
Improving Vision And Face Recognition
Recognising that “many important AI problems [are] related to understanding what is happening in images and videos”, he noted that Facebook has actually “gotten very good at face recognition for identifying when your friends are in your photos”, and decided to adopt that expertise for the AI.
“I built a simple server that continuously watches the cameras and runs a two step process: first, it runs face detection to see if any person has come into view, and second, if it finds a face, then it runs face recognition to identify who the person is. Once it identifies the person, it checks a list to confirm I’m expecting that person, and if I am then it will let them in and tell me they’re here.”
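The two-step pipeline he describes can be sketched as follows; the detection and recognition functions here are stand-ins for real computer-vision models, and the guest list is made up:

```python
EXPECTED_GUESTS = {"priscilla", "mark"}

def detect_face(frame: dict) -> bool:
    """Step 1: a cheap check for whether any face is in view."""
    return frame.get("face") is not None

def recognise_face(frame: dict) -> str:
    """Step 2: the expensive identification, run only after detection."""
    return frame["face"]

def process_frame(frame: dict) -> str:
    """Run detection first, recognition second, then check the list."""
    if not detect_face(frame):
        return "no visitor"
    person = recognise_face(frame)
    if person in EXPECTED_GUESTS:
        return f"letting {person} in"
    return f"unexpected visitor: {person}"
```

Splitting detection from recognition means the costly identification step only runs on the rare frames that actually contain a face.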
The system is also used to identify when his daughter is awake, “so it can start playing music or a Mandarin lesson”, or even solve the “context problem of knowing which room in the house we’re in so the AI can correctly respond to context-free requests like “turn the lights on” without providing a location”.
Making It Accessible Anywhere Via A Messenger Bot
“I programmed Jarvis on my computer, but in order to be useful I wanted to be able to communicate with it from anywhere I happened to be. That meant the communication had to happen through my phone, not a device placed in my home.”
He admits it was “much easier” to build a Messenger bot to communicate with Jarvis than a dedicated app; with it, he can text commands, and even send audio clips that get translated into commands, when he’s not at home.
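A Messenger bot receives messages as webhook payloads. The sketch below uses a simplified payload shape (not the exact Messenger Platform schema) to show how incoming text could be pulled out and normalised into commands:

```python
def extract_commands(payload: dict) -> list:
    """Pull message text out of a simplified Messenger-style webhook
    payload so it can be routed to the home-automation layer."""
    commands = []
    for entry in payload.get("entry", []):
        for event in entry.get("messaging", []):
            text = event.get("message", {}).get("text")
            if text:
                commands.append(text.lower().strip())
    return commands
```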
The communication isn’t just one-way, either: Jarvis can also send him an image of whoever’s at the door, or text him reminders of his to-dos.
Zuckerberg also noted that he texted Jarvis more than he expected, mostly because “it feels less disturbing to people around [him]”. Relating this preference for text over voice to the patterns he has observed on Messenger and WhatsApp, he states that “this suggests that future AI products cannot be solely focused on voice and will need a private messaging interface as well”.
Voice And Speech Recognition – Making It Understand Conversational Speech
Despite his preference for communicating via text, he finds that speech still wins in terms of convenience and speed.
“You don’t need to take out your phone, open an app, and start typing — you just speak.”
He built a dedicated Jarvis app that listens continuously to what he says, and placed a number of phones running the app around his home so that it’s accessible everywhere – but he draws a distinction between his Jarvis and Amazon’s Echo.
“That seems similar to Amazon’s vision with Echo, but in my experience, it’s surprising how frequently I want to communicate with Jarvis when I’m not home, so having the phone be the primary interface rather than a home device seems critical.”
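An always-listening app typically gates on a wake word before treating speech as a command. Here’s a toy sketch of that idea, operating on already-transcribed text (the wake-word approach is a common pattern, not a detail from Zuckerberg’s note):

```python
WAKE_WORD = "jarvis"

def commands_from_transcripts(transcripts: list) -> list:
    """Ignore chatter until the wake word appears, then treat the
    rest of the utterance as a command."""
    commands = []
    for utterance in transcripts:
        words = utterance.lower().split()
        if WAKE_WORD in words:
            command = " ".join(words[words.index(WAKE_WORD) + 1:])
            if command:
                commands.append(command)
    return commands
```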
Just as Tony Stark banters with Jarvis, Zuckerberg also built an element of humour into the AI, having noticed that “once you can speak to a system, you attribute more emotional depth to it than a computer you might interact with using text or a graphic interface”.
According to Zuckerberg, it can now interact with his whole family, play games with them, and even dish out “classic lines like ‘I’m sorry, Priscilla. I’m afraid I can’t do that.’”
However, he also found that while “speech recognition systems have improved recently, no AI system is good enough to understand conversational speech just yet”, adding that “there’s a lot more to explore with voice”.
Making Jarvis More Intelligent, And The Future Of AI
He ends off the note with the assurance that he will continue to improve Jarvis, since he uses it constantly and is always finding new things to add. In the longer term, he aims to teach Jarvis “how to learn new skills itself rather than [him] having to teach it how to perform specific tasks”.
As of now, Jarvis is still a pet project, and its connection to his home is why he has rejected the idea of open-sourcing the code.
However, he also mentions that “over time it would be interesting to find ways to make this available to the world”.
“In a way, AI is both closer and farther off than we imagine. AI is closer to being able to do more powerful things than most people expect — driving cars, curing diseases, discovering planets, understanding media.”
“Those will each have a great impact on the world, but we’re still figuring out what real intelligence is.”
The Darker Side: When AI Gets Too Intelligent
It’s clear that Zuckerberg has placed much faith in AI, and is trying to advance its progress on a personal (and eventually, public) level, but not all technopreneurs like himself are that optimistic.
For example, Tesla CEO Elon Musk, the Justin Bieber of tech idols, is famously hesitant about AI.
Said Musk in an interview with Fortune, “I think that the biggest risk is not that the AI will develop a will of its own, but rather that it will follow the will of people that establish its utility function.”
“If it is not well thought out—even if its intent is benign—it could have quite a bad outcome. If you were a hedge fund or private equity fund and you said, ‘Well, all I want my AI to do is maximise the value of my portfolio,’ then the AI could decide, well, the best way to do that is to short consumer stocks, go long defense stocks, and start a war.”
If that sounds familiar, that’s because this potential was demonstrated in the movie Avengers: Age of Ultron (2015).
For those who haven’t watched it, the movie’s main ‘villain’ is an AI, Ultron, which takes its master’s (Tony Stark’s) intention of saving Earth to the most extreme conclusion possible – eradicating humanity.
Not only does the “unexpectedly sentient” Ultron have a will of its own, it defies all orders while pursuing a mission whose original intent was, as Musk would put it, “benign”.
While Zuckerberg’s Jarvis – and AI as we know it today – is currently far from having a will of its own, denying the potential of the situation Musk has painted would simply be delusional.
There’s much to debate over how ‘intelligent’ AI can, and should, be – and it’s a debate that won’t be ending anytime soon.
Feature Image Credit: Forbes