The insanity of talking computers and Microsoft Chatbots
Take a time travel trip with me to 1992 – date and time uncertain. Location: Nyanza Province, Homa Bay District. Two subjects, my mother and I, are walking from a nearby primary school to a faraway primary school. On their way they encounter a middle-aged, tall, thin, badly groomed man. He is talking to himself. The ten-year-old me asks mother, “Why is that man talking to himself?” Mother replies, “Because he is insane”. Let us call this Insanity 1.0.
Fast forward and our stop in time is 2002 – again date and time uncertain. Location: Nairobi Central Business District. Two subjects, my grandmother and I, are walking from Kenya Bus Station to Odeon Cinema. They have just arrived from Nyanza and it is the first time grandmother is visiting Nairobi. There are many people in the streets, and from a distance the subjects can see a middle-aged man of one-pack build. He is in a suit, a decent suit, and he is holding a device against his ear, talking. My grandmother comments, “Maajabu (wonders)! The insane of Nairobi wear suits? Decent suits?” Let us call this Insanity 2.0.
Insanity 2.0 is understandable since those who seem to be talking and shouting to themselves, stopping now and then to look back or point fingers at buildings on their left, are in reality talking to another human, albeit one miles out of sight. I particularly like those who, when talking on the phone, use vigorous body language to show the guy on the other end what they actually mean. I forgive them. But have you ever been that insane person who responds to a conversation a friend is having on the phone, thinking they are talking to you?
Insanity 3.0 is brewing. It is the insanity that will usher in Jarvis (you know him not? Then watch the Iron Man franchise). Siri, Google Now and Cortana are the brewers. Cortana has decided to take the lead, at least according to revelations made by Microsoft at the 2016 Microsoft Build conference for developers. But before we delve into this, let me ask you a question: do you talk to your phone regularly? If yes, how does that feel, especially in public places like restaurants, pubs and matatus? If no, why not?
I am yet to encounter a person who opens Google Now (the most common digital assistant in Kenya) to ask, “What is the weather in Kapenguria looking like?” or “How far is the nearest MPESA shop?” Or to say, “Send a text message to Boss and tell him I am on my way but stuck in traffic”. Or to ask Google Now to show them walking directions from their current location to Uchumi Plaza. Nairobians still prefer to stop strangers in the streets and ask for directions. There are two possible explanations: either Kenyans still don’t know that Google Now can resolve most of their weather and location based nightmares, or they fear the awkwardness of Insanity 3.0. I do use Google Now regularly, but I do not use the voice input. For one, voice input will most of the time require me to repeat myself, especially if I speak to her in a noisy public place. Second, I do not want to be associated with any type of insanity other than Insanity 2.0.
As much as many Kenyans – and, I believe, most humans across the globe – are scared of Insanity 3.0, Microsoft thinks that talking computers and computer applications are the way to go, and she wants to be the leader in it. Microsoft has lagged behind in smartphones and apps because she wasn’t the creator of those ecosystems, neither did she join them at an early stage. If she is not careful, smartphones and apps are likely to drive her out of business. To remain relevant, Microsoft has identified a new area of computing that she thinks is the next ecosystem – the new method humans will use to interact with computers. She calls it conversational intelligence, and with it she is creating the Microsoft Chatbots, a new ecosystem she believes will give her an edge over the competition. And I agree. One immediate area where Microsoft is applying the Microsoft Chatbots platform is in empowering the blind.
We with sight take nature and the events in it for granted. We walk down the streets seeing all manner of colours in buildings and dresses. We see faces both happy and sad pass us by, or look back at us with a frown or a smile, but hardly do we pay any serious attention to the implications of what we see. We are not to blame. But a blind person walks around without any idea that the face that just passed by is their ideal face of the opposite sex; that the guy who just rubbed their shoulder is wearing the most fashionable shoes in town; or that the person who helped them cross the street has a million and one problems written all over his face.
Thanks to one of the Microsoft Chatbots, a blind person can today appreciate a little of what is happening around him (watch the YouTube video at the end of this article to see how). He can be told that in sight are three faces holding a conversation, and that ahead are five people approaching. He can be told that directly to his right is a green trash pit, and that across the street there is a building painted yellow. When he steps into a restaurant and is given a menu meant for those with sight, he is no longer required to identify himself as blind, as the glasses he is wearing can read out for him the menu items and their corresponding prices.
And it is going to get better. Microsoft, Google, Apple and the others that work in the fields of artificial intelligence and smart gadgets are already creating conversational tools that will help us forget about language barriers whenever we travel to foreign countries. Through the use of earbuds embedded with microphones (the type you see in movies), you will be in a position to visit Germany, meet a random guy in the street and talk to him in English, and the guy will hear you in German. He in turn will respond in German, but you will hear him in English, in real time. Although it will take many years before hours-long two-way conversations can be maintained using these real time language translations, basic direction or general information enquiries can already be exchanged through voice based translation in dedicated apps on our smartphones.
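The translation pipeline described above can be pictured as three stages chained together: speech recognition, machine translation, and speech synthesis. The toy sketch below is purely illustrative – the tiny phrase table and every function in it are hypothetical placeholders, not the API of any real earbud or translation product – but it shows how English "heard" at one end comes out as German at the other.

```python
# Toy sketch of a real-time translation pipeline (English in, German out).
# All names here are hypothetical; real systems use trained speech
# recognition and neural translation models, not a phrase table.

# Hypothetical tiny phrase table standing in for a translation model.
EN_TO_DE = {
    "good morning": "guten Morgen",
    "where is the train station": "wo ist der Bahnhof",
}

def speech_to_text(audio):
    # Placeholder: a real earbud would run speech recognition here.
    return audio.lower().strip()

def translate(text, table):
    # Placeholder: a real system would use a machine-translation model.
    # Unknown phrases fall back to the original text.
    return table.get(text, text)

def text_to_speech(text):
    # Placeholder: a real system would synthesize audio in the
    # listener's language; here we simply return the text to be "spoken".
    return text

def relay(audio, table):
    """The full pipeline: hear English, speak German."""
    return text_to_speech(translate(speech_to_text(audio), table))

print(relay("Where is the train station", EN_TO_DE))
# → wo ist der Bahnhof
```

The point of the chain is that each stage is independent: improving the recognizer or the translator improves the whole conversation without changing the rest of the pipeline.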
The new conversational intelligence technology is taking a shift where conversational intelligence is the new platform or ecosystem, day to day ordinary language becomes the new User Interface (UI), digital assistants like Cortana become the new browsers, and bots like the Microsoft Chatbots that ride on the digital assistants become the new apps. In an application like Skype, for instance, Microsoft Chatbots are ready to help with flight, taxi and hotel bookings, and developers are being encouraged to create as many bots as possible for the Microsoft platforms, especially Skype and Outlook. If Microsoft gets the Microsoft Chatbots right, despite the setbacks she has encountered with the racist Tay, then a new world of communication shall have been created where Microsoft becomes the leader.
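The idea of ordinary language becoming the user interface can be sketched in a few lines. The toy bot below is NOT the Microsoft Bot Framework API – the intents and handlers are invented for illustration – but it shows the core pattern behind a booking bot: map a plain-language message to an intent, then let a handler act on it.

```python
# Toy sketch of a booking bot: plain language in, an action out.
# The intents and replies are hypothetical; production bots use trained
# intent classifiers, not simple keyword matching.

def book_taxi(message):
    return "Okay, requesting a taxi for you."

def book_flight(message):
    return "Sure, searching for flights."

def book_hotel(message):
    return "Looking up hotels now."

# Keyword-to-handler routing table: each keyword names an intent.
INTENTS = {
    "taxi": book_taxi,
    "flight": book_flight,
    "hotel": book_hotel,
}

def reply(message):
    # Route the message to the first matching intent handler.
    for keyword, handler in INTENTS.items():
        if keyword in message.lower():
            return handler(message)
    return "Sorry, I did not understand that."

print(reply("Can you get me a taxi to the airport?"))
# → Okay, requesting a taxi for you.
```

Notice that the "app" here has no buttons or screens at all – the sentence itself is the interface, which is exactly the shift described above.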
Once these conversational intelligence platforms are perfected, it will no longer appear awkward to walk around the streets and meet people talking, not to themselves or to other humans at the other end of the line, but to computers. Humans will have conversations with digital assistants, telling them to write and send emails on their behalf, to remember and prioritize the appointments they have that day, and to remember to send the self-driving car to pick the kids from school. Once Insanity 3.0 matures, everyone everywhere will be talking – but to talking computers.
However, before we can benefit from Insanity 3.0, we need to embrace it. We need to start talking to Google Now, asking her to do for us the things she can already do. Next time you are lost in town, please remember to pull out Google Now and ask her to show you the directions. Make it a habit to ask her about the weather and the latest news mentioning the likes of Uhuru Kenyatta and Raila Odinga. Next time you are not sure how your English Premier League team performed the night before, don’t type the query into Google Search – just speak your question. The more we do this, the more we enable Google and Microsoft and the rest to perfect what Microsoft has come to call “Conversational Intelligence”. We want our smartphones to be more intelligent, don’t we? Then talk to them – please!