What Machine Learning Cannot (Yet) Do
In this era of frenzied speculation about AI, it is tremendously important to understand the limitations of machine learning.
First and foremost, it is important to realize that ML is not omnipotent. Research over the past 50 years has taught us that ML, a mode of knowledge acquisition without explicit programming, has some definite boundaries. Just as computation as a process has inherent limitations (for example, it is not possible now, nor will it ever be, to decide whether an arbitrary program will halt), ML has intrinsic limitations that cannot be overcome by throwing more GPUs or faster computers at the problem. I know this news might come as a disappointment to many fans of ML, but it is important to know that ML is not the solution to all our problems. There are other ways to acquire knowledge, and people use them all the time, in addition to "learning".
Let us take as a primary example the problem of learning a language from hearing verbal utterances. This problem has attracted deep interest for 50 years or more, not only in AI but also in philosophy, linguistics, psychology, biology, and neuroscience, to name a few fields. Well, guess what? We still do not understand how humans, by which I mean children as young as 2 years old, acquire their first language. There has been a tremendous amount of work documenting the process, and of course there are many theories. But you cannot go to Best Buy today and buy a learning machine (say, an Alexa-like device) that will sit in your house, simply hear whatever language is being spoken around it, and within a year or two start conversing with you. Isn't this sad? For all the millions of servers that Google, Amazon, Microsoft, and the other big tech companies have at their disposal, and the petabytes of storage capacity in their data centers, we cannot solve this problem!
No, chatbots do not learn language, and if you have ever used a chatbot, you will see within a minute or two why they cannot be the answer. Now, you have probably heard about the impressive power of deep learning solutions, such as long short-term memory (LSTM) and gated recurrent unit (GRU) architectures, at tasks like language translation. Once again, these systems are far from being able to learn language, and even their performance at translation still falls well short of human translators.
For an absolutely devastating takedown of Google Translate, I highly recommend the insightful article "The Shallowness of Google Translate" in The Atlantic by Douglas Hofstadter, one of the deepest thinkers in AI and cognitive science, whose breakthrough book Gödel, Escher, Bach: An Eternal Golden Braid got me into AI in the first place.
Now, this is not to say that Google Translate is not extremely valuable and useful. Indeed, it is used every day by millions of people worldwide, just as Alexa and its technological variants are used every day by many people. But systems like GT are no match for humans, and one only has to see the many examples in Hofstadter's article to appreciate how far AI has to go in truly understanding language. LSTM and GRU architectures do not "understand" language; they build simple statistical models that retain some information about past words, mostly at the sentence level, and they can easily be defeated, as Hofstadter's revealing article shows so vividly. In one of the examples, GT is asked to translate from English into French a paragraph with phrases like "he has his car, she has her car," describing how two people, a man and a woman, have divided the possessions in their house. GT entirely misses the point of the paragraph in its translation and fails to track the gender distinctions that carry its meaning.
So, what are the limitations of machine learning, using this example of learning language? There are principally two, and they are inherent to the way ML is formulated today; they cannot be overcome by throwing more data or compute power at the problem. Just as the halting problem will remain undecidable no matter how powerful computers become, these two limitations are intrinsic to ML. This is why it is important to know such things, so that one realizes what one can and cannot do with machine learning. As the famous Chinese philosopher Confucius said long ago: "a man who knows what he knows and knows what he does not know is one who truly knows." So an ML researcher who understands the inherent limitations of ML is one who, in my book at least, "truly knows" ML.
Limitation number one was proven in a famous theorem by Gold roughly 50 years ago. Many studies have shown that children primarily receive positive examples of natural language. By and large, parents do not correct children's mispronunciations or grammatically incorrect phrases, but instead interpret what the child is trying to say. So, unlike the somewhat idealized case of supervised image labeling, where one gets images of faces and non-faces, children only get positive examples. Children also have no idea in advance which language they are supposed to learn (if you are born in the US, you are not equipped with some magic "English learning" genes). My brother-in-law raised his three daughters in Japan (he is Indian, his wife is a Caucasian American), and all three daughters learned to speak Japanese fluently by the time they were 3 or 4 years old. Even today, in their 20s, they prefer to speak with each other in Japanese. Such is the power of first language acquisition by a child: it literally shapes the mind for the child's entire future.

What Gold proved is this: from positive examples alone, no learner can be guaranteed to identify the context-free grammar that generates the strings it is seeing. That is, assume you are given strings generated by some unknown context-free language. No matter how many strings you see, and no matter how much compute power you have available, there will never come a time when you can say you have exactly identified the grammar that generated them. This was a truly stunning result. Since Japanese, English, German, and French are all more expressive than context-free languages, it must mean that the space of humanly learnable languages is not the full class of context-free or context-sensitive languages, but some other, more restricted class that is identifiable from positive examples alone. What is this class? Linguists have been looking for 50+ years and have not found it yet, although there has been a lot of progress.
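To make the flavor of Gold's argument concrete, here is a minimal sketch in Python (my own toy illustration, not Gold's proof): two hypothetical languages over the single letter "a", one finite and one infinite. Every positive example drawn from the finite language is also a string of the infinite one, so no finite amount of positive-only data can ever force a learner to rule the larger hypothesis out.

```python
# A minimal sketch (not Gold's proof) of why positive-only examples cannot
# pin down a grammar. Two hypothetical languages over the alphabet {'a'}:
#   L_finite = {a^n : 1 <= n <= 5}   (a small finite language)
#   L_star   = {a^n : n >= 1}        (all non-empty strings of a's)
# Every positive example from L_finite is also a member of L_star, so no
# finite sample of positive strings can ever rule L_star out.

def in_L_finite(s: str) -> bool:
    return set(s) <= {"a"} and 1 <= len(s) <= 5

def in_L_star(s: str) -> bool:
    return set(s) <= {"a"} and len(s) >= 1

def consistent(membership, positive_examples) -> bool:
    """A hypothesis is consistent if it accepts every positive example seen."""
    return all(membership(s) for s in positive_examples)

# Positive-only data actually generated by the *smaller* language L_finite:
sample = ["a", "aa", "aaa", "aaaa", "aaaaa"]

print(consistent(in_L_finite, sample))  # True
print(consistent(in_L_star, sample))    # True -- both hypotheses survive,
# and they always will, no matter how many positive examples are added.
```

Gold's theorem turns this simple observation into a precise impossibility result: any class of languages containing all finite languages plus at least one infinite language, which includes the regular and context-free languages, cannot be identified in the limit from positive data alone.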
Einstein was a big believer in the power of imagination, and specifically in the idea of the Gedankenexperiment, where you construct a hypothetical scenario, a "thought experiment." So, to understand the limitations of machine learning models like deep learning, let us do a Gedankenexperiment of our own. When I was growing up and learning my first language, like many other children I lived in a house with pets (a dog and a cat). Dogs and cats have highly sophisticated brains, with hundreds of millions to billions of neurons. The visual system of the cat has often been used to test theories of human vision, since its visual cortex is in many ways very similar to ours.
So, if we think of the brains of cats and dogs, we can say they are "deep learning" architectures that are hugely sophisticated in their complexity. Yet, and here comes the power of the Gedankenexperiment, my dog and my cat heard pretty much every utterance of Tamil, the South Indian language spoken in my house while I was growing up, and even though within 2 years or so I became fluent in Tamil, neither the dog nor the cat ever showed any ability to speak it. Remember, it is not that they lacked the data; clearly they did not. It is not that they lacked a massively parallel deep learning architecture, one that in other areas like perception is as good as or better than ours. They were simply lacking the ability to turn verbal utterances into language. So, for all the enthusiasm and energy that LSTM and GRU deep learning researchers bring to the table, and I for one applaud their efforts at building useful artifacts like GT, it is worth noting that the solution to language learning is not going to simply emerge out of some deep learning network. It is a harder problem than that, and it is going to require a much more thorough understanding of language of the kind that linguists have been developing for 50 years.

Now for the second limitation, which has to do with an inherent limitation of the two foundations of ML today: probability and statistics. Both of these mathematical areas are incredibly powerful and useful, not just in ML but in many other areas of science and engineering. It is hard to argue with the statement that Fisher's work on randomized experiments and maximum likelihood estimation was one of the pinnacles of 20th-century research, one that made many other things possible (for example, the reliable engineering of technological artifacts and rigorous drug testing).
As Neyman, Pearson, Rubin, and most recently Pearl have shown, however, statistical reasoning is inherently limited. Probability theory alone cannot reveal the causal structure of the world. It cannot be used to learn that lightning causes thunder rather than the other way around, or that diseases cause symptoms rather than the reverse. Such an elementary bit of reasoning cannot be achieved by probability or statistics, or by their derivative field, statistical ML. Once again, this is an inherent limitation, one that cannot be overcome by more data, more machines, or more money being thrown at the problem.
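To see why a joint distribution by itself cannot reveal the direction of causation, here is a small numerical sketch (the probabilities are made up purely for illustration): a model in which lightning causes thunder and a model in which thunder "causes" lightning, with the reverse model's parameters obtained via Bayes' rule, assign exactly the same probability to every observable event.

```python
# A minimal illustration (hypothetical numbers) that a joint distribution over
# two binary variables cannot reveal which one causes the other.
# Model A: lightning -> thunder.  Model B: thunder -> lightning, with
# parameters chosen via Bayes' rule to reproduce the identical joint.

from itertools import product

# Model A: P(L), P(T | L)
pL = {1: 0.10, 0: 0.90}
pT_given_L = {1: {1: 0.90, 0: 0.10}, 0: {1: 0.01, 0: 0.99}}  # pT_given_L[l][t]

joint_A = {(l, t): pL[l] * pT_given_L[l][t] for l, t in product([0, 1], repeat=2)}

# Model B: P(T), P(L | T), derived from the same joint via Bayes' rule
pT = {t: sum(joint_A[(l, t)] for l in (0, 1)) for t in (0, 1)}
pL_given_T = {t: {l: joint_A[(l, t)] / pT[t] for l in (0, 1)} for t in (0, 1)}

joint_B = {(l, t): pT[t] * pL_given_T[t][l] for l, t in product([0, 1], repeat=2)}

# The two causally opposite models assign identical probability to every event,
# so observational data alone cannot tell them apart.
print(all(abs(joint_A[k] - joint_B[k]) < 1e-12 for k in joint_A))  # True
```

Since both factorizations fit any amount of observational data equally well, something beyond the distribution itself, such as an experiment, an intervention, or background knowledge, is needed to pick out the true causal direction.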
So, at the end of the day, one has to come to the realization that data science, despite all its promise and potential power, is not the end of the story. It will not be the miracle solution to the problem of AI. To solve the problem of language learning and the problem of causal discovery from observations, one has to develop additional tools. Pearl and Rubin, for example, have developed just such extensions of probability theory: the potential outcomes framework and the do-calculus. Pearl's latest book, The Book of Why, is highly recommended. It lays out a three-level cognitive architecture, with statistical modeling from observation at the lowest level, causal reasoning about interventions in the middle, and imaginative reasoning with counterfactuals at the top. This is one of the most interesting recent ideas on how to extend data science to what I call "imagination science," a field that does not yet exist, but one that I believe will become more popular over the coming decades as the limitations of data science become more obvious.

That is not to say that data science is not useful. It is in fact tremendously useful, and one can use it to model many phenomena, from social networks (we all know where that story leads) to medical diseases and social problems like gun violence in schools. However, and this is crucial to understand, data science does not tell you how to solve these problems! Yes, gun violence in schools is an abhorrent stain on the otherwise marvelous educational environment in the US, and one can use data science and deep learning to construct elaborate models that summarize the incidents of gun violence. But that is not the real issue, is it? The real issue is intervention: how do we reduce or eliminate gun violence? As Pearl argues, understanding interventions is not statistics. Probability distributions, by their very nature, do not contain within themselves a recipe that tells you how they change when you intervene in the world. We all know the interventions that are being proposed to reduce gun violence: ban the sale of assault rifles, require better background checks on prospective gun buyers, equip teachers with guns (the US President appears to favor this intervention), and even repeal the 2nd Amendment, as one former US Supreme Court Justice has advocated. All of these are "interventions": each would change the distribution of gun violence in some way. Which one is the most effective? That is the real question, and sadly, data science alone will not answer it, since answering it requires causal models (level two of Pearl's architecture).
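Here is a small simulation, again with made-up numbers, of the gap between "seeing" and "doing" that Pearl formalizes with the do-operator: a confounder Z drives both a candidate intervention X and the outcome Y, so the observational conditional P(Y=1 | X=1) looks impressive even though forcing X, written do(X=1), changes nothing about Y.

```python
# A small simulation (hypothetical numbers) of the observational/interventional
# gap. A confounder Z influences both the "intervention" X and the outcome Y,
# so P(Y=1 | X=1) estimated from observational data differs from P(Y=1 | do(X=1)).

import random
random.seed(0)

def draw(p):
    return 1 if random.random() < p else 0

def observational_world(n=200_000):
    data = []
    for _ in range(n):
        z = draw(0.5)                # confounder
        x = draw(0.8 if z else 0.2)  # X depends on Z
        y = draw(0.7 if z else 0.3)  # Y depends on Z only (X has no effect)
        data.append((z, x, y))
    return data

def interventional_world(x_forced, n=200_000):
    data = []
    for _ in range(n):
        z = draw(0.5)
        x = x_forced                 # do(X = x_forced): cut the Z -> X arrow
        y = draw(0.7 if z else 0.3)
        data.append((z, x, y))
    return data

obs = observational_world()
p_y_given_x1 = sum(y for _, x, y in obs if x == 1) / sum(1 for _, x, _ in obs if x == 1)

do_x1 = interventional_world(1)
p_y_do_x1 = sum(y for _, _, y in do_x1) / len(do_x1)

print(round(p_y_given_x1, 3))  # ~0.62: X "predicts" Y through the confounder
print(round(p_y_do_x1, 3))     # ~0.50: forcing X does nothing to Y
```

The observational number is not wrong as a prediction; it simply answers a different question from the interventional one, which is exactly the distinction between Pearl's first and second levels.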
Understanding interventions is at the heart of not just reducing gun violence, but of many other problems facing society today. Take climate change. We can collect massive amounts of data on global warming and use deep learning to construct sophisticated models of CO2 emissions and the like. But, again, the hard question is: what intervention is needed? Should we phase out gasoline-powered cars and trucks entirely, and if so, at what rate? How much time would that buy us? There are scary-looking predictions of what the map of the US will look like in 10,000 years (an imagination problem, of course!); one such study was recently covered in the New York Times.
So, the consequences of global warming are indeed quite alarming and ultimately threaten our very survival as a species. The question is what to do about it. What interventions make the most sense, and how should they be implemented? Note that this is not data science! When you intervene (say, a city like Beijing or London imposes new traffic regulations, allowing only even-numbered license plates inside the city one day and odd-numbered plates the next), you change the underlying data distribution from what it was, and so all your previous data is useless!

So, causal models are absolutely needed to understand a vast array of social challenges that are going to become ever more pressing in the 21st century. If AI is going to contribute to the betterment of society, its effectiveness will depend on the extent to which researchers in the field understand the inherent limitations of the currently dominant paradigm, statistical ML, and why we as a field, and we as a society, need to move on to more powerful paradigms. Our very existence as a species may depend on developing the next AI paradigm, one more powerful than data science.