What Machine Learning Cannot (Yet) Do
In this era of frenzied speculation about AI, it is tremendously important to understand the limitations of machine learning.
First and foremost, it is important to realize that ML is not omnipotent. Research over the past 50 years has taught us that ML, a mode of knowledge acquisition without explicit programming, has some definite boundaries. Just as computation as a process has inherent limitations (for example, it is not possible now, nor will it ever be, to decide whether an arbitrary program will halt), ML has intrinsic limitations that cannot be overcome by throwing more GPUs or faster computers at the problem. I know this news might come as a disappointment to many fans of ML, but it is important to know that ML is not the solution to all our problems. There are other ways to acquire knowledge, and people use them all the time, in addition to "learning".
Let us take as a primary example the problem of learning a language from hearing verbal utterances. This problem has attracted deep interest for 50 years or more, not only in AI but also in philosophy, linguistics, psychology, biology, and neuroscience, to name a few fields. Well, guess what? We still do not understand how humans, by which I mean children as young as 2 years old, acquire their first language. There has been a tremendous amount of work documenting the process, and of course there are many theories. But you cannot go to Best Buy today and buy a learning machine (say, an Alexa-like device) that will sit in your house, simply hear whatever language is being spoken around it, and within a year or two start conversing with you. Isn't this sad? For all the millions of servers that Google, Amazon, Microsoft, and the other big tech companies have at their disposal, and the petabytes of storage capacity in their data centers, we cannot solve this problem!
No, chatbots do not learn language, and if you have ever used a chatbot, you will see within a minute or two why they cannot be the answer. Now, you have probably heard about the impressive power of deep learning solutions, such as long short-term memory (LSTM) and gated recurrent unit (GRU) architectures, at tasks like language translation. Once again, these systems are far from being able to learn language, and even their performance at translation still falls well short of human translators.
For an absolutely devastating takedown of Google Translate, I highly recommend the insightful article "The Shallowness of Google Translate" in The Atlantic by Douglas Hofstadter, one of the deepest thinkers in AI and cognitive science, whose breakthrough book Gödel, Escher, Bach: An Eternal Golden Braid got me into AI in the first place.
Now, this is not to say that Google Translate is not extremely valuable and useful. Indeed, it is used every day by millions of people worldwide, just as Alexa and its technological variants are used every day by many people. But systems like GT are no match for humans, and one only has to see the many examples in Hofstadter's article to appreciate how far AI has to go in truly understanding language. LSTM and GRU architectures do not "understand" language; they build simple statistical models that retain some information about past words, mostly at the sentence level, and they can easily be defeated, as Hofstadter's revealing article shows so vividly. In one of the examples, GT is asked to translate from English into French a paragraph with phrases like "he has his car, she has her car," describing how two people, a man and a woman, have divided the possessions in their house. GT entirely misses the point of the paragraph in its translation and fails to track the gender distinctions that carry its meaning.
So, what are the limitations of machine learning, using this example of learning language? There are principally two, and they are inherent to the way ML is formulated today; they cannot be overcome by throwing more data or compute power at the problem. Just as the halting problem will remain undecidable no matter how powerful computers become, these two limitations are intrinsic to ML. This is why it is important to know such things, so that one realizes what one can and cannot do with machine learning. As the famous Chinese philosopher Confucius said long ago: "a man who knows what he knows and knows what he does not know is one who truly knows." So an ML researcher who understands the inherent limitations of ML is one who, in my book at least, "truly knows" ML.
Limitation number one was proven in a famous theorem by Gold roughly 50 years ago. Many studies have shown that children primarily receive positive examples of natural language. By and large, parents do not correct children's mispronunciations or grammatically incorrect phrases, but instead interpret what the child is trying to say. So, unlike the somewhat idealized case of supervised image labeling, where one gets images of faces and non-faces, children only get positive examples. Children also have no idea in advance which language they are supposed to learn (if you are born in the US, you are not equipped with some magic "English learning" genes). My brother-in-law raised his three daughters in Japan (he is Indian, his wife is a Caucasian American), and all three daughters learned to speak Japanese fluently by the time they were 3 or 4 years old. Even today, in their 20s, they prefer to speak with each other in Japanese. Such is the power of first language acquisition by a child: it literally shapes the mind for the child's entire future.

What Gold proved is this: from positive examples alone, no learner can be guaranteed to identify the context-free grammar that generates the strings it is seeing. That is, assume you are given strings generated by some unknown context-free language. No matter how many strings you see, and no matter how much compute power you have available, there will never come a time when you can say you have exactly identified the grammar that generated them. This was a truly stunning result. Since Japanese, English, German, and French are all more expressive than context-free languages, it must mean that the space of humanly learnable languages is not the full class of context-free or context-sensitive languages, but some other, more restricted class that is identifiable from positive examples alone. What is this class? Linguists have been looking for 50+ years and have not found it yet, although there has been a lot of progress.
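To make the flavor of Gold's argument concrete, here is a minimal sketch in Python (my own toy illustration, not Gold's proof): two hypothetical languages over the single letter "a", one finite and one infinite. Every positive example drawn from the finite language is also a string of the infinite one, so no finite amount of positive-only data can ever force a learner to rule the larger hypothesis out.

```python
# A minimal sketch (not Gold's proof) of why positive-only examples cannot
# pin down a grammar. Two hypothetical languages over the alphabet {'a'}:
#   L_finite = {a^n : 1 <= n <= 5}   (a small finite language)
#   L_star   = {a^n : n >= 1}        (all non-empty strings of a's)
# Every positive example from L_finite is also a member of L_star, so no
# finite sample of positive strings can ever rule L_star out.

def in_L_finite(s: str) -> bool:
    return set(s) <= {"a"} and 1 <= len(s) <= 5

def in_L_star(s: str) -> bool:
    return set(s) <= {"a"} and len(s) >= 1

def consistent(membership, positive_examples) -> bool:
    """A hypothesis is consistent if it accepts every positive example seen."""
    return all(membership(s) for s in positive_examples)

# Positive-only data actually generated by the *smaller* language L_finite:
sample = ["a", "aa", "aaa", "aaaa", "aaaaa"]

print(consistent(in_L_finite, sample))  # True
print(consistent(in_L_star, sample))    # True -- both hypotheses survive,
# and they always will, no matter how many positive examples are added.
```

Gold's theorem turns this simple observation into a precise impossibility result: any class of languages containing all finite languages plus at least one infinite language, which includes the regular and context-free languages, cannot be identified in the limit from positive data alone.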
Einstein was a big believer in the power of imagination, and specifically in the idea of the Gedankenexperiment, where you construct a hypothetical scenario, a "thought experiment." So, to understand the limitations of machine learning models like deep learning, let us do a Gedankenexperiment of our own. When I was growing up and learning my first language, like many other children I lived in a house with pets (a dog and a cat). Dogs and cats have highly sophisticated brains, with hundreds of millions to billions of neurons. The visual system of the cat has often been used to test theories of human vision, since its visual cortex is in many ways very similar to ours.
So, if we think of the brains of cats and dogs, we can say they are "deep learning" architectures that are hugely sophisticated in their complexity. Yet, and here comes the power of the Gedankenexperiment, my dog and my cat heard pretty much every utterance of Tamil, the South Indian language spoken in my house while I was growing up, and even though within 2 years or so I became fluent in Tamil, neither the dog nor the cat ever showed any ability to speak it. Remember, it is not that they lacked the data; clearly they did not. It is not that they lacked a massively parallel deep learning architecture, one that in other areas like perception is as good as or better than ours. They were simply lacking the ability to turn verbal utterances into language. So, for all the enthusiasm and energy that LSTM and GRU deep learning researchers bring to the table, and I for one applaud their efforts at building useful artifacts like GT, it is worth noting that the solution to language learning is not going to simply emerge out of some deep learning network. It is a harder problem than that, and it is going to require a much more thorough understanding of language of the kind that linguists have been developing for 50 years.

Now for the second limitation, which has to do with an inherent limitation of the two foundations of ML today: probability and statistics. Both of these mathematical areas are incredibly powerful and useful, not just in ML but in many other areas of science and engineering. It is hard to argue with the statement that Fisher's work on randomized experiments and maximum likelihood estimation was one of the pinnacles of 20th-century research, one that made many other things possible (for example, the reliable engineering of technological artifacts and rigorous drug testing).
As Neyman, Pearson, Rubin, and most recently Pearl have shown, however, statistical reasoning is inherently limited. Probability theory alone cannot reveal the causal structure of the world. It cannot be used to learn that lightning causes thunder rather than the other way around, or that diseases cause symptoms rather than the reverse. Such an elementary bit of reasoning cannot be achieved by probability or statistics, or by their derivative field, statistical ML. Once again, this is an inherent limitation, one that cannot be overcome by more data, more machines, or more money being thrown at the problem.
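To see why a joint distribution by itself cannot reveal the direction of causation, here is a small numerical sketch (the probabilities are made up purely for illustration): a model in which lightning causes thunder and a model in which thunder "causes" lightning, with the reverse model's parameters obtained via Bayes' rule, assign exactly the same probability to every observable event.

```python
# A minimal illustration (hypothetical numbers) that a joint distribution over
# two binary variables cannot reveal which one causes the other.
# Model A: lightning -> thunder.  Model B: thunder -> lightning, with
# parameters chosen via Bayes' rule to reproduce the identical joint.

from itertools import product

# Model A: P(L), P(T | L)
pL = {1: 0.10, 0: 0.90}
pT_given_L = {1: {1: 0.90, 0: 0.10}, 0: {1: 0.01, 0: 0.99}}  # pT_given_L[l][t]

joint_A = {(l, t): pL[l] * pT_given_L[l][t] for l, t in product([0, 1], repeat=2)}

# Model B: P(T), P(L | T), derived from the same joint via Bayes' rule
pT = {t: sum(joint_A[(l, t)] for l in (0, 1)) for t in (0, 1)}
pL_given_T = {t: {l: joint_A[(l, t)] / pT[t] for l in (0, 1)} for t in (0, 1)}

joint_B = {(l, t): pT[t] * pL_given_T[t][l] for l, t in product([0, 1], repeat=2)}

# The two causally opposite models assign identical probability to every event,
# so observational data alone cannot tell them apart.
print(all(abs(joint_A[k] - joint_B[k]) < 1e-12 for k in joint_A))  # True
```

Since both factorizations fit any amount of observational data equally well, something beyond the distribution itself, such as an experiment, an intervention, or background knowledge, is needed to pick out the true causal direction.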
So, at the end of the day, one has to come to the realization that data science, despite all its promise and potential power, is not the end of the story. It will not be the miracle solution to the problem of AI. To solve the problem of language learning and the problem of causal discovery from observations, one has to develop additional tools. Pearl and Rubin, for example, have developed just such extensions of probability theory: the potential outcomes framework and the do-calculus. Pearl's latest book, The Book of Why, is highly recommended. It lays out a three-level cognitive architecture, with statistical modeling from observation at the lowest level, causal reasoning about interventions in the middle, and imaginative reasoning with counterfactuals at the top. This is one of the most interesting recent ideas on how to extend data science to what I call "imagination science," a field that does not yet exist, but one that I believe will become more popular over the coming decades as the limitations of data science become more obvious.

That is not to say that data science is not useful. It is in fact tremendously useful, and one can use it to model many phenomena, from social networks (we all know where that story leads) to medical diseases and social problems like gun violence in schools. However, and this is crucial to understand, data science does not tell you how to solve these problems! Yes, gun violence in schools is an abhorrent stain on the otherwise marvelous educational environment in the US, and one can use data science and deep learning to construct elaborate models that summarize the incidents of gun violence. But that is not the real issue, is it? The real issue is intervention: how do we reduce or eliminate gun violence? As Pearl argues, understanding interventions is not statistics. Probability distributions, by their very nature, do not contain within themselves a recipe that tells you how they change when you intervene in the world. We all know the interventions that are being proposed to reduce gun violence: ban the sale of assault rifles, require better background checks on prospective gun buyers, equip teachers with guns (the US President appears to favor this intervention), and even repeal the 2nd Amendment, as one former US Supreme Court Justice has advocated. All of these are "interventions": each would change the distribution of gun violence in some way. Which one is the most effective? That is the real question, and sadly, data science alone will not answer it, since answering it requires causal models (level two of Pearl's architecture).
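Here is a small simulation, again with made-up numbers, of the gap between "seeing" and "doing" that Pearl formalizes with the do-operator: a confounder Z drives both a candidate intervention X and the outcome Y, so the observational conditional P(Y=1 | X=1) looks impressive even though forcing X, written do(X=1), changes nothing about Y.

```python
# A small simulation (hypothetical numbers) of the observational/interventional
# gap. A confounder Z influences both the "intervention" X and the outcome Y,
# so P(Y=1 | X=1) estimated from observational data differs from P(Y=1 | do(X=1)).

import random
random.seed(0)

def draw(p):
    return 1 if random.random() < p else 0

def observational_world(n=200_000):
    data = []
    for _ in range(n):
        z = draw(0.5)                # confounder
        x = draw(0.8 if z else 0.2)  # X depends on Z
        y = draw(0.7 if z else 0.3)  # Y depends on Z only (X has no effect)
        data.append((z, x, y))
    return data

def interventional_world(x_forced, n=200_000):
    data = []
    for _ in range(n):
        z = draw(0.5)
        x = x_forced                 # do(X = x_forced): cut the Z -> X arrow
        y = draw(0.7 if z else 0.3)
        data.append((z, x, y))
    return data

obs = observational_world()
p_y_given_x1 = sum(y for _, x, y in obs if x == 1) / sum(1 for _, x, _ in obs if x == 1)

do_x1 = interventional_world(1)
p_y_do_x1 = sum(y for _, _, y in do_x1) / len(do_x1)

print(round(p_y_given_x1, 3))  # ~0.62: X "predicts" Y through the confounder
print(round(p_y_do_x1, 3))     # ~0.50: forcing X does nothing to Y
```

The observational number is not wrong as a prediction; it simply answers a different question from the interventional one, which is exactly the distinction between Pearl's first and second levels.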
Understanding interventions is at the heart of not just reducing gun violence, but of many other problems facing society today. Take climate change. We can collect massive amounts of data on global warming and use deep learning to construct sophisticated models of CO2 emissions and the like. But, again, the hard question is: what intervention is needed? Should we phase out gasoline-powered cars and trucks entirely, and if so, at what rate? How much time would that buy us? There are scary-looking predictions of what the map of the US will look like in 10,000 years (an imagination problem, of course!); one such study was recently covered in the New York Times.
So, the consequences of global warming are indeed quite alarming and ultimately threaten our very survival as a species. The question is what to do about it. What interventions make the most sense, and how should they be implemented? Note that this is not data science! When you intervene (say, a city like Beijing or London imposes new traffic regulations, allowing only even-numbered license plates inside the city one day and odd-numbered plates the next), you change the underlying data distribution from what it was, and so all your previous data is useless!

So, causal models are absolutely needed to understand a vast array of social challenges that are going to become ever more pressing in the 21st century. If AI is going to contribute to the betterment of society, its effectiveness will depend on the extent to which researchers in the field understand the inherent limitations of the currently dominant paradigm, statistical ML, and why we as a field, and we as a society, need to move on to more powerful paradigms. Our very existence as a species may depend on developing the next AI paradigm, one more powerful than data science.