Lei Feng’s Network note: the author, Dr. Tomasz Malisiewicz (CMU), introduces three major paradigms of AI: logic, probability, and deep learning.

Today, let us review the three paradigms that have shaped the field of artificial intelligence (AI) over the past 50 years: logic, probability, and deep learning. The empirical, “data-driven” approach, along with big data and deep learning, is popular today, but that was not always the case. Much of early AI was based on logic, and the transition from logic to data-driven methods was deeply influenced by probabilistic thinking. We will trace that process below.

We will proceed in chronological order, first looking back at logic and probabilistic graphical models, and then making some predictions about where artificial intelligence and machine learning are headed.

Figure 1: Image from Coursera’s Probabilistic Graphical Models course

First, logic and algorithms (common-sense “thinking” machines)

Much early AI work focused on logic, automated theorem proving, and the manipulation of symbols. It is no surprise that John McCarthy’s seminal 1959 paper was titled “Programs with Common Sense.”

If we open one of the most popular AI textbooks, Artificial Intelligence: A Modern Approach (AIMA), we immediately notice that the book begins with search, constraint satisfaction problems, first-order logic, and planning. The third edition’s cover (see the figure below) looks like a big chessboard (because being good at chess was once considered a sign of human intelligence) and features pictures of Alan Turing (the father of computing theory) and Aristotle (one of the greatest classical philosophers and a symbol of wisdom).

Figure 2: Cover of AIMA, the standard textbook for undergraduate CS courses on AI

However, logic-based AI swept the perception problem under the rug, and I have long argued that understanding how perception works is the key to solving the riddle of intelligence. Perception is one of those things that is easy for humans and immensely hard for machines. (Further reading: my 2011 post, “Computer Vision is Artificial Intelligence.”) Logic is pure, and traditional chess-playing programs are purely algorithmic, but the real world is ugly, dirty, and filled with uncertainty.

I think most modern AI researchers believe that logic-based AI is dead. A world in which everything can be perfectly observed and no measurement error exists is not the world of robotics and real-world data. We live in the era of machine learning, in which numerical techniques have defeated first-order logic. As of 2015, I pity the fool who prefers modus ponens to gradient descent.
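For readers unfamiliar with the two techniques contrasted above, here is a minimal sketch of each. The fact and rule names, the quadratic objective, and the step size are illustrative assumptions and do not come from the article.

```python
def forward_chain(facts, rules):
    """Repeatedly apply modus ponens: from P and (P -> Q), conclude Q."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            if premise in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def gradient_descent(gradient, start, step_size=0.1, steps=200):
    """Minimize a function numerically by stepping against its gradient."""
    x = start
    for _ in range(steps):
        x -= step_size * gradient(x)
    return x


# Symbolic inference: exact, but it assumes the facts are known with certainty.
derived = forward_chain(
    {"socrates_is_human"},
    [("socrates_is_human", "socrates_is_mortal")],
)

# Numerical optimization: minimize (x - 3)^2, whose gradient is 2 * (x - 3).
minimum = gradient_descent(lambda x: 2 * (x - 3.0), start=0.0)
```

The first function derives new facts from certain premises, while the second only approaches an answer numerically, which is the trade-off the paragraph above describes.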

Logic is well suited to the classroom, and I suspect that once enough perception problems become “essentially solved,” we will see a resurgence of logic. There will still be plenty of open perception problems in the future, but there will also be many scenarios in which the community no longer has to worry about perception and can start revisiting these classical ideas. Maybe in 2020.

Second, probability, statistics, and graphical models (“measuring” machines)

Probabilistic methods in artificial intelligence arose from the need to deal with uncertainty. The middle part of Artificial Intelligence: A Modern Approach, “Uncertain Knowledge and Reasoning,” is a great introduction to these methods. If you are picking up AIMA for the first time, I suggest you start with this section. And if you are a student new to AI, do not skimp on the mathematics.

Figure 3: Introduction to PDFs, from Penn State University’s Probability Theory and Mathematical Statistics course

When most people think of probabilistic methods, they think of counting. In layman’s terms, it is fair to think of probabilistic methods as fancy counting. Let us briefly review what used to be the two competing schools of statistical thinking.

Frequentist methods are empirical: they are data-driven and make inferences purely from data. Bayesian methods are more sophisticated, combining data-driven likelihoods with priors. These priors often come from first principles or “intuition,” and the Bayesian approach is very good at combining data with heuristics to make smarter algorithms, a nice blend of the rationalist and empiricist world views.
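As a rough illustration of the difference, consider estimating the bias of a coin. The sketch below assumes a Beta prior and made-up flip counts, chosen only for illustration. The frequentist estimate is the observed frequency, while the Bayesian posterior mean blends the data with the prior.

```python
def mle_estimate(heads, flips):
    """Frequentist maximum-likelihood estimate: the observed frequency."""
    return heads / flips


def posterior_mean(heads, flips, prior_heads=1.0, prior_tails=1.0):
    """Bayesian posterior mean under a Beta(prior_heads, prior_tails) prior."""
    return (heads + prior_heads) / (flips + prior_heads + prior_tails)


# With only three flips, all heads, the two estimates disagree sharply.
small_freq = mle_estimate(3, 3)
small_bayes = posterior_mean(3, 3, prior_heads=2, prior_tails=2)

# With a thousand flips the prior is swamped and the estimates nearly agree.
large_freq = mle_estimate(700, 1000)
large_bayes = posterior_mean(700, 1000, prior_heads=2, prior_tails=2)
```

With little data, the prior keeps the Bayesian estimate from concluding that the coin always lands heads. With plenty of data, the two estimates converge.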

More exciting than the frequentist versus Bayesian debate is something called probabilistic graphical models. This class of techniques comes from computer science, and although machine learning is now an important part of both CS and statistics degrees, the true power of statistics is released only when it is combined with computation.

Probabilistic graphical models combine graph theory with probabilistic methods, and in the mid-2000s they were all the rage among machine learning researchers. When I was in graduate school (2005-2011), variational methods, Gibbs sampling, and belief propagation were drilled deep into the brains of CMU graduate students, and they gave us an excellent mental framework for thinking about machine learning problems. Most of what I know about graphical models I learned from Carlos Guestrin and Jonathan Huang. Carlos Guestrin is now the CEO of GraphLab, Inc. (since renamed Dato), a company that builds large-scale products for machine learning on graphs. Jonathan Huang is now a senior research scientist at Google.
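To make the idea concrete, here is a minimal sketch of a three-node directed graphical model (rain, sprinkler, wet grass). The joint distribution factorizes along the graph's edges. The probabilities are illustrative assumptions, and the query is answered by brute-force enumeration rather than by belief propagation or Gibbs sampling, which matter only on graphs too large to enumerate.

```python
from itertools import product

# Conditional probability tables (illustrative numbers, not from the article).
P_RAIN = 0.2
P_SPRINKLER = {True: 0.01, False: 0.4}   # P(sprinkler on | rain)
P_WET = {                                # P(grass wet | sprinkler, rain)
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def bernoulli(p, value):
    """Probability that a binary variable with P(True) = p takes `value`."""
    return p if value else 1.0 - p


def joint(rain, sprinkler, wet):
    """Joint probability as the product of one factor per node in the graph."""
    return (
        bernoulli(P_RAIN, rain)
        * bernoulli(P_SPRINKLER[rain], sprinkler)
        * bernoulli(P_WET[(sprinkler, rain)], wet)
    )


def prob_rain_given_wet():
    """P(rain | grass is wet), summing the joint over the hidden sprinkler."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(
        joint(r, s, True) for r, s in product((True, False), repeat=2)
    )
    return numerator / evidence


posterior = prob_rain_given_wet()
```

Under these assumed tables, observing wet grass raises the probability of rain from the prior of 0.2 to roughly 0.36.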

The video below is a high-level overview of GraphLab, but it also does a fine job of describing “graphical thinking” and how modern data scientists can use it. Carlos is an excellent lecturer, and his talk is less a product pitch than a way of thinking about the next generation of machine learning systems.

(Figure 4: A Computational Introduction to Probabilistic Graphical Models | Prof. Carlos Guestrin, CEO of Dato)

If you think deep learning can solve all of your machine learning problems, you should really take a good look at the video above. If you are building a recommender system, a health data analytics platform, a new trading algorithm, or a next-generation search engine, graphical models are the perfect starting point.

Third, deep learning and machine learning (data-driven machines)

Machine learning is about learning from examples, and today’s most advanced recognition techniques require a lot of training data, a deep neural network, and patience. Deep learning emphasizes the network architecture of today’s most successful machine learning algorithms. These methods are based on “deep” multi-layer neural networks with many hidden layers. Note: I would like to stress that deep architectures are (as of 2015) no longer new. Just look at the “deep” architecture in this 1998 paper.
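As a minimal sketch of what “multi-layer” means here, the code below runs a forward pass through a small fully connected network with two hidden layers. The weights are hand-picked for illustration, not learned. Training, which is where the data and the patience come in, is omitted, and this is not the convolutional LeNet-5 architecture.

```python
def dense(inputs, weights, biases):
    """One fully connected layer: weights[j] holds the input weights of unit j."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Rectified linear unit, applied element-wise."""
    return [max(0.0, v) for v in values]


def forward(inputs, layers):
    """Apply each (weights, biases) layer in turn, with ReLU after hidden layers."""
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:  # no nonlinearity on the output layer
            activations = relu(activations)
    return activations


# Hypothetical hand-picked parameters: 2 inputs -> 2 hidden -> 2 hidden -> 1 output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0], [-1.0, 1.0]], [0.0, -1.0]),
    ([[2.0, 1.0]], [0.5]),
]
output = forward([1.0, 2.0], layers)
```

Stacking more `(weights, biases)` pairs in `layers` makes the network deeper without changing the forward pass itself.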

Figure 5: LeNet-5, from Yann LeCun’s seminal paper “Gradient-Based Learning Applied to Document Recognition”

When you read a modern guide to the LeNet model, it comes with the following disclaimer:

To run this example on a GPU, you first need a good GPU with at least 1GB of GPU memory. If your monitor is connected to the GPU, you might need more.

When the GPU is connected to the monitor, each GPU function call is limited to a few seconds. This limit is necessary because the GPU cannot serve the monitor while it is computing. Without it, the display would freeze for too long and the computer would appear to have crashed. On a medium-quality GPU, this example will run into that limit. When the GPU is not connected to a monitor, there is no time limit. You can reduce the batch size to fix the time-out problem.

It really makes me wonder how Yann managed to get anything out of his deep model as early as 1998. Not surprisingly, it took the rest of us another ten years to digest it.

Update: Yann pointed out (via a Facebook comment) that the ConvNet work dates back to 1989: “It had about 400K connections and took about 3 weeks to train on the USPS dataset (8,000 training examples) on a SUN4 machine.”

Figure 6: Deep network from Yann’s 1989 work at Bell Labs

Note: at about the same time (around 1998), two crazy guys in California were in a garage trying to cache the entire Internet on their computers (they founded a company whose name begins with G). I do not know how they did it, but I guess that sometimes, to achieve great things, you have to do things that do not scale. The world will eventually catch up.

Conclusion:

I do not see traditional first-order logic making a comeback any time soon. Although there is a lot of hype behind deep learning, distributed systems and “graphical thinking” are likely to have a more far-reaching impact on data science than heavily optimized CNNs. There is no reason deep learning cannot be combined with a GraphLab-style architecture, and major breakthroughs in machine learning over the coming decades are likely to come from the combination of the two.

Lei Feng’s Network note: this article was translated and is published by Lei Feng’s Network with authorization. To reprint it, please contact us for authorization, and keep the source and author intact without deleting any content.