Author|Victor
Editor|Qing Mu
On December 9, 2021, the 6th Global Artificial Intelligence and Robotics Conference, organized by the Guangdong-Hong Kong-Macao Greater Bay Area Artificial Intelligence and Robotics Federation and co-organized by Leifeng.com, officially opened in Shenzhen. More than 140 industry and academic leaders and 30 Fellows gathered to discuss AI across the dimensions of technology, products, industry, humanities, and organization. With rational analysis and perceptual insight as their axes, they set out to ride the crest of the wave of artificial intelligence and digitalization together.
On the second day of the conference, Li Shipeng, director of Sier Laboratory, former executive director of Shenzhen Institute of Artificial Intelligence and Robotics, academician of the International Eurasian Academy of Sciences, and IEEE Fellow, gave a talk at the GAIR conference titled "Thoughts on Frontier Research in Artificial Intelligence and Robotics".
Dr. Li Shipeng is an IEEE Fellow and an academician of the International Eurasian Academy of Sciences. He has successively served as chief scientist and executive director of the Shenzhen Institute of Artificial Intelligence and Robotics, vice president of iFlytek Group and co-director of iFlytek Research Institute, and founding member and deputy director of Microsoft Research Asia. Academician Li is highly influential in the fields of multimedia, IoT, and AI. He holds 203 U.S. patents and has published more than 330 cited papers. He has been listed by Guide2Research as one of the world's top 1,000 computer scientists and has mentored four MIT TR35 Innovation Award winners. He is one of the founders and joint secretary-general of the New Generation Artificial Intelligence Industry Technology Innovation Strategic Alliance.
In his speech, Li Shipeng surveyed the cutting-edge research directions of artificial intelligence and robotics and looked ahead to where they are going. He pointed out that machine learning may in the future break through the data bottleneck of deep learning with the help of methods from cognitive science, shifting the learning paradigm from reliance on big data to reliance on big rules. Human-machine collaboration must also evolve into human-machine harmony: only by making coupling, interaction, enhancement, and complementarity explicit research goals can a seamless connection between humans and machines be achieved.
The following is the full text of the speech, which has been edited by AI Technology Review without changing the original meaning:
Today’s speech is titled “Thoughts on Frontier Research in Artificial Intelligence and Robotics” and is divided into three parts: first, a panorama of artificial intelligence and robotics research; then a focus on research directions, including machine learning, motion intelligence, human-machine collaboration, and group collaboration; and finally a summary.
There are three key elements in artificial intelligence-related research: people, robots/Internet of Things, and AI. Robotics and the Internet of Things are grouped together because both are interfaces between the physical world and the virtual world. Connecting the three elements in pairs gives rise to new subjects: combining robots with AI produces intelligent agents, combining AI with humans produces human-machine coupling and augmented intelligence, and fusing robots with humans produces augmented bodies. As the fields of artificial intelligence and robotics develop, the object of research is no longer limited to a single agent; more and more research addresses collaboration among multiple agents. For example, how can human social groups integrate better? How can we design a swarm of machines that collaborate with great precision?
In general, I think the important basic research directions are machine learning, motion intelligence, human-machine collaboration, and group collaboration.
1
Focus on machine learning
The development of machine learning owes much to deep learning, which has brought many research results to industry and driven the rapid development of the artificial intelligence industry in areas such as speech recognition, face recognition, object recognition, and autonomous driving.
Although the results are fruitful, deep learning's strength is also its weakness: it relies on big data, and its bottleneck lies in big data as well. For example, although domestic intelligent voice technology leads the industry, it still depends on accumulated technology and accumulated data. For deep learning to exert its full power, it needs the support of large amounts of data, and extending it from one field to another likewise requires data.
How can we break through? Researchers have explored multiple paths. One is to extend the deep learning framework, for example by optimizing deep learning algorithms or by combining deep learning with knowledge graphs or expert systems. Another path is causal reasoning, which draws on the human ability to infer many things from one instance; its goal is to go beyond correlations in the data to explore causality, and thereby obtain logical reasoning over the data. The third path is brain-like computing, which explores the cognitive elements and mechanisms of the human brain from a biological perspective and reproduces the brain through simulation.
Personally, I think cognitive science is the starting point for breaking through the deep learning framework. The reason is that the human cognitive process offers two things worth learning from: knowing by birth and knowing by learning.
Knowing by birth means that some cognitive abilities are innate. Newborns already have many innate neural connections in their brains. The lesson for us is that most current deep learning algorithms are trained from scratch, without making full or efficient use of prior knowledge or existing models. How to exploit existing knowledge is the next hot direction for deep learning.
Knowing by learning means that most cognitive abilities are acquired, especially through early learning, which builds further neural connections in the brain. Many of a child's abilities, including perception, coping, language, reading, writing, and comprehension, and even the ways of thinking and the ability to analyze and solve problems, are largely settled at a very young age; what follows is mostly the accumulation of knowledge. This suggests that the brain's neurons are wired into a meta-model very early, and that the rest is a matter of applying this meta-model to problems in specific fields. This bears a striking similarity to current large-scale pre-trained models.
Another aspect of knowing by learning is that human learning relies on multi-source, multi-sensory, multi-modal, and multi-angle data, combining information such as vision, hearing, smell, touch, and context, whereas today's deep learning mostly relies on a single piece of speech or a single photo. The input to future AI models may therefore be not a single kind of data but a fusion of multiple signal sources. How to imitate this human learning process is another lesson cognitive science offers deep learning.
Furthermore, human learning proceeds from examples to the induction of principles rather than staying at the level of samples, whereas deep learning currently stays at the sample level. So, will it be possible to construct a human-like machine learning framework in the future, so that no matter what kind of data is input, as long as the logic is consistent, it will converge to a consistent model?
To break through the data bottleneck of deep learning, we can try to build a crowdsourcing system for rules and let humans teach the machine how to learn. The purpose is not to input data, but to let the machine learn the rules. Since we are trying to learn rules from daily activities, such rules can be labeled and taught by ordinary people, which removes the old limitation that expert systems require experts. This shift from big data to big-rule model construction is clearly more in line with human cognition.
2
Focus on motion intelligence
As we all know, in the field of robotics, Boston Dynamics' products are the most human-like. As the animation shows, the robot dances without any visible stiffness. However, due to limitations of computing resources, energy, and motion control, it can only run for tens of minutes. In fact, Boston Dynamics robots are motor-driven, an approach with many shortcomings, such as rigid motion, heavy self-weight, a trade-off between reaction speed and flexibility, and high energy consumption.
By comparison, in humans and other animals the combination of muscles, bones, sensors, and nerves achieves flexible operation with low energy consumption. The lesson for researchers is that a robot's motion system should meet the same requirements as a human's: it should be efficient, flexible, precise, robust, both rigid and compliant, lightweight, and adaptive. Current motion intelligence may perform well along one of these dimensions, but it still falls short when all of them are considered together.
Therefore, an important research direction for motion intelligence is bionics: modeling it on the motion intelligence of animals. For example, motion control can adopt proximal feedback, so that the movement can be adjusted flexibly at any time as conditions change.
If the robots above are driven by internal forces, medical micro-nano robots represent the externally driven research direction. For example, guided by magnetic force, tiny robots can precisely transport drugs from one tube to another.
3
Focus on the direction of human-machine collaboration
Human-machine harmony is different from mere collaboration. Harmony implies coupling, interaction, enhancement, complementarity, cooperation, and concord. Its goal is for the machine to understand human intent without having to be told, thereby achieving a seamless connection between human and machine.
In pursuing human-machine harmony, the focus is on natural human-machine interaction, perception, and enhancement. Specifically, this may include biometric detection and recognition, human-computer interfaces, brain-computer interfaces, speech recognition, action recognition, expression recognition, language understanding, intention understanding, body posture perception, seamless enhancement, and extensions into extended reality and remote reality.
In terms of human-machine augmented intelligence, today’s machine learning frameworks are mostly deep learning frameworks based on big data, and there will definitely be situations that machine intelligence cannot handle. This is fatal for certain high-risk areas, such as autonomous driving and finance.
The current solution to this problem is human takeover. This will involve three core questions:
Core question 1: How does machine intelligence sense that it cannot handle some situations and actively ask people to take over?
Core question 2: When can humans completely let go and let machines complete tasks autonomously?
Core question 3: What kind of human-computer interaction design can give full play to the respective strengths of humans and machines without unnecessary trouble for each other?
If these three core questions cannot be answered, dilemmas follow. Take autonomous driving as an example. At present, safety officers cannot simply switch on the automatic function and be done with it. They still need to monitor road conditions and the route at all times and cannot be distracted for a moment. This actually increases the burden on safety officers: when driving themselves, humans can anticipate their driving environment to some degree, but they cannot anticipate what a machine driver will do.
Human-machine augmentation of the body is another area of human-machine collaboration. It can help humans enhance their physical capabilities and accomplish things beyond their own physical strength. But such machines can be too complex, and people need training to operate them. The future goal of human-machine augmentation is harmonious coexistence between humans and machines, with the machine controlled as naturally as one's own organs. The core research topics involved include machine perception of human intentions and posture and understanding of natural language commands and body language, so that machines can help humans solve problems smoothly, in just the right way and in the manner humans find easiest to accept.
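The first core question above (how a machine recognizes that it cannot handle a situation and asks a person to take over) can be illustrated with a very small sketch. This is not a method from the speech: it assumes the machine reports a self-estimated confidence score, and the names `Decision`, `choose_controller`, and the `threshold` value are hypothetical. Producing a confidence score that can actually be trusted is the hard, open part of the problem.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """A proposed action plus the machine's self-estimated confidence (hypothetical)."""
    action: str
    confidence: float  # assumed to lie in [0, 1]


def choose_controller(decision: Decision, threshold: float = 0.8) -> str:
    """Return "machine" if confidence meets the threshold, otherwise "human"."""
    if not 0.0 <= decision.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # Below the threshold, the machine actively requests a human takeover.
    return "machine" if decision.confidence >= threshold else "human"


# Illustrative values only.
print(choose_controller(Decision("keep_lane", 0.95)))
print(choose_controller(Decision("unprotected_turn", 0.40)))
```

Raising `threshold` hands control to the human more often, which is safer but brings back the monitoring burden described above; lowering it does the opposite.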
4
Focus on group collaboration
Currently, a single agent can complete many tasks, but how can the collective power of many agents be unleashed? This is the research direction of group collaboration. In warehousing scenarios, there are many picking and sorting robots; if they can be scheduled effectively, work efficiency will improve greatly.
The current mainstream scheduling method is centralized control, but with tens of thousands of agents, decentralized control is needed so that agents can act autonomously: each collaborates with the others while also carrying out its own work. That is, individual agents that can act independently achieve more efficient group or system intelligence and behavior through collaboration.
The principles currently involved in agent group collaboration include group behavior models, incentive mechanisms, and collaborative decision-making in swarm intelligence. In this regard, ants are a model for us to learn from. In addition, as more and more autonomous driving robots appear, how to achieve collaborative sensing and collaborative control among them is also a hot topic today.
The above four directions are all basic research. A breakthrough in any one of them will be revolutionary for its own field and for downstream applications, and will also bring original technological innovation in industrial digitalization and intelligence, helping us gain an advantageous position in the competition!
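The contrast between centralized and decentralized control can be made concrete with a toy example. The sketch below uses neighbor-only averaging, a standard consensus scheme chosen here purely for illustration and not named in the speech: each agent sees only its immediate neighbors, yet the group converges on a shared value without any central scheduler. The function names, the ring topology, and the starting values are all assumptions.

```python
def consensus_step(values, neighbors):
    """One synchronous round: each agent averages its value with its neighbors' values."""
    new_values = []
    for i, own in enumerate(values):
        local = [own] + [values[j] for j in neighbors[i]]
        new_values.append(sum(local) / len(local))
    return new_values


def run_consensus(values, neighbors, rounds=50):
    """Repeat the local averaging step for a fixed number of rounds."""
    for _ in range(rounds):
        values = consensus_step(values, neighbors)
    return values


# Four agents on a ring; each one can talk only to its two neighbors.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
start = [0.0, 4.0, 8.0, 12.0]  # e.g. each agent's local estimate of some quantity
final = run_consensus(start, ring)
print(final)
```

Because every agent on this ring has the same number of neighbors, the averaging preserves the group mean, so all agents settle near the average of the starting values. Real multi-agent scheduling also has to handle communication delays, failures, and conflicting goals, which this sketch ignores.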
Leifeng.com