An autonomous vehicle can navigate city streets and other, less busy environments by recognizing pedestrians, other vehicles and potential obstacles through its AI. This is achieved with the help of artificial neural networks, which are trained to “see” the car’s surroundings, mimicking the human visual perception system.
But unlike humans, cars using artificial neural networks have no memory of the past and are in a constant state of seeing the world for the first time – no matter how many times they have driven on a particular road before. This is especially problematic in adverse weather conditions, when the car cannot safely rely on its sensors.
Researchers from the Cornell Ann S. Bowers College of Computing and Information Science and the College of Engineering have produced three concurrent research papers to overcome this limitation, giving the car the ability to create “memories” of previous experiences and use them in future navigation.
Doctoral student Yurong You is lead author of “Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception,” which You presented virtually in April at ICLR 2022, the International Conference on Learning Representations. “Representation learning” includes deep learning, a type of machine learning.
“The fundamental question is, can we learn from repeated tours?” said senior author Kilian Weinberger, professor of computer science at Cornell Bowers CIS. “For example, a car may mistake a strangely shaped tree for a pedestrian the first time its laser scanner detects it from a distance, but once it is close enough, the object’s category becomes clear. So the second time you pass the same tree, even in fog or snow, you would hope the car has learned to recognize it correctly.”
“In fact, you rarely drive a route for the very first time,” said co-author Katie Luo, a doctoral student in the research group. “Either you or someone else has driven it recently, so it seems only natural to collect that experience and use it.”
Led by doctoral student Carlos Diaz-Ruiz, the group compiled a dataset by driving a car equipped with LiDAR (Light Detection and Ranging) sensors repeatedly along a 15-kilometer loop in and around Ithaca – 40 times over 18 months. The tours capture varying environments (highway, urban, campus), weather conditions (sunny, rainy, snowy) and times of day.
This resulting dataset – which the group calls Ithaca365 and which is the subject of one of the other two articles – has more than 600,000 scenes.
“It deliberately highlights one of the key challenges in self-driving cars: bad weather,” said Diaz-Ruiz, co-author of the Ithaca365 paper. “If the street is covered in snow, humans can rely on memories, but without memories a neural network is heavily disadvantaged.”
HINDSIGHT is an approach that uses neural networks to compute descriptors of objects as the car passes them. It then compresses these descriptions, which the group calls SQuaSH (Spatial-Quantized Sparse History) features, and stores them on a virtual map, similar to a “memory” stored in a human brain.
The next time the self-driving car traverses the same location, it can query the local SQuaSH database of every LiDAR point along the route and “remember” what it learned last time. The database is continuously updated and shared across vehicles, enriching the information available for recognition.
“This information can be added as features to any LiDAR-based 3D object detector,” You said. “Both the detector and the SQuaSH representation can be trained together without additional supervision or human annotation, which is labor- and time-intensive.”
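The idea of storing compressed descriptors on a spatially quantized map and recalling them on later traversals can be illustrated with a toy sketch. Everything here – the class, the voxel size, the descriptor dimensionality – is an illustrative assumption, not the authors’ implementation:

```python
import numpy as np

VOXEL = 0.5   # assumed quantization cell size in meters
FDIM = 8      # assumed descriptor dimensionality

def quantize(points, voxel=VOXEL):
    """Map 3D points to integer voxel keys."""
    return [tuple(k) for k in np.floor(points / voxel).astype(int)]

class FeatureHistory:
    """Toy SQuaSH-style store: one running-mean descriptor per voxel."""
    def __init__(self):
        self.store = {}  # voxel key -> (mean feature, observation count)

    def update(self, points, feats):
        """Record descriptors observed on the current traversal."""
        for key, f in zip(quantize(points), feats):
            mean, n = self.store.get(key, (np.zeros(FDIM), 0))
            self.store[key] = ((mean * n + f) / (n + 1), n + 1)

    def query(self, points):
        """Recall the remembered descriptor per point (zeros if unseen)."""
        return np.stack([self.store.get(k, (np.zeros(FDIM), 0))[0]
                         for k in quantize(points)])

# First traversal: record descriptors along the route.
hist = FeatureHistory()
pts = np.array([[1.2, 3.4, 0.1], [10.0, -2.0, 0.3]])
hist.update(pts, np.ones((2, FDIM)))

# Later traversal: a nearby point falls in the same voxel and recovers
# the stored "memory", which could be concatenated onto the input
# features of a 3D detector.
recalled = hist.query(np.array([[1.3, 3.45, 0.2]]))
```

A real system would store learned sparse features and train the store jointly with the detector; the sketch only shows the quantize-store-recall loop.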
While HINDSIGHT still assumes that the artificial neural network has already been trained to detect objects, and augments it with the ability to create memories, MODEST (Mobile Object Detection with Ephemerality and Self-Training) – the subject of the third publication – goes even further.
Here the authors let the car learn the entire perception pipeline from scratch. Initially, the vehicle’s artificial neural network has never been exposed to any objects or streets at all. Through multiple tours of the same route, it can learn which parts of the environment are stationary and which are moving objects. Slowly, it teaches itself which objects are other traffic participants and which are safe to ignore.
The algorithm can then detect these objects reliably – even on roads that were not part of the original repeated tours.
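The core cue described above – separating stationary structure from mobile objects by comparing repeated tours – can be sketched with a hypothetical simplification (the thresholds and occupancy grid are assumptions, not the paper’s method): map locations occupied on most traversals are treated as static background, while locations occupied only occasionally become candidate mobile objects for self-training.

```python
import numpy as np

def ephemerality(occupancy):
    """occupancy: (n_traversals, n_cells) boolean grid of LiDAR hits.
    Returns the fraction of traversals in which each cell was occupied."""
    return occupancy.mean(axis=0)

# Five traversals over six map cells: cells 0-2 hold a wall (hit every
# time), cells 3-5 were occupied only once, e.g. by a car that later left.
occ = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
], dtype=bool)

persistence = ephemerality(occ)
is_static = persistence > 0.5     # background: occupied on most tours
is_mobile_candidate = ~is_static  # ephemeral: pseudo-labels for self-training
```

In the sketch, the ephemeral cells would seed pseudo-labels that bootstrap a detector without any human annotation.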
The researchers hope that both approaches could drastically reduce the cost of developing autonomous vehicles (which currently still rely heavily on expensive human-annotated data) and make such vehicles more efficient by letting them learn to navigate the places where they are used most.
Both Ithaca365 and MODEST will be presented at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2022), to be held June 19-24 in New Orleans.
Other contributors include Mark Campbell, the John A. Mellowes ’60 Professor of Mechanical Engineering in the Sibley School of Mechanical and Aerospace Engineering; assistant professors Bharath Hariharan and Wen Sun, of computer science at Bowers CIS; former postdoctoral researcher Wei-Lun Chao, now an assistant professor of computer science and engineering at The Ohio State University; and doctoral students Cheng Perng Phoo, Xiangyu Chen and Junan Chen.
The research for all three papers was supported by grants from the National Science Foundation, the Office of Naval Research and the Semiconductor Research Corporation.