Three-Dimensional Mapping from Two-Dimensional Images from a Single Point-of-Focus for Obstacle Avoidance

Obstacle avoidance has long been one of the biggest challenges in computer vision, and it holds tremendous value for applications in the field - from self-driving cars to wearable tech for the visually impaired and autonomous drones for emergency response. There are some algorithms available for this task today, but they take input from multiple cameras at multiple points-of-focus, sharply increasing the data that needs to be processed and requiring costly, high-powered computers to run, which makes them impractical for wearable technology and lightweight autonomous drones. This topic has seized my interest for the past couple of years, and I've spent much of that time researching it.

Why Cameras? Can't Simple Ultrasonic Sensors Be Used?

When I first started reading about this problem, this was the first thought that came to my mind. The problem initially seemed simple. Consider the case of an autonomous car: we place one ultrasonic sensor on each side of the car, continuously measure the car's distance from obstacles on each side, and feed those values into an obstacle-avoidance algorithm. However, this basic methodology has multiple flaws. Before I explain them, let's take a step back and first understand how the ultrasonic sensor works.

How the Ultrasonic Sensor Works and Its Flaws

The technology inside the ultrasonic sensor works much like the echolocation bats use to navigate at night. The sensor emits a sound wave, typically between 25 kHz and 500 kHz, and times how long it takes for the wave to hit an object and bounce back to the sensor's ping receiver. Since the wave covers the round trip, the distance to the object is d = (v * t) / 2, where d is the distance between the object and the sensor, v is the average speed of sound constant, and t is the time elapsed from when the initial ping was released to when it was received back.
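Here is a minimal sketch of that calculation in Python (the example timing value is made up for illustration):

```python
def distance_from_echo(elapsed_s: float, speed_of_sound_m_s: float = 343.0) -> float:
    """Distance to the object given the round-trip echo time.

    The ping travels to the object and back, so the one-way distance
    is half the total path: d = (v * t) / 2.
    """
    return speed_of_sound_m_s * elapsed_s / 2.0


# A ping that returns after 5.8 ms puts the obstacle about 1 m away.
print(distance_from_echo(0.0058))  # ~0.99 m
```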

Figure: How an Ultrasonic Sensor Works. Figure borrowed from Aimagin.

Now, let's discuss some of the fundamental problems with this method:

  1. This method treats the speed of sound as a constant, but the speed of sound changes depending on the medium the wave travels through. To factor in the medium, additional sensors, such as a gas sensor, would have to be added to the car: if the gas sensor reports a high ppm concentration of some gas in the air, a correspondingly different speed-of-sound constant would be used. Even temperature alone shifts the constant noticeably, as the sketch after this list shows.
  2. The sound wave is not only traveling through air; it also has to hit an object and bounce back to the sensor. Depending on the object's material composition, it can absorb the wave, change its frequency, or deflect it at an angle so that it never returns to the sensor. In a practical situation, with an autonomous car driving around a neighborhood, it's hard to control what objects the sound waves hit, so it is difficult to get dependable echoes back at all.
  3. Unlike cameras, ultrasonic sensors don't have a "wide angle": a sensor can only measure the distance to an object directly in front of it. If we had only one sensor on each side of the car, we would only detect obstacles directly in front of those sensors. To be of any use, several sensors would need to span each side of the car, which significantly increases the data to be processed - and because of the two problems above, the system would still not be accurate enough for practical use.
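As a quick illustration of the first flaw, here is a sketch of the standard temperature correction for the speed of sound in dry air, v ≈ 331.3 + 0.606·T m/s with T in degrees Celsius (humidity and gas composition would need further terms this sketch ignores):

```python
def speed_of_sound_air(temp_c: float) -> float:
    """Approximate speed of sound in dry air (m/s) at temp_c degrees C.

    Linear approximation valid near room temperature; humidity and
    gas composition introduce corrections this sketch ignores.
    """
    return 331.3 + 0.606 * temp_c


def distance_with_temp(elapsed_s: float, temp_c: float) -> float:
    """Echo distance using a temperature-compensated speed of sound."""
    return speed_of_sound_air(temp_c) * elapsed_s / 2.0


# The same 5.8 ms echo reads about 4% shorter at 0 C than at 25 C.
print(distance_with_temp(0.0058, 25.0))  # ~1.00 m
print(distance_with_temp(0.0058, 0.0))   # ~0.96 m
```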

Optical Flow Modeling With Multiple Cameras - The Data to Process Multiplies

Currently, there are a few methods that use multiple cameras to create a three-dimensional map of the environment around the car, with the locations of obstacles relative to the car's position. Here is a general overview of the process (a code sketch follows the list):

  1. As the car moves forward, initial images from both cameras are taken. The images are blurred, warped, and grayscaled, and Gaussian pyramids are created to model optical flow.
  2. As the car continues to move forward, vectors are drawn to model the motion between frames.
  3. Trigonometric algorithms model the locations of obstacles based on the camera angles and the relative magnitudes of the vectors.
  4. The resulting distance values are used to build a local map of the car's position relative to the obstacles.
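To make steps 1-3 concrete, here is a minimal sketch in Python with OpenCV (file names and parameters are placeholders, not any particular production pipeline). It grayscales and blurs two consecutive frames, computes dense optical flow with Farneback's pyramidal method, and applies the standard stereo triangulation Z = f·B/d to turn a pixel disparity between the two cameras into depth:

```python
import cv2

# Hypothetical consecutive frames from one of the cameras.
prev = cv2.imread("frame_0.png")
curr = cv2.imread("frame_1.png")

# Step 1: reduce the data to process - grayscale, then Gaussian blur.
prev_gray = cv2.GaussianBlur(cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY), (5, 5), 0)
curr_gray = cv2.GaussianBlur(cv2.cvtColor(curr, cv2.COLOR_BGR2GRAY), (5, 5), 0)

# Step 2: dense optical flow; Farneback's method internally builds a
# Gaussian pyramid (here 3 levels, each half the previous resolution).
flow = cv2.calcOpticalFlowFarneback(
    prev_gray, curr_gray, None,
    pyr_scale=0.5, levels=3, winsize=15,
    iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
magnitude, angle = cv2.cartToPolar(flow[..., 0], flow[..., 1])

# Step 3: with two cameras, depth comes from triangulation:
# Z = f * B / d, where f is the focal length in pixels, B the baseline
# between the cameras in meters, and d the disparity in pixels of the
# same point seen in both views.
def depth_from_disparity(disparity_px: float,
                         focal_px: float = 700.0,
                         baseline_m: float = 0.12) -> float:
    return focal_px * baseline_m / max(disparity_px, 1e-6)

print("mean flow magnitude:", magnitude.mean())
print("depth at 20 px disparity:", depth_from_disparity(20.0), "m")
```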

However, there are some limitations that are worth noting:

  1. Even though blurring, warping, and grayscaling the images reduces the information that needs to be processed, there is still a great deal to process in real time.
  2. Since we are using multiple cameras, the algorithms are more complex and the volume of data to process multiplies with each added camera, making it difficult to run on cheap, lightweight platforms.

Monocular Mapping of a Three-Dimensional Environment Using a Single Camera

Because of the steep increase in computing load with multiple cameras, a new field of study, monocular mapping, which uses a single camera to map a multi-dimensional environment, is being researched. For the past year this field has captured my curiosity, and I've created a base set of novel algorithms that perform this task fairly well. They're not computationally intense and can be processed on low-cost embedded systems. Although monocular mapping is still in its early stages of study, I believe it can have a huge impact on the world at large - obstacle avoidance algorithms could run on tiny embedded systems for applications from autonomous cars to wearable technology for the visually impaired!
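To give a flavor of how a single moving camera can recover depth at all, here is a sketch of the textbook motion-parallax cue (a standard result, not the algorithms described above): for a camera translating straight forward at a known speed, the flow of a static point radiates outward from the focus of expansion, and its magnitude is inversely proportional to depth, so Z = r·v·Δt/|flow|, where r is the point's pixel distance from the focus of expansion:

```python
def depth_from_forward_motion(r_px: float, flow_px: float,
                              speed_m_s: float, dt_s: float) -> float:
    """Depth of a static point seen from a camera moving straight ahead.

    For pure forward motion, optical flow radiates from the focus of
    expansion (FOE). A point r_px pixels from the FOE that streams
    flow_px pixels outward in dt_s seconds lies at depth
    Z = r * v * dt / |flow|.
    """
    return r_px * speed_m_s * dt_s / max(flow_px, 1e-6)


# A point 200 px from the FOE moving 4 px between frames taken 1/30 s
# apart, while the camera travels at 2 m/s, is about 3.3 m away.
print(depth_from_forward_motion(200.0, 4.0, 2.0, 1.0 / 30.0))  # ~3.33 m
```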

Conclusions

Obstacle avoidance and autonomous navigation continue to be among the biggest challenges in the field of computer vision. If you're interested and would like to share ideas, please feel free to email me. I'm always excited to hear new ideas!