Accurate Stereo Vision

State of the art

Over the last decades stereo vision has been one of the most studied tasks in computer vision, and many approaches have been proposed in the literature. The stereo correspondence problem can be formulated as follows: given a pair of rectified stereo images, find for each point p_r of the reference image its corresponding point p_t in the other image, which, due to the epipolar constraint, lies on the same scanline as p_r and within the disparity range D = [d_min, d_max].
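The formulation above can be illustrated with a minimal sketch that searches one scanline for the best match of each reference pixel within the disparity range. The function name, the absolute-difference cost and the convention that the corresponding point lies d pixels to the left in the other image are assumptions made for illustration only; they are not taken from the cited papers.

```python
def match_scanline(ref_row, tgt_row, d_min, d_max):
    """For each pixel of the reference row, return the disparity in
    [d_min, d_max] with the lowest absolute intensity difference."""
    disparities = []
    for x, ref_value in enumerate(ref_row):
        best_d, best_cost = None, float("inf")
        for d in range(d_min, d_max + 1):
            xt = x - d  # assumed convention: the match lies at (x - d, y)
            if 0 <= xt < len(tgt_row):
                cost = abs(ref_value - tgt_row[xt])
                if cost < best_cost:  # ties keep the smallest disparity
                    best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities


# Toy scanlines: the bright structure (50, 80) is shifted by 2 pixels.
ref_row = [10, 10, 50, 80, 10, 10, 10, 10]
tgt_row = [50, 80, 10, 10, 10, 10, 10, 10]
print(match_scanline(ref_row, tgt_row, 0, 3))
```

The two bright pixels are assigned disparity 2, while the uniform pixels are ambiguous: several candidates have the same cost. This ambiguity of pointwise matching is what cost aggregation and energy minimization try to resolve.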

Today, dense stereo techniques are mainly divided into two categories: local approaches and global approaches. To increase the accuracy of disparity estimates, particularly along depth borders, state-of-the-art local algorithms compute the matching cost over a variable support rather than over a fixed square window, as in the traditional approach.
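As a point of comparison, the traditional fixed-window approach can be sketched as follows. This is a minimal illustration under assumed conventions (grayscale images stored as lists of rows, mean absolute difference as the cost, matches located d pixels to the left); the function names and the synthetic test scene are hypothetical.

```python
def window_cost(ref, tgt, x, y, d, radius):
    """Mean absolute difference over the square window centred on (x, y)."""
    height, width = len(ref), len(ref[0])
    total, count = 0, 0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy, xx = y + dy, x + dx
            xt = xx - d  # assumed convention: match lies d pixels to the left
            if 0 <= yy < height and 0 <= xx < width and 0 <= xt < width:
                total += abs(ref[yy][xx] - tgt[yy][xt])
                count += 1
    return total / count if count else float("inf")


def disparity_at(ref, tgt, x, y, d_min, d_max, radius=1):
    """Winner-takes-all: the disparity with the lowest aggregated cost."""
    return min(range(d_min, d_max + 1),
               key=lambda d: window_cost(ref, tgt, x, y, d, radius))


# Synthetic textured pair in which the reference image is shifted by 2 pixels.
def scene(x, y):
    return (7 * x * x + 13 * y + 3 * x * y) % 97


tgt = [[scene(x, y) for x in range(12)] for y in range(8)]
ref = [[scene(x - 2, y) for x in range(12)] for y in range(8)]
print(disparity_at(ref, tgt, x=5, y=3, d_min=0, d_max=4))
```

Every pixel in the window contributes equally, which works when the whole window lies at one depth. Near a depth border the window mixes pixels from different disparities, which is the weakness that variable-support methods address.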

Conversely, most global methods attempt to minimize an energy function computed over the whole image area by employing a Markov Random Field model. Since this task is an NP-hard problem, approximate but efficient strategies such as Graph Cuts (GC) and Belief Propagation (BP) have been proposed. In particular, using segmentation information and a plane-fitting model within a BP-based framework has proved very effective.

A third category of methods, lying between local and global approaches, comprises techniques that minimize an energy function computed over a subset of the image area, typically along epipolar lines or scanlines. The minimization strategy is usually based on Dynamic Programming (DP) or Scanline Optimization (SO). The energy function to be minimized includes a pointwise matching cost and a smoothness term which, by means of a discontinuity penalty, favors constant disparity, e.g. in untextured regions. These approaches have achieved excellent results both in the accuracy of the disparity maps and in speed, running in near real time.
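The idea of minimizing a matching cost plus a discontinuity penalty along a single scanline can be sketched with a small dynamic program. This is a simplified illustration, assuming a constant penalty for every change of disparity between neighboring pixels; the function name and the toy cost table are hypothetical, and published DP and SO algorithms use more elaborate cost and penalty models.

```python
def scanline_optimize(costs, penalty):
    """costs[x][k] is the matching cost of pixel x at disparity candidate k.
    Returns the candidate indices minimising the sum of matching costs
    plus `penalty` for each change of disparity between neighbours."""
    n, m = len(costs), len(costs[0])
    total = [list(costs[0])]  # total[x][k]: best energy of a path ending in k
    back = []                 # back[x - 1][k]: predecessor of k at pixel x
    for x in range(1, n):
        prev = total[-1]
        best_prev = min(range(m), key=prev.__getitem__)
        row, ptr = [], []
        for k in range(m):
            stay = prev[k]                     # keep the same disparity
            jump = prev[best_prev] + penalty   # change and pay the penalty
            if stay <= jump:
                row.append(costs[x][k] + stay)
                ptr.append(k)
            else:
                row.append(costs[x][k] + jump)
                ptr.append(best_prev)
        total.append(row)
        back.append(ptr)
    # Backtrack from the cheapest final state.
    k = min(range(m), key=total[-1].__getitem__)
    path = [k]
    for ptr in reversed(back):
        k = ptr[k]
        path.append(k)
    path.reverse()
    return path


# Toy cost table: the third pixel weakly prefers candidate 1 (e.g. noise).
costs = [[0, 5], [0, 5], [3, 2], [0, 5], [0, 5]]
print(scanline_optimize(costs, penalty=0))  # pointwise minimum per pixel
print(scanline_optimize(costs, penalty=4))  # smoothness removes the outlier
```

With a zero penalty the result is the pointwise minimum at every pixel, so the noisy pixel keeps its outlier disparity. With a penalty of 4 the isolated change costs more than it saves, and the optimizer returns a constant disparity along the scanline.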

Research topics

We have been investigating novel strategies for stereo matching based on both local and global approaches. In particular, we have proposed a novel local method based on a variable support. Our approach exploits color and segmentation information to robustly aggregate only those pixels that lie on the same disparity plane as the point whose stereo correspondence is being evaluated. This greatly improves accuracy along depth borders and reduces matching ambiguity on both low-textured and highly textured planes [1].
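To make the idea of a variable support concrete, the sketch below weights each pixel of the window before aggregating its matching cost. It is a simplified illustration and not the formulation of [1]: the weighting rule (full weight inside the center pixel's segment, an exponentially decaying weight otherwise), the parameter gamma, the use of grayscale intensities instead of color, and all function names are assumptions.

```python
import math


def support_weight(center_value, center_label, value, label, gamma=10.0):
    """Hypothetical weight: 1 inside the centre pixel's segment, otherwise
    a weight that decays with the intensity difference from the centre."""
    if label == center_label:
        return 1.0
    return math.exp(-abs(value - center_value) / gamma)


def weighted_cost(ref, tgt, labels, x, y, d, radius, gamma=10.0):
    """Weighted mean absolute difference over the window centred on (x, y)."""
    height, width = len(ref), len(ref[0])
    num, den = 0.0, 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            yy, xx = y + dy, x + dx
            xt = xx - d  # assumed convention: match lies d pixels to the left
            if 0 <= yy < height and 0 <= xx < width and 0 <= xt < width:
                w = support_weight(ref[y][x], labels[y][x],
                                   ref[yy][xx], labels[yy][xx], gamma)
                num += w * abs(ref[yy][xx] - tgt[yy][xt])
                den += w
    return num / den if den > 0 else float("inf")


# One-row toy pair: segment 0 (bright) has disparity 1, segment 1 (dark) has 0.
ref = [[100, 110, 120, 20, 30, 40]]
labels = [[0, 0, 0, 1, 1, 1]]
tgt = [[110, 120, 15, 20, 30, 40]]
for d in (0, 1):
    print(d, weighted_cost(ref, tgt, labels, x=2, y=0, d=d, radius=1))
```

At the border pixel x = 2 the window contains one pixel from the other segment. Its weight is close to zero, so the cost at disparity 1 is almost entirely determined by the pixels that share the center pixel's segment.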

Moreover, our idea is that a local cost based on a variable support aggregation stage can greatly improve global stereo methods. Hence, we have embedded the local matching cost computation outlined above in an SO-based approach [2], showing that this improves accuracy over both existing SO-based approaches and the winner-takes-all (WTA) method that uses the same local cost. In addition, the algorithm proposed in [2] includes an effective strategy for localizing depth borders and occlusions, which is used to further improve accuracy in a final disparity refinement stage.

Preliminary results on the Middlebury dataset [3] are shown by comparing the provided ground truth with the disparity maps yielded by our local stereo matching approach and by our global approach. These results were also submitted to the Middlebury Stereo Evaluation page.

References

[1] F. Tombari, S. Mattoccia, L. Di Stefano, "Segmentation-based adaptive support for accurate stereo correspondence", IEEE Pacific-Rim Symposium on Image and Video Technology (PSIVT 2007), 2007.
[2] S. Mattoccia, F. Tombari, L. Di Stefano, "Stereo vision enabling precise border localization within a scanline optimization framework", 8th Asian Conference on Computer Vision (ACCV 2007), 2007.
[3] D. Scharstein, R. Szeliski, "A taxonomy and evaluation of dense two-frame stereo correspondence algorithms", IJCV 47(1/2/3):7-42, 2002.