Daily Research Log 2014-4-15


  • A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms
    Daniel Scharstein (Middlebury College) and Richard Szeliski, IJCV 2002
    • Review, Taxonomy and Evaluation
    • Mainly two-frame stereo, multi frame dataset given later
  • Efficient Large-Scale Stereo Matching, A. Geiger, ACCV2010
    • This paper focuses on the efficiency
    • The crucial point of stereo vision is matching
    • Review part of this work divides the previous work on stereo vision into global matching, semi-global matching and local matching
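To make the "local matching" category concrete, here is a minimal sketch of winner-take-all block matching with a sum-of-absolute-differences cost. It is only an illustration of the category, not the method of either paper; the function name `block_match`, the window radius, and the disparity convention are assumptions.

```python
import numpy as np

def block_match(left, right, max_disp, radius=2):
    """Winner-take-all SAD block matching; returns an integer disparity map.

    Assumed convention: pixel (y, x) in the left image corresponds to
    (y, x - d) in the right image, with 0 <= d <= max_disp.
    """
    left = left.astype(np.float64)
    right = right.astype(np.float64)
    h, w = left.shape
    k = 2 * radius + 1
    costs = np.full((max_disp + 1, h, w), np.inf)
    for d in range(max_disp + 1):
        # Per-pixel absolute difference between left(x) and right(x - d).
        diff = np.abs(left[:, d:] - right[:, :w - d])
        # Aggregate the cost over a k x k window (box filter, edge padding).
        padded = np.pad(diff, radius, mode="edge")
        agg = np.zeros_like(diff)
        for dy in range(k):
            for dx in range(k):
                agg += padded[dy:dy + diff.shape[0], dx:dx + diff.shape[1]]
        costs[d, :, d:] = agg
    # Each pixel independently picks its cheapest disparity.
    return np.argmin(costs, axis=0)

# Synthetic pair with a constant disparity of 3 pixels.
rng = np.random.default_rng(0)
right = rng.random((20, 40))
left = np.roll(right, 3, axis=1)
disp = block_match(left, right, max_disp=6)
```

Each pixel decides on its own, which is what makes the method local; global and semi-global methods add a smoothness term that couples neighbouring pixels.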


  • Edited proposal, added comparison between our problem, SfM, stereo vision

Daily Research Log 2014-4-10


  • Wide Baseline Stereo Matching based on Local, Affinely Invariant Regions, Tuytelaars, BMVC2000
    • Get affine invariant regions, descriptors, and match
    • Method
      • Extract region by local intensity extrema
      • Fit an ellipse using second-order moments, then extend the ellipse
      • Get the descriptor from moment invariants
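The ellipse-fitting step can be sketched as follows: compute the centroid and the central second-order moments of a region, then take the ellipse with the same moments. This is a generic moment-based fit under the assumption that the region is given as a binary mask; the name `ellipse_from_moments` is hypothetical, and the paper's region extraction and ellipse extension steps are not reproduced.

```python
import numpy as np

def ellipse_from_moments(mask):
    """Fit an ellipse with the same first- and second-order moments as a region.

    Returns (center, semi_axes, angle): center is (x, y), semi_axes is
    (major, minor) in pixels, angle is the major-axis orientation in radians.
    """
    ys, xs = np.nonzero(mask)
    pts = np.stack([xs, ys], axis=1).astype(np.float64)
    center = pts.mean(axis=0)
    # 2 x 2 matrix of central second-order moments (normalized by the area).
    cov = np.cov((pts - center).T, bias=True)
    evals, evecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    # For a uniformly filled ellipse, the variance along a semi-axis a is a^2 / 4.
    minor, major = 2.0 * np.sqrt(evals)
    angle = np.arctan2(evecs[1, 1], evecs[0, 1])
    return center, (major, minor), angle

# Example: a filled, axis-aligned ellipse with semi-axes 20 and 10.
yy, xx = np.mgrid[0:80, 0:100]
mask = ((xx - 50) / 20.0) ** 2 + ((yy - 40) / 10.0) ** 2 <= 1.0
center, (major, minor), angle = ellipse_from_moments(mask)
```

Because the moments transform predictably under an affine map, the fitted ellipse follows the region as the viewpoint changes, which is what the matching relies on.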


  • Slides for research proposal


Notation: I is the set of pixels; S^r, S^t are the sets of segments in the reference and target images; P^r, P^t are the sets of anchor points in the two images; H is the set of local homographies defined on segments; F is the set of anchor point matches.



\mathcal{H}: C\rightarrow \mathbb{R}^{3\times 3},

\mathcal{L}^x: P^x\rightarrow S^x,

\mathcal{F}: P^t\leftrightarrow P^r,


Problem formulation:

\mathcal{L}^t, \mathcal{L}^r, \mathcal{H}, \mathcal{F} = \underset{\mathcal{L}^t, \mathcal{L}^r, \mathcal{H}, \mathcal{F}}{\arg \min} E(\mathcal{L}^t, \mathcal{L}^r, \mathcal{H}, \mathcal{F})

PenPal Projector Project

Pretty tricky name indicates a pretty tricky project


PenPal is a pen that enables you to write in the air and laser-print the message on the wall from another one. For now it is only a design, with a proof-of-concept demo available.

The system uses a vector laser projector so it can be used in any situation, on any surface and in any lighting conditions.

Most importantly, it looks cool.

Hardware Setup

  • 3rd party
    • Input
      • Leap Motion
      • Mini mouse
    • Output
      • Laser
      • Oscillating laser mirror
        • The mirrors are driven by a pair of servo motors. We bought the device from Taobao.
    • Wave generator
      • Arduino
        • We tried this at the very beginning. However, it turned out that plain high-level API calls are sluggish, which distorts the projected laser shape. This could be solved by calling the Arduino's low-level API, but that would take a lot of time, so we abandoned this plan.
      • PC sound card
        • The PC seems to be almighty. With a sound API we can easily drive the sound card and use it as an arbitrary function generator.


  • Design
    • AI
      • SVG exporter
  • Output
    • Python, pyaudio
      • Stream control
        There are two ways of manipulating the stream: blocking `write` calls and a `callback`. To achieve the desired effect we have to use the `callback` mode, since the waveform changes in real time.
        To manipulate the stream, we have to write binary float data into the buffer. This Stack Overflow answer should be helpful: http://stackoverflow.com/a/22644499/1921437.
      • Core code as follows

        import time

        import numpy as np
        import pyaudio

        RATE = 44100  # samples per second

        def decode(in_data, channels):
            """Convert a byte stream into a 2D numpy array with
            shape (chunk_size, channels).

            Samples are interleaved, so for a stereo stream with left channel
            of [L0, L1, L2, ...] and right channel of [R0, R1, R2, ...], the output
            is ordered as [L0, R0, L1, R1, ...]
            """
            # TODO: handle data type as parameter, convert between pyaudio/numpy types
            result = np.frombuffer(in_data, dtype=np.float32)
            chunk_length, remainder = divmod(len(result), channels)
            assert remainder == 0
            return np.reshape(result, (chunk_length, channels))

        def encode(signal):
            """Convert a 2D numpy array into a byte stream for PyAudio.

            Signal should be a numpy array with shape (chunk_size, channels).
            """
            interleaved = signal.flatten()
            # TODO: handle data type as parameter, convert between pyaudio/numpy types
            return interleaved.astype(np.float32).tobytes()

        p = pyaudio.PyAudio()
        pt = 0  # running sample index, keeps the waveform continuous across callbacks

        def callback(in_data, frame_count, time_info, status):
            global pt
            wave = np.empty((frame_count, 2))
            for i in range(frame_count):
                wave[i, 0] = float((i + pt) % 100) / 100  # sawtooth, 100-sample period
                wave[i, 1] = float((i + pt) % 80) / 100   # sawtooth, 80-sample period
            pt = pt + frame_count
            print(pt)    # debug output
            print(wave)  # debug output
            return (encode(wave), pyaudio.paContinue)

        # The original listing is cut off from here on. The remaining arguments,
        # the loop body, and the cleanup are assumed from the usual PyAudio
        # callback pattern.
        stream = p.open(format=pyaudio.paFloat32,
                        channels=2,
                        rate=RATE,
                        output=True,
                        stream_callback=callback)
        while stream.is_active():
            time.sleep(0.1)
        stream.stop_stream()
        stream.close()
        p.terminate()
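The per-sample Python loop in the callback may be too slow for a real-time audio callback. A possible vectorized equivalent is sketched below; the helper name `sawtooth_chunk` is hypothetical, and the two periods (100 and 80 samples) are simply copied from the loop.

```python
import numpy as np

def sawtooth_chunk(pt, frame_count):
    """Vectorized equivalent of the per-sample loop in the callback.

    Returns an array of shape (frame_count, 2): column 0 is a sawtooth with a
    100-sample period, column 1 one with an 80-sample period. Both are divided
    by 100, as in the original loop.
    """
    idx = np.arange(pt, pt + frame_count)
    chunk = np.empty((frame_count, 2), dtype=np.float32)
    chunk[:, 0] = (idx % 100) / 100.0
    chunk[:, 1] = (idx % 80) / 100.0
    return chunk

# Example: the chunk starting at sample 37.
chunk = sawtooth_chunk(37, 256)
```

Inside the callback this would replace the loop with `wave = sawtooth_chunk(pt, frame_count)`.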


  • The laser mirror's temporal response is not perfect: a pulse in the input signal does not produce a sharp pulse in the projected shape. The most fundamental fix is to tune the PID coefficients of the servo mirror system, but this seems time-consuming, so we compromised by decreasing the frequency of the projector.
  • One of the most important parameters for the quality of the shape is the projector frequency. When the frequency is decreased, the system response improves. However, the trade-off is a lower refresh rate or a less complex shape. No free lunch.
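The trade-off can be put into numbers with a small sketch. It assumes that "projector frequency" means the rate at which points are sent to the mirrors (here the sound card sample rate) and that one shape is drawn with a fixed number of points; both function names are hypothetical.

```python
def refresh_rate(point_rate, points_per_shape):
    """Shapes drawn per second when each shape takes points_per_shape points."""
    return point_rate / points_per_shape

def max_points(point_rate, min_refresh_hz):
    """Largest number of points per shape that still reaches min_refresh_hz."""
    return point_rate // min_refresh_hz

# At 44100 points per second, a 100-point shape repeats 441 times per second,
# and a 25 Hz refresh rate leaves room for at most 1764 points per shape.
fast = refresh_rate(44100, 100)
budget = max_points(44100, 25)
```

Lowering the point rate shrinks the product of refresh rate and points per shape, so one of the two has to give.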



Daily Research Log 2014-4-8


Literature review for my "match & map" project


  • Efficient Large-Scale Stereo Matching, A. Geiger, Urtasun
    • The blurry image of generating right observation comes from
      • Here d_n is generated by sampling from this distribution function
    • This method probably won't work on a wide-baseline stereo dataset, for the following reasons
      • If the triangulation is dense, large displacement and distortion of support points across two views will ruin the triangulation
      • If the triangulation is sparse, it won't be able to cope with complex scenes


  • Review papers citing Wide baseline stereo matching, P Pritchett, A Zisserman, ICCV'1998
  • Run Geiger's work
  • Write project proposal
    • Desired result
      • On the Middlebury benchmark, runs slower than Geiger ACCV2010, but gives slightly better results
      • On a wide-baseline dataset, outperforms Geiger's work by a large margin

Daily Research Log 2014-4-7


Literature review for my "match and map" project


  • Efficient Large-Scale Stereo Matching, A. Geiger, Urtasun
    • Matching problem formulation
      • This is exactly our problem!
      • However, we can achieve sub-pixel precision by finding the mapping function, and the result is more precise since the piecewise planar model can fit the scene in more detail
      • And our method works on images taken from very different perspectives
    • Model

    • Solve
      • Discrete, MAP
  • Wide baseline stereo matching, P Pritchett, A Zisserman, Computer Vision, 1998
    • Algorithms for homography
      • Surface following using a single projective homography
        • Iteration of
          • RANSAC of a fixed region → homography
          • Find matches using homography → increased region
        • I would rather call it surface extending
      • Surface following using affine transformations
        • Procedure
          • Initial matches obtained from local homography
          • Divide the selection window into four
          • RANSAC the four subdivisions, get sub-homographies
          • New region centered upon the matched basis is stored
        • I would call it piecewise planar surface growing
    • Computation of fundamental matrix
      • 7-point correspondences
  • On the computing of fundamental matrix, Ramani's lecture slides
    • F can be computed from correspondences between image points alone
      • Independent of camera internal parameters (Projective transform as a prior)
      • Independent of relative pose
    • Properties of F
      • Rank 2
      • 9 parameters, 7 dof (scaling, Det(F) = 0)
    • 7-point algorithm, 8-point algorithm
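As a reminder of how the 8-point algorithm uses these properties, here is a minimal numpy sketch of the normalized variant: each match gives one linear equation in the 9 entries of F, the solution is defined up to scale, and rank 2 is enforced afterwards. The function names and the convention x2^T F x1 = 0 are assumptions, not taken from the slides.

```python
import numpy as np

def normalize_points(pts):
    """Hartley normalization: zero mean, mean distance sqrt(2) from the origin."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.mean(np.linalg.norm(pts - mean, axis=1))
    T = np.array([[scale, 0, -scale * mean[0]],
                  [0, scale, -scale * mean[1]],
                  [0, 0, 1]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T

def eight_point(x1, x2):
    """Estimate F with x2_h^T F x1_h = 0 from N >= 8 matches (N x 2 arrays)."""
    p1, T1 = normalize_points(x1)
    p2, T2 = normalize_points(x2)
    # One linear equation per match in the 9 entries of F (row-major order).
    A = np.column_stack([
        p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
        p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
        p1[:, 0], p1[:, 1], np.ones(len(p1)),
    ])
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce rank 2 (det(F) = 0) by zeroing the smallest singular value.
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0
    F = U @ np.diag(S) @ Vt
    # Undo the normalization and fix the arbitrary scale.
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)

# Synthetic check: 20 random 3D points seen by two cameras.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (20, 3)) + np.array([0, 0, 5])
K = np.array([[800, 0, 320], [0, 800, 240], [0, 0, 1.0]])
a = 0.1
R = np.array([[np.cos(a), 0, np.sin(a)], [0, 1, 0], [-np.sin(a), 0, np.cos(a)]])
t = np.array([0.5, 0.1, 0.0])
h1 = (K @ X.T).T
h2 = (K @ (R @ X.T + t[:, None])).T
x1 = h1[:, :2] / h1[:, 2:]
x2 = h2[:, :2] / h2[:, 2:]
F = eight_point(x1, x2)
```

The 7-point algorithm uses the same linear system with one equation fewer and replaces the SVD truncation by solving det(F) = 0 on the two-dimensional null space.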


  • Review probabilistic graphical model
  • Read review for stereo vision by Scharstein, IJCV2002, A taxonomy and evaluation of dense two-frame stereo correspondence algorithms
  • Find papers referring to A. Geiger's paper.
  • Try A. Geiger's method on our data

Daily Research Log 2014-4-6


Literature review for my "match and map" project


  • Efficient Large-Scale Stereo Matching, A. Geiger, Urtasun

    • Previous work
      • Global matching vs. local matching
      • Klaus et al. extend global methods to use appearance-based segmentation, belief propagation and superpixels
      • Hirschmuller et al.'s semi-global matching which propagates information of scan-line methods along 16 orientations
    • Method
      • Most stereo matches are ambiguous, while some are of high confidence
        • Remaining ambiguous points are "propagated" by these high confidence "support points"
      • Delaunay triangulation, piecewise linear
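The "Delaunay triangulation, piecewise linear" step can be sketched with SciPy: `LinearNDInterpolator` triangulates the support points and interpolates linearly inside each triangle. This only illustrates the interpolation idea and is not Geiger's implementation; the function name `disparity_prior` and the synthetic planar scene are assumptions.

```python
import numpy as np
from scipy.interpolate import LinearNDInterpolator

def disparity_prior(support_xy, support_disp, height, width):
    """Piecewise-linear disparity map interpolated from sparse support points.

    The support points are Delaunay-triangulated and the disparity is
    interpolated linearly inside each triangle. Pixels outside the convex
    hull of the support points come out as NaN.
    """
    interp = LinearNDInterpolator(support_xy, support_disp)
    ys, xs = np.mgrid[0:height, 0:width]
    return interp(xs, ys)

# Example: support points sampled from a planar disparity field.
rng = np.random.default_rng(0)
corners = np.array([[0, 0], [29, 0], [0, 19], [29, 19]], dtype=float)
inner = rng.uniform([0, 0], [29, 19], size=(30, 2))
xy = np.vstack([corners, inner])
d = 0.1 * xy[:, 0] + 0.05 * xy[:, 1] + 3.0
prior = disparity_prior(xy, d, 20, 30)
```

On a planar scene the interpolation is exact. On a real scene it is only a coarse guess, which is why the support points need to be reliable.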


  • Paper on segmentation + stereo matching
    • Klaus, A., Sormann, M., Karner, K.: Segment-based stereo matching using belief propagation and a self-adapting dissimilarity measure. In: ICPR. (2006)


Daily Research Log 2014-4-5


  • Literature review for my "match & map" project
  • Try to understand Hongdong Li's paper on non-rigid scene reconstruction


  • A Simple Prior-free Method for Non-Rigid Structure-from-Motion Factorization, Hongdong Li CVPR2012
    • The block matrix method


  • What are "shape bases"?

    In practice, many non-rigid objects, e.g. the human face under various expressions, deform with certain structures. Their shapes can be regarded as a weighted combination of certain shape bases. Shape and motion recovery under such situations has attracted much interest. Xiao et al.
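In code, the "weighted combination of shape bases" reads as below. The sizes and random values are purely illustrative, and all variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
K, P, F = 2, 10, 5                    # shape bases, points, frames (illustrative)
bases = rng.normal(size=(K, 3, P))    # B_k: each basis is a 3 x P point set
coeffs = rng.normal(size=(F, K))      # c_tk: per-frame weights

# Shape at frame t: S_t = sum_k c_tk * B_k
shapes = np.einsum("tk,kdp->tdp", coeffs, bases)   # shape (F, 3, P)

# Stacking all frames gives a 3F x P matrix whose rank is at most 3K.
stacked = shapes.reshape(F * 3, P)
rank = np.linalg.matrix_rank(stacked)
```

The rank bound is the property that factorization-based non-rigid SfM methods exploit.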


  • Review Homography, Semi-global Stereo Matching related papers


  • Too good a weather to read papers.