Distributed Processing in Smart Cameras

Taipei, Taiwan
Monday April 20, 2009

Hamid Aghajan, Stanford University (USA)
Andrea Cavallaro, Queen Mary, University of London (UK)


Distributed vision networks form a multi-disciplinary research field that offers rich conceptual and algorithmic opportunities for computer vision, signal processing, pervasive computing, and wireless sensor networks. Its emphasis on distributed, collaborative fusion of visual information also invites design paradigm shifts in these fields, enabling researchers in each area to contribute to novel smart environment applications that are interpretive, context-aware, and user-centric in nature.

Technological advances in the design of image sensors and embedded processors have facilitated the development of efficient embedded vision techniques. Designing scalable, network-based applications around high-bandwidth data such as images requires a paradigm shift in processing methodology: instead of streaming raw images to a powerful central processing unit, each network node uses local processing to translate its observations into features and attributes, which are shared with other nodes to reach a collaborative deduction about the event of interest. Building on this premise of distributed vision-based sensing and processing, many novel smart environment applications, such as patient and elderly care, ambient intelligence, multimedia, and gaming, become possible.
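The paradigm described above, local feature extraction at each node followed by network-level fusion, can be illustrated with a minimal sketch. All class and function names here are hypothetical, chosen for illustration only; real camera nodes would run actual detectors and exchange these descriptors over the network.

```python
# Minimal sketch of the in-network processing paradigm: nodes share
# compact attributes rather than raw images. All names are illustrative.
from dataclasses import dataclass

@dataclass
class Observation:
    """Lightweight descriptor a node shares instead of raw pixels."""
    node_id: int
    target_present: bool
    confidence: float   # detector confidence in [0, 1]
    x: float            # estimated target position in a shared ground plane
    y: float

def local_processing(node_id, frame):
    """Stand-in for on-node vision: detect a target and summarize it.
    Here `frame` is a dict standing in for a real detection result."""
    return Observation(node_id, frame["detected"],
                       frame["score"], frame["x"], frame["y"])

def collaborative_deduction(observations):
    """Fuse per-node attributes into one network-level decision:
    a confidence-weighted average of positions from nodes that
    reported the target."""
    hits = [o for o in observations if o.target_present]
    if not hits:
        return None
    w = sum(o.confidence for o in hits)
    return (sum(o.confidence * o.x for o in hits) / w,
            sum(o.confidence * o.y for o in hits) / w)

# Three nodes observe the same scene; only features cross the network.
frames = [
    {"detected": True,  "score": 0.9, "x": 2.0, "y": 3.0},
    {"detected": True,  "score": 0.6, "x": 2.4, "y": 2.8},
    {"detected": False, "score": 0.1, "x": 0.0, "y": 0.0},
]
obs = [local_processing(i, f) for i, f in enumerate(frames)]
fused = collaborative_deduction(obs)
```

The bandwidth saving comes from exchanging a handful of numbers per frame instead of the frame itself; the fusion rule shown is just one simple choice.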

Designing around a networked set of visual information sources offers vision researchers novel opportunities: it introduces spatial collaboration in data fusion among the network nodes, and it raises the question of how, and at what level, data should be exchanged between them. In such a framework, data fusion can occur across three dimensions: space (multiple views), time, and feature level. Joint estimation and decision-making techniques must be developed that account for the processing capabilities of the nodes as well as network bandwidth limits and application latency requirements, and the cost and efficiency trade-offs of spatiotemporal fusion algorithms must be weighed against the requirements of the application. In addition, deductions produced by collaborative processing in the network can be fed back to each camera node to enable active vision techniques, instructing each camera which features are worth extracting in its view.
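As a concrete instance of the joint estimation mentioned above, a standard building block is inverse-variance fusion, combining per-camera estimates of the same quantity weighted by their reliability. The sketch below assumes each view supplies a scalar estimate with a known variance; the function name and data layout are illustrative, not taken from any particular system.

```python
# Hedged sketch: inverse-variance (minimum-variance) fusion of per-camera
# estimates of one scalar quantity, e.g. a target's ground-plane coordinate.

def fuse_estimates(estimates):
    """estimates: list of (value, variance) pairs, one per camera view.
    Returns the minimum-variance linear combination as (value, variance)."""
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    fused_value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    fused_variance = 1.0 / total
    return fused_value, fused_variance

# Three views with different uncertainties. The fused variance is lower
# than any single view's, which is the payoff of spatial collaboration.
views = [(2.0, 0.5), (2.4, 1.0), (1.8, 0.25)]
value, variance = fuse_estimates(views)
```

The same rule extends across time (fusing a node's estimates over successive frames) and across feature levels, which is why it appears, in various guises, throughout spatiotemporal data fusion.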

The target audience for the course consists of researchers active in signal processing applications for camera networks, such as human presence and gesture analysis, as well as graduate students involved in vision algorithm design research. The course offers a perspective on the methodologies enabled by the flexibilities and trade-offs of distributed vision sensing and processing, and thereby aims to encourage vision researchers to develop novel algorithms that exploit the potential of distributed camera networks.


  • Part I:
    • Introduction
    • Information filtering and metadata
    • Distributed processing
    • Calibration and topology
  • Part II:
    • Smart Cameras
    • Case study: human pose analysis
  • Part III:
    • Active and heterogeneous networks
    • Tracking
    • Fusion
    • Large scale trajectory reconstruction for behaviour analysis
  • Part IV:
    • Challenges of vision
    • Interfacing vision
    • User-centric design
    • Killer apps?


For further inquiries about the special session program, please contact:
aghajan AT stanford.edu
andrea.cavallaro AT elec.qmul.ac.uk