Distributed Vision Networks

ACIVS 2007 Short Course
August 27, 2007
Delft, Netherlands


Hamid Aghajan (Stanford University, USA)
Richard Kleihorst (NXP Research, Netherlands)
Chen Wu (Stanford University, USA)

Link to course schedule at ACIVS Page

Course description

Technological advances in the design of image sensors and embedded processors have facilitated the development of efficient embedded vision-based techniques. The design of scalable, network-based applications built on high-bandwidth data such as images requires a change of paradigm in the processing methodologies. Instead of streaming raw images to a powerful central processing unit, in the new paradigm each network node uses local processing to translate the observed data into features and attributes, which are shared with other network nodes to make a collaborative deduction about the event of interest. Many novel application areas in smart environments such as patient and elderly care, smart buildings, multimedia, and gaming can be enabled by such a distributed processing approach to algorithm design for vision networks.
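The contrast between the two paradigms can be illustrated with a small sketch: rather than transmitting a raw frame, a node reduces it locally to a compact descriptor and shares only that. The function name, threshold, and frame contents below are illustrative assumptions, not part of the course material.

```python
# Hypothetical in-node processing step: reduce a grayscale frame to a
# small descriptor (area and centroid of bright "foreground" pixels),
# so only a few numbers need to be shared with other nodes instead of
# the full pixel array.

def extract_features(frame, threshold=128):
    """Reduce a 2-D grayscale frame (list of rows) to a small descriptor."""
    coords = [(r, c)
              for r, row in enumerate(frame)
              for c, v in enumerate(row)
              if v >= threshold]
    if not coords:
        return {"area": 0, "centroid": None}
    area = len(coords)
    cy = sum(r for r, _ in coords) / area
    cx = sum(c for _, c in coords) / area
    return {"area": area, "centroid": (cy, cx)}

# A toy 8x8 frame with a 3x3 bright blob:
frame = [[0] * 8 for _ in range(8)]
for r in range(2, 5):
    for c in range(3, 6):
        frame[r][c] = 200

descriptor = extract_features(frame)
print(descriptor, "shared instead of", 8 * 8, "raw pixel values")
```

The bandwidth saving is the point: the descriptor is a handful of numbers regardless of image resolution, which is what makes network-wide collaboration on shared features feasible.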

To fully utilize the efficiencies offered by the new paradigm, it is necessary to consider its ramifications for the design of the hardware, the in-node processing, and the collaboration mechanisms. This raises questions about effective modes of collaboration and the levels at which data can be exchanged between the nodes. In addition, effective local processing methods based on opportunistic use of information acquired from the scene or received from other nodes need to be investigated. Joint estimation and decision-making techniques need to be developed that take into account the processing capabilities of the nodes as well as network bandwidth limits and application latency requirements. Spatiotemporal data fusion algorithms that combine information obtained by the network across space, time, and feature levels need to be examined against the requirements of the application, along with the impact of the various cost and efficiency tradeoffs.

The tutorial aims to provide insight into the potential of distributed vision networks by examining novel smart camera architectures as well as distributed feature extraction and collaborative data fusion algorithms. The algorithms will be presented in the context of novel applications in smart environments such as human posture and gesture analysis for assisted living, multimedia and gaming, and speaker tracking. The impact of algorithm design decisions on the design of efficient vision processing architectures based on parallel treatment of pixel data will also be examined.


The following is a tentative syllabus of the tutorial:

I. Introduction

II. Distributed Vision Algorithms

III. Human Pose Estimation

  • a. Segmentation
  • b. Top-down and bottom-up pose estimation approaches

IV. Smart Camera Architectures

  • a. Background on embedded vision processing
  • b. Vision processing hardware for research and application development

V. Conclusions

  • a. New design paradigms
  • b. Research outlook and application opportunities