APSSM 09/10 – Multimedia Signal Processing

Instructor

Pedro M. Q. Aguiar. Office: 7.24, north tower, 7th floor. Contact: aguiar at isr dot ist dot utl dot pt

 

Description

The course deals with the processing of multimedia signals, i.e., sounds, images, and video. The focus is on efficient representations, e.g., multiresolution techniques, and on unifying approaches, e.g., the vector space framework and operators. Applications range from music transcription to 3D photography. Students will develop a computer and/or research project. Grading: 50% project, 50% exam. Lectures: Mondays and Thursdays, 14:00-15:30, EA3. Course topics:

o   Patterns, representations and approximations. Sinusoids in Nature. Spectral harmonics. Images and textures. Shapes and contours. Representations and approximations. Musical notation. Musical harmony.

o   Vector spaces. Definition and properties. Inner products and norms. p-norms. Hilbert spaces. Linear operators. Subspace projections and optimal approximations. Bases and frames, ONBs and TFs. Matrix view of all this.

o   Multimedia signals as vectors. Vector spaces of finite sequences and infinite sequences. Two-dimensional (images) and multidimensional (video) signals. Systems as operators and their properties. Matrix representation of linear shift-invariant systems.

o   Signal processing and operators. Analysis of signals and systems. Transform operators (DTFT, DFT, z-transform) and their properties. Matrix interpretation and eigensignals.

o   Sampling, interpolation, and multirate processing. Fourier transform of continuous-time signals (CTFT, CTFS). Shannon sampling. Vector space interpretation. Projection onto the subspace of band-limited signals. Discrete-time processing of continuous signals. Multirate signals and systems. Upsampling, downsampling, polyphase representation.

o   Time (and space), frequency, scale and resolution. Time (and space) and frequency localization. Heisenberg boxes and uncertainty principle. Scale. Resolution and degrees of freedom. Music and time-frequency analysis.

o   Multiresolution image processing. Image alignment and motion estimation. Optimization algorithms in image processing. Gauss-Newton methods. Image pyramids. Superresolution.

o   Image-based rendering. Plenoptic modeling and 3D photography. Representation of light fields. Plenoptic sampling. Spectral analysis of light fields. Minimum sampling in joint image-geometry space. Examples.

o   Bases and filter banks. A single channel and its projection property. Complementary channels and completeness. ONBs. Theory of orthogonal and biorthogonal two-channel filter banks. Polyphase view in terms of multiple-input / multiple-output shift invariant systems. Filter design.

o   Wavelets. Iterated filter banks and equivalent filters. Orthogonality of iterated filters. DWT. Examples. Properties. Wavelets as an ONB. Biorthogonal DWT. Wavelet packets. Complexity.

o   Overcomplete representations. Redundancy. Frame definitions and properties. Energy bounds. Frame operators. Dual-frame operators. Seeding from a basis. Harmonic tight frames. Infinite-dimensional frames and filter banks. Oversampled DWT.

o   The quest for sparsity. Compressive sensing. Conventional sampling followed by compression. Sparse signals / sparse representations. Measurement matrix. Signal reconstruction algorithms. Geometrical interpretation. Examples.

 

Materials

o   [Vetterli et al] “The World of Fourier and Wavelets”, M. Vetterli, J. Kovačević, and V. Goyal, 2008.

o   [Mumford] “Music, chords and harmony – lecture notes”, D. Mumford, Brown University, 2006.

o   [Aguiar1] “Spectra of sounds and images – lecture notes”, P. Aguiar, 2008.

o   [Aguiar2] “Multiresolution image alignment – lecture notes”, P. Aguiar, 2008.

o   [Chan et al] “Image-based rendering and synthesis”, S. Chan, H-Y. Shum, and K-To Ng, IEEE Signal Proc. Magazine, 24:6, 2007.

o   [Chai et al] “Plenoptic sampling”, J-X. Chai, X. Tong, S-C. Chan, and H-Y. Shum, ACM SIGGRAPH, 2000.

o   [Baraniuk] “Compressive sensing”, R. Baraniuk, IEEE Signal Proc. Magazine, 24:4, 2007.

o   [Aguiar3] “MATLAB – basic manipulation of sounds and images”, P. Aguiar, 2008.

o   Course slides linked below.

 

Project

Each team of two students must select one of the suggested projects. At the end of the semester, each team is expected to write a project report in the form of a 4-page conference-style paper (LaTeX template) and to discuss the project, the report, and course topics with the instructor. Lab: Thursdays, 18:30-20:00, room 5.13 – LSDC1, north tower, 5th floor. Suggested projects:

o   Aerial acoustic communications

o   Shape-based recognition

o   Panoramic imaging

o   3D photography

 

Schedule (tentative) / summary

o   25/02/10 – Course presentation.

o   01/03/10 – Patterns, representations and approximations [Aguiar1].

o   04/03/10 – Patterns, representations and approximations [Mumford, Vetterli et al, ch. 0] [slides].

o   08/03/10 – Vector spaces [Vetterli et al, ch. 1].

o   11/03/10 – Vector spaces [Vetterli et al, ch. 1] [slides].

o   15/03/10 – JEEC

o   18/03/10 – Multimedia signals as vectors [Vetterli et al, ch. 2] [slides].

o   22/03/10 – Signal processing and operators [Vetterli et al, ch. 2] [slides].

o   25/03/10 – Problems.

o   29/03/10 – Sampling, interpolation, and multirate processing [Vetterli et al, ch. 2, 3, 4] [slides].

o   08/04/10 – Problems.

o   12/04/10 – Time (and space), frequency, scale, and resolution [Vetterli et al, ch. 5] [slides].

o   15/04/10 – Multiresolution image processing [Aguiar2] [slides].

o   19/04/10 – Image-based rendering [Chan et al, Chai et al] [slides].

o   22/04/10 – Bases and filter banks [Vetterli et al, ch. 6].

o   26/04/10 – Bases and filter banks [Vetterli et al, ch. 6].

o   29/04/10 – Bases and filter banks [Vetterli et al, ch. 6] [slides].

o   03/05/10 – Problems.

o   10/05/10 – Wavelets [Vetterli et al, ch. 8].

o   17/05/10 – Wavelets [Vetterli et al, ch. 8] [slides].

o   20/05/10 – Overcomplete representations [Vetterli et al, ch. 9].

o   24/05/10 – Overcomplete representations [Vetterli et al, ch. 9] [slides].

o   27/05/10 – The quest for sparsity [Baraniuk] [slides] [slides].

o   31/05/10 – Problems [example exam #1] [example exam #2].

o   01/06/10 – Deadline for project reports.

o   08/06/10 – Project discussions.

o   26/06/10 – Exam #1.

o   15/07/10 – Exam #2 [grades].

o   09/09/10 – Exam #3 [grades].

 

Last modified: September 9th, 2010.