First MPEG-7 Awareness Event 

PARIS, France

 

Palais des Arts et des Congrès d'Issy

25 Avenue Victor CRESSON

92130 ISSY les MOULINEAUX

http://www.paci.issy.com/

AE (PACI) Site Coordinator: Mr. Patrick Foucteau

 

13:00 - 18:00

Saturday, October 28th, 2000

 

Presentations: MOLIERE ROOM

Exhibition: FOYER DEBUSSY

           

Chair: Neil Day

Digital Garage, Inc., Japan

neil@garage.co.jp

 

Co-Chair: Eric Rehm

Singingfish.com, USA

rehm@singingfish.com

 

 Logistics Information

http://www.cselt.it/mpeg/events/mpeg-7ae/

 

The XML Journal Issues sponsored by SYS-CON Media, Inc

http://www.sys-con.com/xml/

 


 


Speaker Presentation Schedule

 

1:00-1:05                              Welcome

Multimedia

1:05-1:15                              00M7M001: Presentation: MPEG-7 Conceptual Model

1:15-1:25                              00M7M002: MPEG-7 Visual Annotation Tool

1:25-1:45                              00M7V008: Hierarchical Summary Browser

           00M7V009: Table of Contents (ToC) Browser

           00M7V010: SmartEye

1:45-2:00                              00M7M004: Internet Streaming Media Metadata Interchange using MPEG-7

2:00-2:15                              00M7M005: The MPEG-7 Experimental Model (XM)

2:15-2:20                              Additional Q/A

Audio

2:20-2:35                              00M7A001: Spoken Content

2:35-2:50                              00M7A002: CUIDADO

2:50-3:05                              00M7A003: Music Retrieval by Melodic Query

3:05-3:10                              Additional Q/A

 

3:10-3:25                              Break

Visual

3:25-3:40                              00M7V001: Search Engine Tool

3:40-3:55                              00M7V002: Video Editing DS

3:55-4:10                              00M7V003: MPEG-7 Video Browser and Highlight Generation Tool

4:10-4:25                              00M7V004: Video-over-IP (VIP)

4:25-4:40                              00M7V005: Edge histogram descriptor

4:40-4:55                              00M7V006: Video Annotation and Summaries

4:55-5:10                              00M7V007: MPEG-7 Video Object Segmentation and Retrieval

5:10-5:25                              00M7M003: Wireless Images Retrieval using Speech Dialogue Agent

5:25-5:40                              00M7V011: MPEG-7 Camera

5:40-5:45                              Additional Q/A

Panel Session

5:45-5:50                              00MP7IFG001:     MPEG-7 in the 21st Century - Short Introduction

5:50-6:15                              Discussion

6:15-                                      Thanks and Close


MULTIMEDIA

 

Session Chair:                 Mr. John Smith, IBM, USA.

 

00M7M001: Presentation: MPEG-7 Conceptual Model

The MPEG-7 Conceptual Model provides a model of the audio-visual domain at the conceptual level, which is independent of the design and implementation of the MPEG-7 Description Schemes and Descriptors. The MPEG-7 Conceptual Model defines each principal concept in words and employs modeling constructs of entities, relationships, and attributes in modeling the audio-visual content description concepts.  In this talk, we describe the role of the MPEG-7 Conceptual Model in creating the MPEG-7 Standard and examine its potential use in tools for creating MPEG-7 Descriptions.

 

00M7M002: Demo:  MPEG-7 Visual Annotation Tool

The MPEG-7 Visual Annotation Tool enables users to interactively create MPEG-7 descriptions using MPEG-7 Description Schemes and Descriptors. The tool takes as input an MPEG-7 Schema definition file and an MPEG-7 package description file. The MPEG-7 Schema defines the structure of the MPEG-7 description components using the MPEG-7 Description Definition Language (DDL). The Package description organizes the MPEG-7 description components in order to improve the ease of navigation in the MPEG-7 Visual Annotation Tool. The tool provides utilities for drag-and-drop copying and re-using of description elements and allows the descriptions to be output as XML files. The initial implementation centers on manual entry of description data; in future work, however, we plan to explore the integration of automatic and semi-automatic feature extraction methods, with the goal of providing a complete system for MPEG-7 multimedia content annotation and query building.

 

Contact: John Smith,

Email: jrsmith@watson.ibm.com

Manager, Pervasive Media Management

IBM T. J. Watson Research Center, 30 Saw Mill River Road, Hawthorne, NY 10532

(914) 784-7320;

 

00M7M003: Wireless Images Retrieval using Speech Dialogue Agent

The agent in the client terminal recognizes the user's utterance in English or Japanese, within a rather restricted set of sentences, and sends a query profile to the server over a wireless transceiver channel (32 kbps). The server retrieves the requested images and delivers the compressed video bitstream (H.263) to the client. The client agent then replies with a synthesized voice and displays the images. At present the original metadata format is used, but the MPEG-7 format will be used in the near future for all clients, servers, and channels.

Contact: Mikio Sasaki:

Email: msasaki@rlab.denso.co.jp

Research Laboratories, DENSO CORPORATION

500-1 Minamiyama, Komenoki-cho, Nisshin-shi, Aichi-ken, 470-0111 Japan

00M7M004: Internet Streaming Media Metadata Interchange using MPEG-7

Singingfish.com uses MPEG-7 description schemes to model the Internet streaming media metadata. This presentation describes our use of MPEG-7 description schemes to define a schema for the XML interchange of Internet streaming media metadata with several of our commercial content partners.

The goal of such metadata interchange is to populate our search index with the highest quality and most semantically rich metadata possible, ultimately yielding superior relevance to the end user. 

The presentation includes a short demonstration of the fidelity of a transformation from MSNBC's "Partner XML Format" to an MPEG-7 XML description.

 

Contact: Eric Rehm:

Email: rehm@singingfish.com

Singingfish.com / Thomson Multimedia, Seattle, WA, USA.

 

 

00M7M005: The MPEG-7 Experimental Model (XM)

This presentation covers:

1. The Basic structure of the MPEG-7 XM Software

2. A Graphical User Interface for the MPEG-7 XM Software

3. Key Applications for the MPEG-7 XM

                a) Search and retrieval

                b) Transcoding

4. Combining visual low level descriptors in a search application

 

Contact: Mr. Stephan Herrmann

Email: stephanh@lis.e-technik.tu-muenchen.de

Affiliation: Munich University of Technology,

                               Institute for Integrated Circuits

Munich, Germany

 


 

AUDIO

Session Chair:                 Mr. Vincent Puig, IRCAM, France

 

00M7A001: Spoken Content

"At the awareness event we will present the Spoken Content description scheme, along with a basic Web application to illustrate the concept and its applications."

Canon Research Centre Europe (CRE), along with our collaborators at IBM (Almaden), has proposed the MPEG-7 Spoken Content description scheme. Searching and indexing audio-visual data using the speech in the soundtrack is perhaps one of the most natural forms of metadata retrieval, and our metadata format is specifically designed to store the (sometimes erroneous) output of a speech recognition system in the manner best suited to robust retrieval. We are researching a wide range of potential applications of such data using textual and/or verbal querying.

 

Contact: Dr. Wilson Chiu, Phil Garner

Email: wilsonc@cre.canon.co.uk

Email: philg@cre.canon.co.uk

Canon Research Centre Europe Ltd

 Tel: +44 1483 448844   Fax: +44 1483 448845

00M7A002: CUIDADO

Content-based retrieval of Music and Audio samples

Information overload, inability to quickly browse through audio, poor added value to music via Internet distribution, keyword dictatorship, inability to search for similarities among sounds: these are the music consumer complaints addressed by IRCAM’s CUIDADO project. It aims to develop content-based technologies that use and contribute to the MPEG-7 standard. Building reusable modules for audio feature extraction, music indexing, database management, networking and constraint-based navigation, CUIDADO targets two pilot applications:

1) The Music Browser features musical paths and automatic compilations according to user’s tastes, search for music similarities, learning systems based on user’s profiles. One version is tied to Web music monitoring and another to Web music sales and customized radios.

2) The Sound Palette involves musicians and studios for developing an authoring tool both online and in an existing professional audio environment taking full advantage of the extracted audio features for innovative retrieval, editing and processing.

CUIDADO is expected to bring Studio Online to a mature stage based on the MPEG-7 standard.

A high impact is expected on music providers and labels involved in Web distribution. Assuming that the value of music in itself is currently decreasing, this application should provide evidence that new services and interfaces for accessing music and sounds may bring more value than the music itself in the future context of Electronic Music Distribution (EMD). This project should also raise the awareness of copyright societies and music labels of their role in using new content-based tools for music promotion and music protection.

Contact: Vincent Puig (Managing Director),

Email: Vincent.Puig@ircam.fr

IRCAM, 1 place Igor Stravinsky, 75004 Paris.

00M7A003: Music Retrieval by Melodic Query

Identifying a musical work from a melodic fragment is a task that most people are able to accomplish with relative ease. For some time now researchers have worked to give computers this ability as well, which has come to be known as the "query-by-humming" problem. To accomplish this, it is reasonable to study how humans are able to perform this task, and to assess what features we use to determine melodic similarity. Research has shown that melodic contour is an important feature in determining melodic similarity, but it is also clear that rhythmic information is important as well. The system to be demonstrated uses our proposed MPEG-7 description scheme for melody, which incorporates melodic contour and rhythmic information as the primary representation for music search and retrieval.

Additional front-end processing (to process queries), a medium-sized database of music, and a search engine (for finding appropriate matches) have also been implemented to complete the full query-by-humming system.
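As a rough illustration of matching on melodic contour plus rhythm, the following Python sketch reduces each note transition to a pair of symbols and ranks database entries by edit distance. The three-level contour (up, down, repeat), the three-level rhythm code (longer, shorter, equal), the whole-melody edit distance, and all function names are assumptions made for this example; they are not the encoding of the proposed MPEG-7 melody description scheme, and a real query-by-humming system would also need pitch tracking and duration quantization in the front end.

```python
def describe(notes):
    """Reduce (midi_pitch, duration) notes to (contour, rhythm) symbols.

    Contour: 'U' up, 'D' down, 'R' repeat.
    Rhythm:  'L' longer, 'S' shorter, 'E' equal (durations assumed quantized).
    """
    symbols = []
    for (p0, d0), (p1, d1) in zip(notes, notes[1:]):
        contour = "U" if p1 > p0 else "D" if p1 < p0 else "R"
        rhythm = "L" if d1 > d0 else "S" if d1 < d0 else "E"
        symbols.append((contour, rhythm))
    return symbols


def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]


def best_match(query_notes, database):
    """Return the title whose description is closest to the query."""
    q = describe(query_notes)
    return min(database, key=lambda t: edit_distance(q, describe(database[t])))


# Hypothetical two-entry database of (MIDI pitch, duration in beats).
DATABASE = {
    "twinkle": [(60, 1), (60, 1), (67, 1), (67, 1), (69, 1), (69, 1), (67, 2)],
    "ode": [(64, 1), (64, 1), (65, 1), (67, 1), (67, 1), (65, 1), (64, 1), (62, 1)],
}

# A query hummed in another key and tempo keeps the same contour and rhythm.
QUERY = [(55, 0.5), (55, 0.5), (62, 0.5), (62, 0.5), (64, 0.5), (64, 0.5), (62, 1)]
print(best_match(QUERY, DATABASE))
```

Because only the direction of pitch and duration changes is kept, the description is unaffected by transposition and by uniform tempo changes, which is one reason contour-style representations suit hummed queries.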

 

Contact: Youngmoo Kim,

Email: moo@media.mit.edu

Machine Listening Group, MIT Media Lab., Boston, USA

http://sound.media.mit.edu/~moo

 


 

 VISUAL

 

Session Chair:                 Munchurl Kim, Ph. D., Electronics and Telecommunications Research Institute, Korea.

 

00M7V001: Search Engine Tool

Visualization of MPEG-7 Similarity Retrieval of 2D and 3D Data

At the upcoming awareness meeting, we will present an application that allows visualization of similarity-based retrieval results. This so-called Search Engine was used in the Core Experiments on visual descriptors. A graphical user interface provides a number of functionalities, e.g.

-          Browsing of image databases

-          Visualization of 3D data and image sequences

-          Similarity Search for a number of visual descriptors

The SearchEngine is a Java-based application that incorporates the underlying functionality of C- or C++-based extraction and similarity matching algorithms. For sequence playback, an MPEG player is included. A small 3D viewer based on Java3D technology was also added. To obtain comparable results within the MPEG-7 Core Experiments for visual descriptors, the participants used a console application called MPEG-7 XM. This XM software is also integrated into the SearchEngine. The GUI analyzes certain basic image features for similarity-based retrieval:

-          Texture

-          Color

-          Contour/Shape

-          3D Geometry by analyzing a number of 2D projections from 3D object

-          Different motion in Sequences (e.g. background motion from left to right)

Contact: Karsten Müller,

Email: kmueller@hhi.de

Heinrich Hertz Institute

Einsteinufer 37, 10587 Berlin, Germany

Tel: +49 30 31002 225, Fax: +49 30 392 7200

  

00M7V002: Video Editing DS

To highlight the basic elements of the Video Editing DS, two applications have been developed to edit and browse the description of the video temporal structure specified in the MPEG-7 format. This temporal structure describes various types of temporal units: shots, rushes and composition segments. The way these units are edited is also described in terms of transition or composition effects.

The browser offers some navigation functionalities to quickly access specific parts of a video document regarding the way it has been built.

The editor allows the completion of a partial description of a video structure that could have been provided by a video-to-shot segmentation algorithm.

 

Contact: Rosa Ruiloba, Philippe Joly

Email: rosa.ruiloba@lip6.fr, Philippe.Joly@lip6.fr

Indexation Multimedia

Laboratoire d'informatique de Paris 6 - LIP-6/UPMC

Bureau C1219 tel : (33).(0)1.44.27.88.48

8, rue du Capitaine Scott 75015 Paris

 

00M7V003: MPEG-7 Video Browser and Highlight Generation Tool

Background

As in the case of abstracts describing papers in the classical sense, a video summary is an ‘audiovisual’ abstract of a video program, which allows for quick understanding of the underlying story of the program. We can capture the whole story by glancing over the summary. The structure of the summary description is hierarchical so that coarse-to-fine navigation is possible in order to access more detailed information (contents). Furthermore the MPEG-7 summary structure allows for an event-based summary with which customized browsing and filtering is possible on the summary.

 

1. Video Summary Generator

A video summary generator creates video summaries of highlights automatically and/or semi-automatically, using low level audiovisual features and high level semantics, assisted by content analysis and highlight detection rules, respectively. It outputs description data that contain a set of highlights, composed of video summaries, that are derived from the MPEG-7 Summarization DS (Description Scheme). The generated short video highlight summaries can be used with an electronic program guide (EPG) or with a video-browsing tool in personal storage devices. The Video Summary Generator also generates a CC (closed-caption) text DB, which consists of keywords extracted from CC text, using text analysis and time codes to indicate ‘keyword-synchronized’ video locations obtained by speech recognition in the audio track, in order to support text-based retrieval of news video clips.

 

2. MPEG-7 Video Browser

The generated summary description data is fed into an MPEG-7 video browser. The MPEG-7 video browser allows for a quick overview using audiovisual highlights of different durations, efficient browsing through non-linear navigation (based on multi-level hierarchical highlights and associated key-frames), and a ‘highlights-view’ and browser based on particular events. It also provides CC-text-based retrieval of news video clips. The video browser can be used as a video-browsing/-retrieval tool in personal storage devices in digital broadcasting and internet environments.

 

Contact: Munchurl Kim

Email: mckim@etri.re.kr

Participants: Munchurl Kim, Hyun Sung Chang

Affiliation: Electronics and Telecommunications Research Institute

Country: Korea

 

 

00M7V004: Video-over-IP (VIP)

Full streaming over the internet of both content and MPEG-7 metadata.

The Video-over-IP project (VIP) is an integration project carried out in the Netherlands. Various partners are involved, such as the Telematica Instituut, NOB, SurfNet, IBM, and TNO. In general, the purpose of the VIP project is to allow for the production, storage, management, retrieval, and exploration of video content for a specific set of users. Moreover, these services should be interoperable on the Internet. The following general activities should be possible:

 

·       The production of digitised video material (media objects), ready for distribution over the Internet

·       The production of content (video material plus metadata), including the management of this production process

·       Digitising video and other material in various formats

·       Extending the video material with additional descriptions (metadata) for disclosure, either (semi-) automatically, or manually. In order to search in the content, parts of the video should be properly described.

·       Indexing and retrieval of content

·       End users should be able to search in the content

·       Search, retrieval, and browsing facilities, including a user interface

·       Security against improper use (encryption and watermarking)

·       Distribution of high-quality video to the end user over the IP network

·       The realisation of a network architecture needed for offering these services with a high quality of service

·       Charging the end users on the basis of the delivered content and services (content-based billing & accounting).

 

Contact: Erik Oltmans:

Email: oltmans@telin.nl

Telematica Instituut (www.telin.nl), The Netherlands

 

 

00M7V005: Edge histogram descriptor

Short Description: The edge histogram descriptor represents the local edge distribution on 4*4 sub-images. Five types of edges, namely four directional edges and one non-directional edge, are defined for each sub-image. So, there are a total of 16*5=80 histogram bins.

Function (in one sentence): Image to image matching, especially for natural images with non-uniform edge distribution.

Benefit for Applications: Since the descriptor is based on the edge information in the image, it is good for natural image matching. Since edges play an important role for image perception, it can retrieve images with similar semantic meaning.
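To make the 16*5=80 bin layout concrete, here is a minimal Python sketch that splits a grayscale image into 4*4 sub-images, classifies 2x2 pixel blocks into one of the five edge types, and counts them per sub-image. The filter coefficients are those commonly cited for this descriptor, but they, the threshold value, and the simplification of classifying raw 2x2 pixel blocks (rather than larger averaged image blocks) should be read as assumptions made for illustration, not as the normative extraction procedure. The function names are hypothetical.

```python
import math

EDGE_NAMES = ["vertical", "horizontal", "diagonal_45", "diagonal_135",
              "non_directional"]

# One 2x2 filter per edge type, written as (top-left, top-right,
# bottom-left, bottom-right). Values are assumed for this sketch.
EDGE_FILTERS = [
    (1.0, -1.0, 1.0, -1.0),
    (1.0, 1.0, -1.0, -1.0),
    (math.sqrt(2), 0.0, 0.0, -math.sqrt(2)),
    (0.0, math.sqrt(2), -math.sqrt(2), 0.0),
    (2.0, -2.0, -2.0, 2.0),
]


def classify_block(a, b, c, d, threshold):
    """Return the edge type index (0..4) of block [[a, b], [c, d]],
    or None if no filter response exceeds the threshold."""
    best_index, best_strength = None, threshold
    for index, (fa, fb, fc, fd) in enumerate(EDGE_FILTERS):
        strength = abs(a * fa + b * fb + c * fc + d * fd)
        if strength > best_strength:
            best_index, best_strength = index, strength
    return best_index


def edge_histogram(image, threshold=11.0):
    """Count edge types per sub-image; returns 16 * 5 = 80 bins.

    `image` is a list of rows of grayscale values, at least 8x8 pixels.
    """
    height, width = len(image), len(image[0])
    sub_h, sub_w = height // 4, width // 4
    if sub_h < 2 or sub_w < 2:
        raise ValueError("image must be at least 8x8 pixels")
    bins = [0] * (16 * len(EDGE_FILTERS))
    for row in range(4):
        for col in range(4):
            base = (row * 4 + col) * len(EDGE_FILTERS)
            # Scan non-overlapping 2x2 blocks inside this sub-image only.
            for y in range(row * sub_h, (row + 1) * sub_h - 1, 2):
                for x in range(col * sub_w, (col + 1) * sub_w - 1, 2):
                    edge = classify_block(image[y][x], image[y][x + 1],
                                          image[y + 1][x], image[y + 1][x + 1],
                                          threshold)
                    if edge is not None:
                        bins[base + edge] += 1
    return bins


# Vertical stripes: every 2x2 block contains a vertical edge.
striped = [[0 if x % 2 == 0 else 100 for x in range(8)] for _ in range(8)]
histogram = edge_histogram(striped)
print(len(histogram), histogram[:5])
```

Two images can then be compared by a bin-wise distance between their histograms, for instance the sum of absolute differences.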

Potential Users:

-           Image search (retrieval) by example or by sketch

-           Scene change detection

-           Key frame clustering

 

Contact: Soo-Jun Park

Email: psj@etri.re.kr

Senior Member of Engineering Staff

ETRI-CSTL

161 Kajong-dong, Yoosung, Taejon, 305-350, Korea

URL:    http://sir.etri.re.kr/~soop

(phone) +82-42-860-6899, (fax) +82-42-860-4889

 

00M7V006: Video Annotation and Summaries

 

1. Video annotation editor

The system can automatically generate video transcripts using speech recognition and make a correspondence between video scenes and words. The system can also detect scene change boundaries. The user of this system can modify automatically-generated transcripts and scene boundaries. The user can also annotate some keywords and comments on objects in video frames. The system generates XML-formatted annotation data that contains all information created through user interaction.

 

2. Video player with summarization function

The system can generate summaries of video clips with annotation data and play them. The user can input any keyword that will contribute to customization of video summaries. The player can also show transcript text synchronized with video like closed caption. The user can also select any scenes from the scene index window.

 

Contact: Katashi Nagao, HASIDA Koiti:

Email: KNAGAO@jp.ibm.com

IBM Tokyo Research Laboratory

Email:  hasida@etl.go.jp

Director of Information Science Division,

Electrotechnical Laboratory, (ETL), Ibaraki, Japan.

 

00M7V007: MPEG-7 Video Object Segmentation and Retrieval

We will present a video object segmentation system, AMOS, and a video retrieval and visualization application.

Currently, fully automatic segmentation of semantic objects is only successful in constrained visual domains. The AMOS system takes a powerful approach in which automatic segmentation is integrated with user input to track semantic objects in video sequences. For general video sources, the system allows users to define an approximate object boundary by using a tracing interface. Given the approximate object boundary, the system automatically refines the boundary and tracks the movement of the object in subsequent frames of the video. The system is robust enough to handle many real world situations that are hard to model in existing approaches, including complex objects, fast and intermittent motion, complicated backgrounds, multiple moving objects, and partial occlusion. For each video sequence, the description generated by this system is a set of semantic objects with the associated regions and visual features that can be manually annotated with text. Text annotations can also be assigned to the video sequence.

The video retrieval and visualization application developed during a Core Experiment within MPEG-7 uses the descriptions generated by AMOS to retrieve and visualize videos based on the annotations and visual features. This application supports (1) query by example based on any combination of visual features and text annotations (e.g., retrieve video sequences with similar objects based on color and texture); (2) query by keyword based on text annotations (e.g., retrieve video sequences with “elephant”); and (3) advanced visualization of the retrieved results based on panoramic views and segmented objects.

 

Contact: Ana Belen Benitez

Email: ana@ee.columbia.edu

Electrical Engineering Department

Columbia University, 1312 Mudd, #F6, 500 W. 120th St, MC 4712, New York, NY 10027

Voice: +1 212 854-7473  Fax: +1 212 932-9421

URL: http://www.ee.columbia.edu/~ana/

 

 

00M7V008: Hierarchical Summary Browser

Category:      Application of the Summary DS.

Features:                Summary Theme based Audio-Visual Summary Selection

                                Presentation Time based Audio-Visual Summary Selection

Abstract:               The Hierarchical Summary Browser is based on the Summary DS, which is in the category of navigation and access. The functionality of the proposed hierarchical summary browser includes dynamic audio-visual summary generation following the user’s selection of the summary theme and summary length in time. When users select a preferred summary length, the hierarchical level of the provided summary is automatically selected so that the length of the summary is closest to the user’s request. When users select a preferred summary theme, audio-visual summaries of various lengths with the selected theme can be dynamically generated, so that the user can select the length. Combined selection of theme and length is also available. Such a hierarchical summary browser can also be used in accordance with the user preference, so that the preferred theme and length are automatically selected based on the user preference.
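The level selection described in this abstract can be sketched in a few lines of Python: among the summaries that match the chosen theme, pick the one whose duration is closest to the requested length. The `SummaryLevel` record and the `select_summary` function are hypothetical names invented for this example; the abstract does not specify how the browser stores or searches its summaries.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SummaryLevel:
    theme: str         # summary theme, e.g. "goals"
    level: int         # position in the hierarchy (0 = coarsest)
    duration_s: float  # playback length of this summary in seconds


def select_summary(levels: List[SummaryLevel], requested_s: float,
                   theme: Optional[str] = None) -> Optional[SummaryLevel]:
    """Return the summary closest in length to the request.

    If a theme is given, only summaries with that theme are considered.
    Returns None when no summary matches the theme.
    """
    candidates = [s for s in levels if theme is None or s.theme == theme]
    if not candidates:
        return None
    return min(candidates, key=lambda s: abs(s.duration_s - requested_s))


# Illustrative data only.
LEVELS = [
    SummaryLevel("goals", 0, 30.0),
    SummaryLevel("goals", 1, 120.0),
    SummaryLevel("goals", 2, 600.0),
    SummaryLevel("interviews", 0, 45.0),
]

chosen = select_summary(LEVELS, requested_s=100.0, theme="goals")
print(chosen.level, chosen.duration_s)
```

Driving the same function from stored user preferences instead of interactive input would correspond to the automatic selection mentioned at the end of the abstract.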

 

00M7V009: Table of Contents (ToC) Browser

Category:      Application of the Segment DS and Graph DS.

Features:                ToC based Audio-Visual Content Navigation

                                Abstract/Detail* relation based Navigation

                                Cause/Effect** relation based Navigation

Abstract:               The ToC browser is based on the Segment DS and the Graph DS. The ToC browser provides a tree-structured interface to the selected content so that a user can select a segment of interest. Each segment is represented by a representative key frame, and the selected segment is summarized by a list of key frames. Based on the abstract/detail and cause/effect relationships defined using the Graph DS, a user can select segments that stand in an abstract, detail, cause, or effect relation to the current one. The abstract/detail relation links two segments, one of which is an abstract version of the other, while the latter is a detailed version of the former. The cause/effect relation links two events, one of which causes the other, while the latter is the result of the former.

 

*Abstract/detail are proposed normative types of relations

**The effect relation is equivalent to the result relation, which is a proposed normative relation type, and the cause relation can be considered the inverse of the result relation.
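One way to picture the relation-based navigation is a small labelled graph in which every relation is stored together with its inverse, so that a browser can step from a segment to its abstract, detail, cause, or result. The Python sketch below is an assumption-laden illustration: the `SegmentGraph` class, its methods, and the segment names are invented here and do not reflect the actual Graph DS syntax.

```python
# Each relation type and its inverse, following the footnotes above:
# "cause" is treated as the inverse of "result".
INVERSE = {
    "abstract": "detail",
    "detail": "abstract",
    "cause": "result",
    "result": "cause",
}


class SegmentGraph:
    """Toy relation graph between named segments or events."""

    def __init__(self):
        # Maps (segment, relation) to the list of related segments.
        self.edges = {}

    def relate(self, source, relation, target):
        """Record that `target` is the <relation> of `source`,
        and store the inverse relation as well."""
        self.edges.setdefault((source, relation), []).append(target)
        self.edges.setdefault((target, INVERSE[relation]), []).append(source)

    def follow(self, segment, relation):
        """Return the segments reachable from `segment` via `relation`."""
        return list(self.edges.get((segment, relation), []))


graph = SegmentGraph()
graph.relate("full_match", "abstract", "highlights")  # highlights abstract the match
graph.relate("foul", "result", "penalty_kick")        # the penalty results from the foul

print(graph.follow("highlights", "detail"))
print(graph.follow("penalty_kick", "cause"))
```

Storing both directions at insertion time keeps navigation a single lookup in either direction.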

 

00M7V010: SmartEye

Image Retrieval System with Relevance-Feedback based Image Characterization

Category:      Application of the MatchingHint DS.

Features:                Image retrieval using multiple descriptors with different weights. Automatic learning of MatchingHints from user feedback

Abstract:               Generally, relevance feedback has been used only to refine the query conditions in image retrieval. In our application, however, the use of relevance feedback is extended to the categorization of the image database so as to accommodate user-independent image retrieval. In our approach, to guarantee satisfactory performance for the user, the descriptors and the descriptor elements corresponding to the features of each image are weighted using the relevance feedback. We use the MatchingHint DS for weighting descriptors and the elements of each descriptor, based on color and texture descriptors. In addition, our system uses a learning method based on a reliability scheme that prevents wrong learning from wrong feedback.
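The idea of weighting descriptors and adjusting the weights from feedback can be sketched as follows. This is only a toy illustration under stated assumptions: the weighted-average distance, the inverse-distance update rule, and the `learning_rate` damping (a crude stand-in for the reliability scheme, whose details the abstract does not give) are all invented for this example, as are the function names.

```python
def combined_distance(distances, weights):
    """Weighted average of per-descriptor distances.

    `distances` and `weights` map descriptor names (e.g. "color",
    "texture") to non-negative numbers.
    """
    total = sum(weights.values())
    return sum(weights[name] * distances[name] for name in weights) / total


def update_weights(weights, relevant_distances, learning_rate=0.5):
    """Shift weight toward descriptors that place relevant images close.

    `relevant_distances` maps each descriptor name to the distances
    between the query and the images the user marked as relevant.
    `learning_rate` in [0, 1] damps the update so that a single round
    of (possibly wrong) feedback cannot overturn earlier learning.
    """
    scores = {}
    for name, values in relevant_distances.items():
        mean = sum(values) / len(values)
        scores[name] = 1.0 / (1e-6 + mean)  # small distance -> high score
    norm = sum(scores.values())
    return {
        name: (1.0 - learning_rate) * weights[name]
              + learning_rate * scores[name] / norm
        for name in weights
    }


weights = {"color": 0.5, "texture": 0.5}
# The user marked two images as relevant; color placed them closer.
feedback = {"color": [0.1, 0.1], "texture": [0.3, 0.3]}
weights = update_weights(weights, feedback)
print(weights)
print(combined_distance({"color": 0.2, "texture": 0.4}, weights))
```

If the starting weights sum to one, the updated weights do as well, so repeated feedback rounds keep the combined distance on a comparable scale.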

 

Contacts: Heon Jun Kim, Ph.D.,

Email: hjk@lge.co.kr

Senior MTS

Also: Kyoungro Yoon, Jin-Soo Lee, Jung-Min Song

MI Group, Information Technology Lab.

LG Corporate Institute of Technology, 16 Woomyeon-Dong, Seocho-Gu, Seoul, Korea 137-724

TEL:  +82 526 4132, FAX:  +82 526 4852

 

 

 

00M7V011: MPEG-7 Camera

In collaboration with the EPFL, FASTCOM Technology S.A. has developed, around its smart camera product, an MPEG-7 "standard" communication layer that makes the content of its video output searchable. We will present our ongoing development around MPEG-7 and the automatic metadata creation tools.

 

Contact: Nicolas Pican

Email: pican@fastcom-technology.com

FASTCOM Technology S.A.

 


MPEG-7 in the 21st Century

 

00MP7IFG001: Short Introduction

The panel, reflecting on the day’s presentations and demos, will discuss immediate issues concerning MPEG-7 applications for the marketplace, the MPEG-7 Industry Focus Group, and plans for future activities.

 

Contact: Neil Day and Witold Reichhart

Email: neil@garage.co.jp

Digital Garage Inc.

Manager, Strategic Research & Development Department,

Yamazaki Bldg. 5F, 2-43-15 Tomigaya, Shibuya-ku, Tokyo 151-0063, Japan.

Tel: +81-3-5454-7213, Fax: +81-3-5454-7218

Digital Garage: http://www.garage.co.jp, WebNation: http://www.webnation.co.jp

 

Witold Reichhart

Email: witold@starlab.net

Starlab Research Laboratories

Boulevard St-Michel 47, 1040 Brussels - Belgium

Tel : +32 2 7400 740

WWW: http://www.starlab.net

 

 

Panel Discussion

Panel Host:

Neil Day

 

Panel Members:

Philippe Salembier, Rob Koenen, Eric Rehm, Vincent Puig, Munchurl Kim, Witold Reichhart.

 


 


Profiles of Session Chairs and Presenters

 

 

Multimedia Session

 

John Smith:         

Session Chair                00M7M001                00M7M002

John R. Smith is currently Manager of the Pervasive Media Management Group at IBM T. J. Watson Research Center. His research interests include multimedia and multi-dimensional data management, compression, access and retrieval, and content-based query systems. Dr. Smith is an active participant in the MPEG-7 Multimedia Description Schemes Group and is chairing the development of the MPEG-7 Conceptual Model. He received his M.Phil. and Ph.D. degrees in Electrical Engineering from Columbia University in 1994 and 1997, respectively. At Columbia, he developed several image and video search and retrieval systems, including the WebSEEk image and video search engine and the VisualSEEk content-based image retrieval system. At IBM, he has developed a progressive video retrieval system called VideoZoom, and a new framework for adaptive compression, access and retrieval of large images, high-resolution documents and maps. Dr. Smith received the Eliahu I. Jury award from Columbia University for outstanding achievement as a graduate student in the areas of systems communication or signal processing. Dr. Smith is an Adjunct Professor at Columbia University and a member of IEEE.

 

Mikio Sasaki

00M7M003

Mikio Sasaki is a Project Leader in the Research Laboratories of DENSO CORPORATION, which is the largest automotive parts company in Japan. Mr. Sasaki is now in charge of R&D for media processing technologies used in IT equipment such as car navigation systems and mobile phones. He is especially engaged in the development of human-machine interfaces and media communication-based speech dialogue agents and their related data expressions such as MPEG-7. He is also very much interested in image understanding and has two US patents for 3D recognition for robotics and image coding.

He received a BS in Electronics from Kyoto University and an MS in Electronics from the University of Tokyo. Until October 1991, he had been working for YAMAHA, the well-known Japanese maker of musical instruments, and was also engaged in R&D at EMI for related digital circuits and MPEG-1 related image technologies.

 

Eric Rehm

Co-Chair of MPEG-7AE                Panel Member                00M7M004

Eric Rehm is cofounder and Chief Technical Officer of Singingfish.com. Singingfish.com has developed streaming media search services that are marketed and licensed to a broad range of high-traffic web portals, search and directory sites, broadband service providers, content aggregators, news organizations, entertainment networks and others. Eric is responsible for creating the company's technical vision and putting in place the system architecture that enables Singingfish.com search technology to function across PC, wireless, television and other computer and entertainment platforms.

Prior to founding Singingfish.com, Eric architected and implemented system software for Equator Technologies' MAP1000 multimedia processor. Before joining Equator, Eric served at Digital Equipment Corp. for nine years, where he worked as a principal engineer on the initial implementation of Windows NT on Digital's Alpha platforms.

Eric holds an MS in Computer, Information and Control Engineering from the University of Michigan, and a BS in Electrical Engineering from Purdue University. He has completed graduate work in Computer Science at the University of Washington.

 

Stephan Herrmann                             00M7M005

Affiliation:                Munich University of Technology,

                                Institute for Integrated Circuits

Resume:

Stephan received a Diploma degree in Electrical Engineering from the Berlin University of Technology in 1994.

In 1995 he joined the Heinrich Hertz Institute (HHI) in Berlin, and since 1996 he has been with the Institute for Integrated Circuits at the Munich University of Technology as a Research Assistant. His major interest is in algorithms and hardware architectures for image analysis and image segmentation. Since July 1999 Stephan has been Chairman of the MPEG-7 AHG for XM Development.

 

Audio Session

 

Vincent Puig

Audio Session Chair     Panel member                00M7A002          

Marketing Director, IRCAM - Centre Pompidou

With an initial background in Business Administration, he specialized in Technology Transfer and moved to New York for two years as Commercial Attaché at the French Embassy, in charge of electronics and software. In 1988, he became a consultant at Innovation 128, a company specializing in technology monitoring and technology transfer. In 1993, he started a new activity at IRCAM-Centre Pompidou as Marketing Director. At that time he created the Forum Ircam, a software user group intended for musicians, institutions and home studios fond of computer music. It currently gathers more than 1300 users worldwide. In 1995 he set up a telematic project for professional studios named Studio On Line, which was selected in the first "Information Highway" call for proposals of the French Ministry of Industry and completed with Java/Corba technologies in December 1998. It offers on the Web a large database of more than 120,000 instrument samples together with unique online processing functions. In June 1998, for the first edition of the Ircam festival, he set up a collaboration with CICV and ENSAM for the presentation of two virtual reality installations: Icare (Yvan Chabanaud, Roland Cahen) and the Cistercian Model (Catherine Ikam and Louis Fléri). Then in the same context he presented Coney Island (Robin Bargar, NCSA) in 1999 and, in 2000, “Elle et la voix” (Catherine Ikam, Louis Fléri, Pierre Charvet). From 1998 to 2000 he was coordinator of a European working group on content processing of music in relation to MPEG-7 (CUIDAD - Esprit program) and of a European group studying user needs and interfaces in musical libraries, in collaboration with BNF (Harmonica - Telematics program). Within CUIDAD he managed a team working on the instrument timbre Core Experiment. In 1999, he presented a new project on content-based audio and music retrieval called CUIDADO, which has recently been selected in the IST European call.
This project aims at developing a Music Browser with Sony France and a Sound Editing environment with CreamWare (Germany), both using automatically extracted audio descriptors. The Music Browser project has been supported by the most important copyright societies in Europe, since it should provide tools for automatic music recognition and monitoring of copyrighted recordings available on the Web.

 

Philip N. Garner, Wilson S.C. Chiu                00M7A001

Philip N. Garner is a researcher at Canon Research Centre Europe Ltd in the UK. His research interests include speech recognition, pattern recognition and statistics. Before joining CRE, Philip was at the Defence Evaluation and Research Agency in Malvern, UK, and studied Electronic Engineering at Southampton University. He is a Chartered Engineer.

 

Wilson S.C. Chiu is the Chief Software Architect at Canon Research Centre Europe Ltd in the UK. His role involves devising application and system architectures, and the design and implementation of prototype systems. Before joining CRE, Wilson worked as a software engineer at Vickers Medelec Ltd on the development of neurodiagnostic instruments. Wilson studied Electrical Engineering and received a PhD in visualisation of speech production at Southampton University. He did postdoctoral work in 3D Ultrasound Imaging at St. Thomas' Hospital in London.

  

Youngmoo Kim                00M7A003

Youngmoo Kim is a PhD candidate in the Machine Listening Group of the MIT Media Lab, where his research activities have focused on audio signal processing and digital audio coding. His primary research involves novel techniques for coding and synthesizing the singing voice. He is an active participant in the MPEG audio standards technical committee, having contributed to MPEG-4 audio and now MPEG-7. Youngmoo received an MS in Electrical Engineering and an MA in Music, both from Stanford University, and a BS in Engineering and a BA in Music, both from Swarthmore College.


 

Visual Session

 

Munchurl Kim

Session Chair                Panel Member                00M7V003          

Munchurl Kim received the B.E. degree in Electronics from Kyungpook National University, Korea, in 1989, and the M.E. and Ph.D. degrees in Electrical and Computer Engineering from the University of Florida, Gainesville, USA, in 1992 and 1996, respectively.

Since 1997 he has been with the Electronics and Telecommunications Research Institute (ETRI), Korea, where he is currently in charge of developing data broadcasting technology with MPEG-4/7 applications. His research areas include visual information processing, data broadcasting, and multimedia communications.

 

Karsten Muller                00M7V001

Karsten Muller (or Mueller) (IEEE M'98) received the Dipl.-Ing. degree from the Technical University of Berlin, Germany, in 1997. He has been with the Heinrich-Hertz-Institute, Berlin, since 1996, where he works on projects focused on motion and disparity estimation, representation of 2-D and 3-D shapes, and viewpoint synthesis. He has been involved in MPEG activities, creating, testing and cross-checking MPEG-7 visual descriptors. He currently develops algorithms that combine 2-D image and 3-D object representation.

 

Rosa Ruiloba                  00M7V002

Rosa Ruiloba is preparing a Ph.D. at the Laboratoire d'Informatique de Paris 6. She obtained the diploma of Electronic Engineer from the Escuela Tecnica Superior de Ingenieros de Telecomunicaciones of Valladolid University (Spain) and, in 1997, a Post-Graduate Degree in Image and Artificial Intelligence at the Ecole Nationale Superieure des Telecommunications de Bretagne. Currently, she works on video-to-shots segmentation algorithms, their evaluation and comparison, and contributes to MPEG-7 on the Video Editing DS.

 

Philippe Joly                00M7V002

Philippe Joly is an assistant professor at the Laboratoire d'Informatique de Paris 6, where he heads a research group on multimedia indexing. He works mainly on video feature extraction and on audiovisual content description. He obtained a Ph.D. in Computer Science in 1996 at the University of Toulouse III.

 

Erik Oltmans                00M7V004

Erik Oltmans works at the Telematica Instituut in Enschede, the Netherlands. His work is concerned with applied research on metadata issues, interoperability, streaming technologies and content engineering. He chairs the MPEG-7 Ad Hoc Group on Metadata Integration and is a work package manager within the Dutch Video-over-IP project.

 

Paul Porskamp                00M7V004

Paul Porskamp is a Senior Application Developer at the Telematica Instituut in Enschede, the Netherlands. His work is concerned with content management, with a special interest in content value chains and their relation to web-enabled systems and their architectures. He is responsible for all architecture issues in the Dutch Video-over-IP project, in which distributed content production environments and distributed deployment environments play a central role.

Soojun Park                 00M7V005

Affiliation: ETRI (1994 ~ present)

Position: Senior Member of Engineering Staff

Education:

M.S. degree in Computer Science at Lehigh University, U.S.A.

B.S. degree in Biochemistry at the University of Iowa, U.S.A.

Areas of interest:

MPEG-7, Content-based Information Retrieval, Natural Language Processing, Bio-Informatics.

 

Koiti HASIDA                00M7V006

Koiti HASIDA was born in 1958. He received his B.S. (1981), M.S. (1983), and D.S. (1986) degrees from the University of Tokyo. He joined ETL (Electrotechnical Laboratory) in 1986 and has been the director of its Information Science Division since 1999. He was also affiliated with ICOT (Institute for New Generation Computer Technology) from 1988 to 1992. His current research commitments include intelligent content and constraint-based natural language processing.

 

Katashi NAGAO                00M7V006

Katashi NAGAO was born in 1962. He received his B.E. (1985), M.E. (1987), and D.E. (1994) degrees from Tokyo Institute of Technology. He joined IBM Tokyo Research Laboratory in 1987 and Sony Computer Science Laboratory in 1991. Currently, he is a senior researcher at IBM Tokyo Research Laboratory and is conducting a research project on Semantic Transcoding of online content and Semantic Discovery from semantically annotated data. His major interests include natural language processing, human-computer interaction, and intelligent agents and robots.

 

Ana B. Benitez                00M7V007

Ana B. Benitez has been a Ph.D. candidate in the Department of Electrical Engineering at Columbia University, New York, USA, since 1996. She received her Telecommunications Engineer degree from the Polytechnic University of Catalonia (UPC) in Barcelona, Spain, in 1996. She was awarded a full scholarship for graduate studies in the United States by the Spanish financial institution "la Caixa". She received her M. Phil. degree from Columbia University in 1996. Her current research interests include integration of large distributed multimedia information retrieval systems and multimedia content representation. She is an active participant in the MPEG-7 standard, where she is the Chair of the AHG on MPEG-7 MDS Core Experiments and an Editor of the MDS Experimental Model and Working Draft documents. She is also a student member of IEEE and ACM.

 

Heon Jun Kim, Ph.D                00M7V008                00M7V009                00M7V010

Heon Jun Kim received his B.E. degree in Metallurgical Engineering from Yonsei University in 1988, and M.S. and Ph.D. degrees in Computer Science from Stevens Institute of Technology in 1996. He was previously a research engineer at the Media Communication Lab. of LG Electronics, where he developed a face recognition system and an image retrieval engine. Currently, he works for LG Electronics Institute of Technology as a Senior Member of the Technical Staff. He is in charge of the Content-based Multimedia Information Processing Project and participates in MPEG-7 standardization.

 

Kyoungro Yoon, Ph.D