Standard technologies to develop the multimedia business

Leonardo Chiariglione – Telecom Italia Lab
Convenor, ISO/IEC JTC 1/SC 29/WG 11 (MPEG)



After 20 or so years of talk, a combination of digital signal processing know-how, ubiquitous digital networks and inexpensive computing devices heralds the advent of the multimedia age. A lot of digital audio and video, the anteroom of multimedia, has happened in the last 10 years over traditional media like CD-ROM or broadcast networks. Multimedia over the Internet, however, has been a completely different story. After a good start with standard information representations such as HTML and related transport protocols such as HTTP, the following years have seen a plethora of proprietary technologies, all designed to exploit what looked like boundless riches. Whether riches have been found is an entirely different story.

Just as a standard text-and-graphics information representation decreed the success of the first-generation web, the next generation requires a standard "media" information representation. That standard already exists: its name is MPEG-4, and it provides the technical means to enable business opportunities for all.

Media standards worked in the past

Many mention the first MPEG standards as successful examples of standardised digital audio-visual technologies. Indeed, with about 100 million Video CD players and more than 100 million MPEG-2 decoders included in digital television set top boxes and DVD players all over the world, it would be hard to dispute such a statement. It is also often said that this success is based on the fact that MPEG-1 and MPEG-2 are standards for "hardware", therefore rooted in the old "manufacturing" mentality, without any hope of providing a valid example for the "software" future of multimedia. This is hardly the case, if one thinks of the several hundred million MPEG-1 audio-visual decoders and MP3 players, all in software, not to mention the tens of millions of MPEG-2 decoders shipped with Personal Computers.

The third MPEG project was given the title "Very low bitrate audio-visual coding". Some people confused it with the ITU-T project of that time on videotelephony for analogue telephone lines. Real-time person-to-person videocommunication, however, has never been a major preoccupation for MPEG. With its third project MPEG intended to address the needs of its traditional "distribution of audio-visual content" market. At that time the Internet was a tool commonly used by the MPEG community for private e-mail and as a support for ad hoc groups, but was hardly a candidate for sending audio-visual bits. The Web was in its infancy and was seen as an even more remote exploitation opportunity. Also, the deployment of digital mobile networks was just beginning.

MPEG embarked on the MPEG-4 project because it was clear that, while the "above 1 Mbit/s" market existed and MPEG-1 and MPEG-2 provided standard solutions, something had to happen below that bitrate. An MPEG standard in that bitrate range would become the enabler of new business opportunities, even though the actual shape of the market was unknown.

What is in MPEG-4

Over the years MPEG-4 has taken a rather complete shape. Like MPEG-2, however, MPEG-4 is not a monolithic standard but a tool-based one: users can assemble the tools to satisfy their own needs while still preserving a high degree of interoperability across applications.

The Visual part of the standard [2] offers a wide range of solutions for diverse application domains. Some of the most attractive technologies are:

1. The Simple Profile provides an optimum compromise between performance and complexity. This is the reason why the Simple Profile of MPEG-4 Visual is the solution of choice for the mobile industry.

2. When the highest video quality is the target and implementation cost is not an issue, the Advanced Simple Profile provides very high compression performance by using the most effective coding tools.

3. Other Profiles support "shape coding", probably the most innovative part of MPEG-4 Visual.

4. MPEG-4 Visual also offers standard solutions to animate human faces and bodies and other Synthetic Natural Hybrid Coding (SNHC) tools.

MPEG-4 Audio [3] supports the entire spectrum of bitrates from 2 to 64 kbit/s. In particular it features a very low bitrate parametric speech coder, a CELP narrowband/wideband speech coder and a transform coder for general audio signals. Moreover MPEG-4 Audio provides score driven synthesis and a Text-to-Speech interface.

MPEG-4 Systems [1] provides the ability to compose 2D and 3D audio and visual objects using BIFS, a binary version of an extension of the VRML 97 standard. MPEG-4 Systems also provides a powerful multimedia file format and the means (so-called "hooks") for plugging in proprietary external security modules, called Intellectual Property Management and Protection (IPMP) modules, which can be used to deliver content in protected form.
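The "hook" idea can be pictured with a minimal Python sketch. Everything in it is hypothetical and chosen for illustration only: the `IPMPModule` interface, the `Player` class, the `vendor-a` identifier and the XOR "protection" are assumptions, not the interfaces defined by MPEG-4 Systems. The point is only that the player knows where a protection module plugs in, while what the module does remains proprietary.

```python
from abc import ABC, abstractmethod


class IPMPModule(ABC):
    """Hypothetical plug-in interface; NOT the API defined by MPEG-4 Systems."""

    @abstractmethod
    def unprotect(self, payload: bytes) -> bytes:
        """Return the clear payload for a protected one."""


class XorDemoModule(IPMPModule):
    """Toy stand-in for a proprietary module. XOR is not real protection."""

    def __init__(self, key: int):
        self.key = key

    def unprotect(self, payload: bytes) -> bytes:
        return bytes(b ^ self.key for b in payload)


class Player:
    """Standard player with a hook where external modules are plugged in."""

    def __init__(self):
        self._modules = {}

    def register(self, system_id: str, module: IPMPModule) -> None:
        self._modules[system_id] = module

    def read_stream(self, payload: bytes, system_id=None) -> bytes:
        if system_id is None:
            return payload  # clear content needs no module
        module = self._modules.get(system_id)
        if module is None:
            # Content protected by a system this player has no module for.
            raise LookupError(f"no IPMP module installed for {system_id!r}")
        return module.unprotect(payload)


# Content protected by the (hypothetical) system "vendor-a".
protected = bytes(b ^ 0x5A for b in b"frame")

player = Player()
player.register("vendor-a", XorDemoModule(0x5A))
clear = player.read_stream(protected, "vendor-a")
```

In this sketch a player without the matching module cannot consume the content at all, which is the interoperability limit of a purely hook-based approach.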

Business models

The first approval of the MPEG-4 standard dates back to October 1998. While MPEG was busy developing MPEG-4, proprietary solutions for audio and video over the web were already being deployed. There are two main business models currently adopted by those proprietary solutions:

1. Authoring tools and servers are sold, while the client is given away for free. In this case the software is multiplatform.

2. Authoring tools, servers and client are all given away for free. In this case the software is for a single platform.

Today those who want to be in the business of offering audio and video are forced to choose a supplier with a proprietary technology controlled entirely by that supplier. The evolution of the technology is entirely in the hands of companies that may very well become competitors. When this happens it becomes exceedingly costly to switch from one technology provider to another, because all content developed so far must be converted to a new format, not to mention the user base, which has grown accustomed to a specific consumption device that must be replaced. With the increased attention paid by rights holders to the use that is made of their content, Digital Rights Management (DRM) technology is going to have an increasingly critical role. But these technologies, again, are very tightly linked to the underlying proprietary media technologies.

MPEG-4 brings to the fore an approach that is only apparently new, because it is the same one that brought about the success of Video CD, DVD and digital television. It is also the approach heralded by the Web, because HTML is a standard representation of formatted and hyperlinked text from which the entire WWW has been derived. MPEG-4 is a standard and openly accessible means to encode content and, just as users investing in web pages know that their investments will be protected, MPEG-4 users know that their technology investments will be preserved even if they decide, or are forced, to change their technology supplier.


It looks like an ideal solution, but no rose is without a thorn. For decades so many companies have invested, and still invest, in digital coding of audio and video that it is hard to build a state-of-the-art coding scheme that avoids using some patents. Because MPEG has always targeted the best solution, implementing MPEG standards in general requires the use of patents. Fortunately, ISO/IEC rules require that patent rights holders whose patents are necessary to implement an ISO/IEC standard make a statement that they are willing to license their Intellectual Property Rights (IPR) on fair and reasonable terms and non-discriminatory conditions (so-called RAND). It is said that implementing MPEG-2 requires about 100 patents. Getting the agreement of several tens of patent holders is no simple task. It took about 3 years, but eventually a patent pool for MPEG-2 Video and Systems was successfully established.

As in the MPEG-2 case, it has taken quite some time for the patent rights holders for the MPEG-4 Visual standard, Simple and Core Profiles, to be identified, to establish themselves as a group and to publish a first draft of a licence scheme. The scheme published is unusual in the sense that use of patents is charged both per encoder/decoder and per use on a time unit basis. The latter only applies when there is a flow of money associated with the content being consumed. Reactions to this scheme have been mixed and the industry is anxiously waiting for a final version. A different scheme has been proposed for MPEG-4 AAC (Advanced Audio Coding), which is based only on the encoder or decoder and not on its use. It is understood that work is still ongoing on a licensing scheme for MPEG-4 Systems, for other Visual profiles such as Advanced Simple, and for other Audio profiles.
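The structural difference between the two draft schemes can be sketched in a few lines of Python. The function names and, above all, the rates are placeholders invented for this illustration; they are not figures from any published licence. The sketch only encodes what is stated above: a per-device charge in both cases, plus a usage charge per time unit that applies to the Visual-style scheme only when money flows for the content.

```python
def visual_style_fee(devices, time_units, paid_content, device_rate, time_rate):
    """Device charge always applies; the usage charge only for paid content."""
    fee = devices * device_rate
    if paid_content:
        fee += time_units * time_rate
    return fee


def audio_style_fee(devices, device_rate):
    """Charge based on encoders/decoders only, independent of use."""
    return devices * device_rate


# Placeholder rates (hypothetical, in arbitrary currency units).
DEVICE_RATE = 0.25
TIME_RATE = 0.5

paid_service = visual_style_fee(1000, 400, True, DEVICE_RATE, TIME_RATE)
free_service = visual_style_fee(1000, 400, False, DEVICE_RATE, TIME_RATE)
device_only = audio_style_fee(1000, DEVICE_RATE)
```

With these placeholder numbers the same 1000 decoders cost the same under both schemes as long as the content is free; only the paid service carries the additional usage-based term.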

Organisations supporting MPEG-4

The MPEG-4 Industry Forum (M4IF) was established in 2000 as a not-for-profit association to promote the use of the MPEG-4 standard. One of its first activities was to promote the establishment of MPEG-4 related patent pools. However, the actual establishment of the patent pools and the development of the licensing schemes have taken place strictly outside of M4IF. Other M4IF activities concern the execution of interoperability tests between different vendors and the definition of a self-certification scheme.

Another not-for-profit association with close links with MPEG is the Internet Streaming Media Alliance (ISMA). ISMA’s 1.0 specification makes use of MPEG-4 Visual Simple Profile Level 1, MPEG-4 AAC and a simple form of Systems layer.

The community power

MPEG has always published software implementations of its standards, but MPEG-4 was the first standard to have a reference implementation, in C, C++ or Java, with a normative value on a par with the traditional text version [5]. This software has been developed with a sort of "open source" process [7]. MPEG makes no claim, however, that the software has been optimised, either in terms of compression efficiency or execution time. The reference software of the Audio part of the standard has been further enhanced thanks to a pool of companies that have funded a professional developer to revise the software.

Anybody can use the reference software and modify it for use in products conforming to the MPEG-4 standard, with the usual disclaimer that use of the software may infringe third-party patents.

The very existence of this reference software has been of great help to promote use of the standard, as it is known that several products currently available on the market originated from the reference software. Another interesting development has been a revision of some software modules, such as motion estimation, that feature improved execution efficiency by more than one order of magnitude. The latest achievement has been a revision of the MPEG-4 Visual software donated by the MoMuSys project to provide real-time performance.

The latest project in this area is the development of a library of VHDL modules for MPEG-4 Video. The idea is to make these modules available to users under similar conditions as the standard reference software. Unlike the latter, these modules will only have an informative value.

Join the growing list of users

The list of companies that see MPEG-4 as an important element in their media strategy is growing. Just check the M4IF web site to see who has joined the association. Recently M4IF announced that 29 MPEG-4 vendors have successfully carried out three rounds of interoperability tests of products based on the MPEG-4 standard. There are also multiple examples of companies with MPEG-4 products, especially video streaming for mobile in the 2.5G and 3G environments. One company has successfully established itself as the reference name for movie compression by using a combination of MPEG-4 Video and MP3. Another company has integrated an MPEG-4 Video encoder in a DV camera. Many other projects to develop MPEG-4 based multimedia solutions are known to be under way.

Not a static standard

MPEG-1 and MPEG-2 have received very few additions since they were first adopted as International Standards. The dynamics of the Web and other distribution outlets, however, could ill afford basing new businesses on a static standard.

Since the approval of Version 1 (in October 1998) and Version 2 (in December 1999) MPEG-4 has received a number of important extensions. This is a partial list:

  1. FlexTime, a technology providing a means to achieve synchronisation of objects from multiple sources with possibly different time bases, and the linking of one to another in a time graph using relationship constraints such as "CoStart", "CoEnd", or "Meet".
  2. Extensible MPEG-4 Textual format (XMT), a technology providing a dual textual (XML) and binary representation of composition information. XMT acts as a bridge towards two multimedia composition standards: SMIL and X3D from the Web3D Consortium (the new name of the VRML Consortium).
  3. Studio Profile, an extension of the basic MPEG-4 Video syntax for very high bitrate video (both standard and high definition), such as found in studio applications.
  4. Fine Granularity Scalability is a tool that allows small quality steps by adding or deleting layers of extra information. It is useful in a number of environments, notably for streaming purposes but also for statistical multiplexing of pre-encoded content in broadcast environments.
  5. Carriage of MPEG-4 content on MPEG-2 streams, which defines how MPEG-4 objects can be carried on MPEG-2 Transport Streams for purposes such as enhancing standard MPEG-2 applications with rich MPEG-4 content.
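The FlexTime relationships named in the list above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class and function names are hypothetical, durations are treated as fixed, and the readings of the three relationships ("CoStart" means two objects start together, "CoEnd" means they end together, "Meet" means one starts when the other ends) are the natural ones rather than quotations from the standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MediaObject:
    name: str
    duration: float                # seconds on the shared presentation clock
    start: Optional[float] = None  # unknown until resolved

    @property
    def end(self) -> Optional[float]:
        return None if self.start is None else self.start + self.duration


def resolve(objects, constraints):
    """Propagate start times from placed objects to unplaced ones."""
    changed = True
    while changed:
        changed = False
        for relation, a_name, b_name in constraints:
            a, b = objects[a_name], objects[b_name]
            if a.start is None or b.start is not None:
                continue  # only propagate from a placed `a` to an unplaced `b`
            if relation == "CoStart":
                b.start = a.start
            elif relation == "CoEnd":
                b.start = a.end - b.duration
            elif relation == "Meet":
                b.start = a.end
            else:
                raise ValueError(f"unknown relation {relation!r}")
            changed = True
    return objects


# Hypothetical objects coming from different sources.
objects = {
    "video": MediaObject("video", duration=10.0, start=0.0),
    "audio": MediaObject("audio", duration=8.0),
    "jingle": MediaObject("jingle", duration=2.0),
    "credits": MediaObject("credits", duration=3.0),
}
constraints = [
    ("CoStart", "video", "audio"),   # audio starts with the video
    ("CoEnd", "video", "jingle"),    # jingle ends with the video
    ("Meet", "video", "credits"),    # credits start when the video ends
]
timeline = resolve(objects, constraints)
```

Only the video has an explicit start time here; the other objects are placed purely through their relationships to it, which is the essence of linking objects in a time graph.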

More developments are under way:

  1. Multi-user worlds, a technology that will give content creators the ability to create multi-user worlds and applications without being dependent on specific, specialized service providers, and will give users the ability to navigate multi-user worlds from various vendors with the same tools.
  2. Animation Framework eXtension (AFX), a technology that will provide an effective means to produce and consume rich 3D synthetic spaces.
  3. Advanced Video Coding, the project carried out jointly with the ITU-T to provide enhanced video compression capability.
  4. Audio backward compatible bandwidth extension will provide enhanced audio quality compared to MPEG-4 AAC at the same bitrate, while still being decodable by existing MPEG-4 AAC decoders.
  5. Parametric Audio Extension will provide similar features for MPEG-4 CELP speech codecs.
  6. Lossless Audio will provide a mathematically lossless coding scheme built on top of MPEG-4 AAC.

The value of content

The MP3 phenomenon has taught a number of valuable lessons. One of them is that content in digital form can be replicated so easily that content itself loses its (monetary) value. Many have tried to offer solutions where content is protected. However, users are loath to buy content in that form, not so much because there are no-cost alternatives but because content consumption becomes so awkward. The MPEG-4 IPMP extension aims at overcoming the current limitations of the MPEG-4 "hook" approach and will enable an MPEG-4 decoder to install IPMP tools in a secure fashion, thus providing a higher level of interoperability with protected content.

Even though the activity is taking place in the MPEG-21 project, there is another technology element under development, called Rights Expression Language (REL). REL will provide a standard way to express the rights associated with a piece of content.

Finding content

Being able to entice prospective users of your service to actually consume is becoming more and more important. This is already happening in the television business, but as the trend of delivering content through more distribution outlets continues, there will be a growing need to be able to attach metadata that are sufficiently generic to be useable in multiple contexts. The MPEG-7 standard, finally approved in July 2001, is the solution to the multi-delivery scenario described.

One of the features of MPEG-4 is Object Content Information (OCI). This can be used to convey MPEG-7 metadata to enable richer forms of content consumption.

A network of standards bodies

Even though few would say that the current partitioning of competences of standards bodies is ideal, not many would dispute the fact that the major standards bodies do play an important role in the technical areas of their competence.

In the past MPEG had to develop its own "Transport layer", because it was necessary to provide a practical solution to those in need of converting their analogue television services into digital. With Internet Protocol (IP) becoming the transport protocol of choice for many media, it becomes necessary to establish practical means to work with the Internet Engineering Task Force (IETF). This is what is currently happening for the carriage of MPEG-4 content over IP networks.

Standards for competition

The reader of this paper may be interested in the business opportunities offered by the wide range of technologies provided by MPEG-4 and associated standards, but the looming question by the keen reader will probably be: "If anybody can have access to these technologies, how can I build a business today that will not be stolen by a competitor tomorrow?". Such a question would betray a view that has been prevailing in the last few years: you have a smart idea, you patent it, you set up a business based on the monopolistic control of your invention and you conquer the world. Fortunately, after the carnage of many high-flying dotcoms witnessed in the last couple of years, this view is rather discredited. What this view has brought is the creation of many different undertakings, all of minuscule size, all addressing similar business areas, all making competing propositions to a non-existing audience.

What MPEG-4 brings is a different view of the multimedia business: instead of trying to create from scratch a business enabled by a proprietary technology, with the hope of controlling it monopolistically, first create the business enabled by a common technology. Getting a slice of a non-existing business may be easy, but it is a zero business. Getting a slice of a huge existing business may be difficult, but it is greater than zero.


[1] ISO/IEC 14496-1:2001, Information technology -- Coding of audio-visual objects -- Part 1: Systems

[2] ISO/IEC 14496-2:2001, Information technology -- Coding of audio-visual objects -- Part 2: Visual

[3] ISO/IEC 14496-3:2001, Information technology -- Coding of audio-visual objects -- Part 3: Audio

[4] ISO/IEC 14496-4:2000, Information technology -- Coding of audio-visual objects -- Part 4: Conformance testing

[5] ISO/IEC 14496-5:2000, Information technology -- Coding of audio-visual objects -- Part 5: Reference software

[6] ISO/IEC 14496-6:2000, Information technology -- Coding of audio-visual objects -- Part 6: Delivery Multimedia Integration Framework (DMIF)

[7] L. Chiariglione, Open source in MPEG, Linux Journal, 2001/03