Digital Rights Management, Interoperability and MPEG-21
Leonardo Chiariglione, Convenor

It is part of the MPEG mission to make standards that anticipate market needs. Forecasts, however, are notoriously difficult, as the saying goes, especially when they relate to the future.

There are probably few areas for which this saying is as accurate as the area of Digital Rights Management (DRM). It is clear that rights holders cannot simply release their digital assets as they used to in the analogue age. It is also clear that end users will resist, by whatever means available, all attempts to curtail their freedoms. Then there are the differing corporate interests, ranging from end-user device manufacturers, to fixed and mobile telecommunication operators, internet service providers and various brands of broadcasters, to technology companies...

Yes, anticipating and harmonising the needs of all these warring interests in a standard is a desperate enterprise. Still, the need is there and backing off just because the problem is intractable is not an option, at least not for MPEG.

In a sense, six years after the MPEG-21 “Multimedia Framework” standard (ISO/IEC 21000) officially started in June 2000, MPEG does not have a DRM solution to offer. Actually, the very word “framework” in the title of the MPEG-21 standard conveys the message that providing a “standard DRM solution” was not even the goal.

MPEG-21 is a collection of technologies (in the following called “tools”) from which designers of DRM solutions, and more generally of multimedia systems, may wish to draw. In this paper I will highlight what the MPEG-21 DRM tools are about and how MPEG is integrating them into application-specific DRM solutions called Multimedia Application Formats (MAF). I will also briefly mention the role of the Digital Media Project, which has taken a number of MPEG-21 tools, added a few more of its own, and provided an interoperable DRM solution for streaming of governed content.

  1. Since the very beginning of the MPEG-21 project it was clear that the multimedia framework needed a basic technology to package various types of information ancillary to the use of resources (say, the video and audio that end users will eventually enjoy). The foundational element of MPEG-21 is then the definition of a structure – called Digital Item (DI) – that can flexibly accommodate the many components of a multimedia object. This includes, of course, the resources (either in-line or referenced), but also identifiers, metadata, encryption keys, licenses etc. The specification of this structure is provided by MPEG-21 Part 2 Digital Item Declaration (DID).
  2. Another key component of the framework is called Digital Item Identification. This is needed because in the digital space everything needs to be uniquely and unambiguously identified in order to be managed. In the MPEG-21 framework this function is provided by Part 3 Digital Item Identification (DII), a standard to handle identifiers in Digital Items. Note that this is separate from the issue of identifying resources, for which the process is well established and running.
  3. A Digital Item can contain resources or even portions of a Digital Item that are protected. The component technologies that are needed to handle those resources, i.e. to make them available in a form that can be processed by a device, are standardised by MPEG-21 Part 4 Intellectual Property Management and Protection (IPMP) Components.
  4. The next component of the framework is a technology that will enable devices in the digital space to understand and possibly process licenses in much the same way as humans do with licenses in the real world. The difference is that the latter are expressed in natural language and are valid for a given jurisdiction, while the former must be expressed in a digital form so that they can be processed by a machine. Part 5 Rights Expression Language (REL) provides the technology to express rights in a rich form that is comparable to the richness of human language.
  5. The REL is capable of expressing the syntax of a rights expression but says nothing about the semantics of the “verbs” (e.g. copy, store, display etc.) that are employed by the language (even though the MPEG REL provides the semantics of a few key verbs). A standard semantics for a large number of verbs commonly used in the media environment in general and their relationships is given by Part 6 Rights Data Dictionary (RDD).
  6. When a Digital Item and its resources are transported over the network, it may be necessary to adapt them to varying network conditions (e.g. by reducing the bitrate) or to the capabilities of the receiving device (e.g. by subsampling). MPEG-21 Part 7 Digital Item Adaptation (DIA) specifies the syntax and semantics of the tools that may be used to assist in the adaptation of Digital Items, metadata and resources.
  7. A Digital Item is a static XML structure that contains all elements necessary to describe the resources contained in it. However, a Digital Item does not natively provide a way for a creator to suggest how a user can interact with the Digital Item. Providing this additional information is the scope of Part 10 Digital Item Processing (DIP).
  8. Certain application domains require a technology that can generate a report every time an event occurs, e.g. a Digital Item is processed. The technology achieving this is specified in Part 15 Event Reporting (ER).
  9. There are cases where it is necessary to identify a specific fragment of a resource as opposed to the entire set of data. Part 17 Fragment Identification (FID) specifies a normative syntax for URI Fragment Identifiers to be used for addressing parts of a resource from a number of Internet Media Types.
  10. A Digital Item is an XML structure that can be moved from one device to another “as is”. However, it may be convenient to wrap that structure in a standard file format because in this case a device knows, by virtue of the definition of the file format itself, where specific Digital Item structures can be found. MPEG-21 Part 9 File Format provides a solution to transport a Digital Item in a file. Similarly there is a need to transport Digital Items over a streaming mechanism (e.g. in broadcasting over MPEG-2 Transport Stream or over IP networks). Part 18 Digital Item Streaming (DIS) provides the technology to achieve this when the streaming mechanism employed is MPEG-2 Transport Stream and RTP/UDP/IP.
  11. MPEG-21 also provides a number of other components. The first is Part 1 Vision, Technologies and Strategy, a Technical Report laying down the scope and development plan of the MPEG-21 project. The second is another Technical Report, Part 11 Evaluation Tools for Persistent Association (PAT), which provides the means to evaluate how well a given PAT (i.e. a technology establishing associations between resources and certain related metadata, using such techniques as “watermarking” and “fingerprinting”) fulfils the requirements of the intended application. The third, Part 16 Digital Item Binary Format, allows the lossless conversion of a typically bulky XML document to a binary format, preserving the ability to efficiently parse the binarised XML format.
  12. Lastly, and in keeping with the policy that MPEG has consistently applied to all its standards, there are two parts of MPEG-21 that are dedicated to Reference Software (Part 8) and Conformance (Part 14). The purpose of the latter is to provide the necessary test methodologies and suites to be used to assess the conformity of a bitstream (typically an XML document) and a decoder (typically a parser) to the relevant Part of the MPEG-21 standard. Parsers are derived from the Reference Software.
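
As a rough illustration of items 1 and 2 above, a Digital Item carrying one identifier and one referenced resource could be assembled as in the Python sketch below. The element names (DIDL, Item, Descriptor, Statement, Component, Resource, Identifier) and the two namespace URIs are given here as assumptions to be checked against Parts 2 and 3 of the standard; the helper names, the identifier and the resource URL are made up for the example. This is a minimal sketch, not a conformant implementation.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as commonly seen in DID/DII documents (assumption: verify
# against MPEG-21 Parts 2 and 3 before relying on them).
DIDL_NS = "urn:mpeg:mpeg21:2002:02-DIDL-NS"
DII_NS = "urn:mpeg:mpeg21:2002:01-DII-NS"

ET.register_namespace("didl", DIDL_NS)
ET.register_namespace("dii", DII_NS)


def q(ns, tag):
    """Return the namespace-qualified tag name ElementTree expects."""
    return f"{{{ns}}}{tag}"


def build_digital_item(identifier, resource_ref, mime_type):
    """Build a Digital Item with one identifier and one referenced resource.

    Hypothetical helper: the structure is illustrative only.
    """
    didl = ET.Element(q(DIDL_NS, "DIDL"))
    item = ET.SubElement(didl, q(DIDL_NS, "Item"))

    # The identifier travels inside a Descriptor/Statement pair.
    descriptor = ET.SubElement(item, q(DIDL_NS, "Descriptor"))
    statement = ET.SubElement(
        descriptor, q(DIDL_NS, "Statement"), {"mimeType": "text/xml"}
    )
    ident = ET.SubElement(statement, q(DII_NS, "Identifier"))
    ident.text = identifier

    # The resource is referenced here rather than carried in-line.
    component = ET.SubElement(item, q(DIDL_NS, "Component"))
    ET.SubElement(
        component,
        q(DIDL_NS, "Resource"),
        {"mimeType": mime_type, "ref": resource_ref},
    )
    return didl


if __name__ == "__main__":
    di = build_digital_item(
        "urn:example:song:0001",        # made-up identifier
        "http://example.com/song.mp3",  # made-up resource location
        "audio/mpeg",
    )
    print(ET.tostring(di, encoding="unicode"))
```

Licences, keys and further metadata would be added as additional descriptors or components in the same way, which is what makes the structure a convenient container for the other tools in the list.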

Where are we with the development of the MPEG-21 standard? Right now only three parts are still under development and only a handful are still being extended. There are talks within MPEG about providing a “presentation layer” for the Digital Item technology, but it is fair to say that the bulk of the MPEG-21 work is by now largely completed. So it is appropriate to ask: what are the reasons for the industry to adopt MPEG-21?

  1. It is entirely based on XML (the only exception being the file format), the technology of choice in IT standardisation, so that adoption of MPEG-21 technologies in a number of IT standards can be seamlessly achieved.
  2. It provides a broad and effective set of technologies
    1. The very flexible Digital Item technology
    2. The most comprehensive set of DRM technologies: DII, IPMP Components, REL and RDD
    3. A range of ancillary technologies: DIA, File Format, DIP, ER, FID and DIS
    4. Conformance and reference software to enable a horizontal market
  3. The technologies have been proved to lend themselves to integration
  4. The technologies can be reconfigured for use in a number of environments.

This ability of MPEG-21 to provide technologies à la carte to suit DRM solution designers’ needs is one of the strong points but it is at the same time a weak point. Indeed today there is no such thing as an MPEG-21 “content/product/service”. There is none because there is no such thing as a “universal DRM system”, there are only implementations built to satisfy particular needs.

But things are changing. For a few years now MPEG has been working on a new line of standards identified as ISO/IEC 23000 with the name of Multimedia Application Formats (MAF). Unlike all other MPEG standards, which provide tools, MAFs are complete solutions designed to satisfy specific application needs. MAFs are typically made up of MPEG technologies, and only occasionally do non-MPEG standards become part of the cocktail.

The first MAF standard is Part 2 of ISO/IEC 23000 and is called Music Player MAF (MP MAF). It essentially defines a file format based on MPEG-21 Part 2 DID and MPEG-21 File Format containing audio resources encoded in MP3, photographs encoded in JPEG and ID3 metadata encoded as MPEG-7. The MPEG-21 DID technology enables the definition of attractive multimedia packages that humans call “music albums”. The MP MAF is currently being extended with the inclusion of DRM tools drawn from the MPEG-21 toolkit: basically IPMP Components and REL. The resulting file format can be used as the subject of transactions for, say, music acquired online on the web. The Protected Music Player MAF has just been promoted to Committee Draft (1st stage of public inquiry) and will reach the final approval stage in April 2007.

At the July 2006 meeting MPEG received a proposal from the Digital Media Project (DMP) entitled Media Streaming MAF (MS MAF). This is the result of a process initiated in July 2003 that spawned the Digital Media Manifesto and then the Digital Media Project itself, a not-for-profit organisation based in Geneva with the mission to promote continuing successful development, deployment and use of digital media.

DMP has produced two sets of specifications, the latter of which, called Interoperable DRM Platform Phase II (IDP-2) and available from DMP, provides most of the technologies that are needed to set up value chains handling governed content in a streaming environment. A sizeable part of the technologies specified by IDP-2 is based on MPEG-21, and basically only those functionalities that are not supported by MPEG-21 (e.g. domain management) are DMP native.

The DMP specification is itself a toolkit. However, unlike MPEG-21, the tools have already been integrated. What a user of the specification needs to do is to tailor IDP-2 to his specific needs. Chapter 4 of IDP-2 provides a number of use cases where this tailoring process is carried out for a range of specific applications.

DMP tailored IDP-2 in this way when it proposed MS MAF to MPEG. The proposal covers the specification of the “streaming format” of the information reaching a “Media Streaming Player” (e.g. a set top box) and the protocols exchanged between a Media Streaming Player and a Content Provider Device, a Licence Provider Device, a DRM Tool Provider Device and a Domain Management Device.

A working draft of the MS MAF standard, based on the DMP proposal, is already available. The final stage will be reached in October 2007 as Part 5 of ISO/IEC 23000.

More MAFs are being considered by MPEG. One of them is the Portable Video MAF (PV MAF), conceptually the equivalent of the Music Player. Another is the Open Release MAF (OR MAF).

It is worth spending some time on the latter MAF because the case is indicative of the flexibility of the MPEG-21 toolkit. Indeed, while the MPEG-21 tools, if properly implemented, can be used to design DRM solutions that satisfy the most concerned of rights holders, the same tools can be employed to develop solutions that have much less stringent security requirements.

Imagine that Eurydice, the author and performer of a song, is interested in releasing it digitally with, say, a Creative Commons (CC) licence, itself digitally represented. This is already possible today (there is an RDF representation of the CC licences), but imagine that Eurydice decides to express the licence using the MPEG-21 REL for a variety of reasons, such as

  1. ease of processing with IT devices
  2. ability to “say more” in future licences while keeping the same rights expression technology
  3. native use of the MPEG-21 Digital Item technology
  4. etc.

This scenario entails a number of problems that MPEG is currently addressing:

  1. While the CC licence has legal value when expressed in the form required for a specific jurisdiction, the licence expressed in REL cannot achieve the same goal (it can only represent the “intentions”).
  2. The CC licence grants users their “fair use” (in the jurisdictions where this legal figure is supported) of the content. It would seem that a device that “enforces” the terms of the licence would contravene the human-readable CC licence; it is clear, however, that the user would have an easy way to bypass the device restrictions because the resources are unencrypted.
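
To make the idea of a machine-processable licence concrete, the Python sketch below models Eurydice's licence as a list of grants, each naming a principal, a right, a resource and optional conditions, and lets a device ask whether a given use is covered. It is a simplified model and not REL syntax: the class names, the verbs (“play”, “copy”, “adapt”), the “attribution” condition and the identifier are all hypothetical, and the terms are only loosely modelled on an attribution-style licence.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Grant:
    """One permission: who may do what to which resource, and on what terms."""
    principal: str          # "anyone" in an open-release scenario
    right: str              # a verb; hypothetical, not taken from the RDD
    resource: str           # identifier of the governed resource
    conditions: tuple = ()  # conditions that must all be satisfied


@dataclass
class Licence:
    issuer: str
    grants: list = field(default_factory=list)

    def permits(self, right, resource, satisfied=()):
        """Return True if some grant covers the requested use.

        This only reports the issuer's stated intentions; nothing here
        prevents a user from accessing an unencrypted resource directly.
        """
        met = set(satisfied)
        return any(
            g.right == right
            and g.resource == resource
            and set(g.conditions) <= met
            for g in self.grants
        )


def eurydice_licence(song_id):
    """Build an illustrative open-release licence (made-up terms)."""
    return Licence(
        issuer="Eurydice",
        grants=[
            Grant("anyone", "play", song_id),
            Grant("anyone", "copy", song_id, conditions=("attribution",)),
        ],
    )


if __name__ == "__main__":
    song = "urn:example:song:0001"  # made-up identifier
    licence = eurydice_licence(song)
    print(licence.permits("play", song))
    print(licence.permits("copy", song))
    print(licence.permits("copy", song, satisfied=("attribution",)))
    print(licence.permits("adapt", song))
```

A device could consult such a structure before acting, but, as noted in the two points above, with unencrypted resources the result is advisory rather than enforced.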

As is clear from the above, the MPEG-21 standard is still on the launching pad for broad adoption. There are well-known competitors both in the proprietary solutions camp and in the standard solutions camp, but I am confident of the eventual success of this standard because the world craves DRM solutions that are:

  1. Flexible, i.e. users can build any value-chain, suiting their own business models and without strings attached from alien business models
  2. Cost effective, i.e. users can draw from a horizontal market of standard technologies and solutions
  3. Evolvable, i.e. users can easily expand the functionality of their value chain
  4. Interoperable, i.e. users can interoperate between value chains made of the same basic technologies