Riding the Media Bits | chiariglione.net

Tagging information

Last update: 2005/03/08

A key technology to add textual descriptions to other forms of data.

The form of communication enabled by the 26 letters of the Latin alphabet is very effective, but in general it leaves out a wealth of other information that was present in the original (multimedia) message. If the sequence of characters is the transcription of a TV interview with a politician, the text will miss the inflexion of his voice, the grimness of his face and the body gestures that may actually convey more information than the words themselves (usually quite meaningless). To cope with this limitation, over the centuries people found it necessary to add a number of characters or combinations of characters, such as !, ?, …, !?, etc., to the 26 original letters to make the message richer and its interpretation more meaningful. Other conventions have also been used, such as writing words in capital letters, underlining them or writing them in bold or italic. Particularly with the advent of the Internet, emoticons like :-) have become quite popular.

One way to schematise the above is to separate the content of a message into two parts: what can be expressed with characters, and the rest. This is of course a very "character-centric" view of the world, one that betrays the attempt to build the complexity of multimedia communication from the bottom up, starting with characters. This approach, espoused by computer scientists, draws its motivation from the fact that characters have already been integrated in computers.

There is a philosophical basis to this. Saint John's Gospel starts by affirming that "In the beginning was the Word", where "word" is logos in the Greek version of the Gospel, so it can be interpreted to mean that everything started from rationality. Maybe (not the Gospel, the interpretation), but this is not what we experience in our daily life. The rationalisation of the world that gives rise to our words is a constant effort at minimising the impoverishment, if not the distortion, of reality that our words represent.

The separation advocated by computer scientists may have grounds in the Latin alphabet, but it is largely lost when people communicate using Chinese characters, where the very way the writer of the message uses his brush conveys something about his feelings, or in a message written in Japanese, where the very fact that certain Chinese characters have been used instead of hiragana or katakana adds information.

Back to technology, markup is the name given by IT people to information that is "additional" to text. A human can use it to get a better clue as to the real meaning of the words pronounced by another human, a computer can use it to perform appropriate processing and a printer can use it to present some text in bold to catch the reader's attention.

One of the reasons we have hundreds of printer drivers in our computers is that every printer uses its own special codes to make titles appear large, bold and centred, to make paragraphs of a certain width with a bullet and an indent, and so forth. The situation has not changed much since the days when Linotype machines were in use. But that situation was understandable, if not commendable, because Linotypes were closed machines with no need or intention to let them communicate with other machines. This behaviour was nothing but the continuation of a practice that dates back centuries, when markup codes were used in manuscripts to give instructions to typesetters. The markup codes were meaningful only in the industry in which they were used, maybe even specific to one particular publisher.

In the early 1980s TC 97 of ISO started working on a markup language that eventually became the Standard Generalized Markup Language (SGML), or ISO 8879:1986. An SGML document is composed of content, made up of characters, and markup, made up of markup characters. To distinguish between the two, SGML inserts delimiter characters to indicate markup information.
Two commonly used delimiter characters are the opening ("<") and closing (">") angle brackets. A tag is then expressed as <anything>, the <> characters being the delimiters and "anything" being the markup code. The software processing the document will then know that the characters between "<" and ">" should be read in TAG mode, while the others should be read in CON (i.e. content) mode.

At the beginning, the group developing SGML thought of defining a set of universal tags. With this idea, once, say, "P", "BR" and "H1" had been standardised, <P> would always mean a new paragraph, <BR> would always mean a line break and <H1> would always indicate a first-level heading. This is the usual dilemma confronting those developing IT standards: should we have something that is of immediate use, solving at least the most basic communication problem, or something that just gives the general rules that everybody can then customise? In the IT world the answer is regularly the latter, because "if there is something that I can do immediately, why should I share it with my competitors and create a level playing field?" SGML was no exception, and it was decided that SGML would not contain a set of standardised codes, but just a language that could be used to create a Document Type Definition (DTD). The DTD would define precisely the tags that would be used in a specific document.

SGML has been, in a sense, a successful standard, but only in closed environments, like some major printing organisations. There was no way such a complicated arrangement would work for the mass market in which the companies that developed Ventura, Word, WordPerfect and WordStar, to name a few, battled for years.

The SGML arrangement, however, did not suit Tim Berners-Lee when he needed a simple way to format web pages. He developed HTML, a simple application of SGML that uses a standard set of tags understood by all browsers. This shows that when IT people address a mass market, they know what they must do.
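To make the TAG and CON reading modes described above concrete, here is a minimal sketch in Python. It is only an illustration of the delimiter idea, not an SGML parser: real SGML also has attributes, entities, comments and configurable delimiters, none of which are handled here. The function name split_markup is an invented, hypothetical name.

```python
def split_markup(text, open_delim="<", close_delim=">"):
    """Split text into ("TAG", code) and ("CON", characters) pieces.

    Illustrative only: a two-state reader that switches mode whenever
    it meets a delimiter character.
    """
    pieces = []
    mode = "CON"  # we start by reading content
    buffer = []
    for ch in text:
        if mode == "CON" and ch == open_delim:
            # Opening delimiter: flush pending content, switch to TAG mode.
            if buffer:
                pieces.append(("CON", "".join(buffer)))
            buffer = []
            mode = "TAG"
        elif mode == "TAG" and ch == close_delim:
            # Closing delimiter: the buffer holds the markup code.
            pieces.append(("TAG", "".join(buffer)))
            buffer = []
            mode = "CON"
        else:
            buffer.append(ch)
    if mode == "TAG":
        raise ValueError("unterminated tag: missing closing delimiter")
    if buffer:
        pieces.append(("CON", "".join(buffer)))
    return pieces


if __name__ == "__main__":
    for kind, value in split_markup("<H1>Address list</H1><P>Two entries follow."):
        print(kind, repr(value))
```

Note that the reader itself attaches no meaning to "H1" or "P"; deciding what a markup code means is exactly what a DTD, or a fixed tag set as in HTML, is for.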
I bet that if Berners-Lee had introduced HTML from the beginning without any standard set of tags, we would not have the billions of web pages that we have in the world today.

The basic structure of an HTML document is simple. An HTML document is contained between the pair <HTML> and </HTML> and consists of two main parts, the Head and the Body, contained between the pair <HEAD> and </HEAD> and the pair <BODY> and </BODY>, respectively. The Head contains information about the document. The one Head element that must always be present is <TITLE>; its content is what appears as a "label" on the browser window. A tag that may appear in the Head is <META>, which can be used to provide information for search engines. The Body contains the content of the document with its tags.

Imagine now that I want to create a document that contains the centred and bold title "Address list" and two elements of the list, like this one:

    Address list

      • Employee ID: 0001
      • Employee ID: 0002

In early versions of HTML this is represented with a handful of tags. The pair <CENTER> and </CENTER> indicates that "Address list" should be displayed as centred, the pair <B> and </B> indicates that "Address list" should be displayed as bold, the pair <UL> and </UL> indicates that a bulleted list is included and the pair <LI> and </LI> indicates that an item of the list is included between the pair. <P> instructs the interpreter to create a new paragraph.

In the first phases of the web's evolution, the IETF took the management of HTML as a communication standard on board, but did so only until HTML 2.0, known as RFC 1866. In the meantime wars were being waged between different business players to control the browser market. Each company added to the language its own tags, which would only be understood by its own browser and not by competitors' browsers. The latest version is HTML 4.0, which has about 90 different tags.

Extensible Markup Language (XML) is a derivation of SGML, if not literally, at least in terms of design principles. It is possible to have an XML document without a DTD (as is the case with HTML) and, as in SGML, with XML one can define tags to match a specific domain. An XML element is made up of a start tag, an end tag and data in between. The start and end tags describe the data within the tags, which is considered the value of the element. Using XML, the first employee in the address list above could be represented as an <EMPLOYEE> element containing <ID>, <NAME>, <EMAIL> and <PHONE> elements. The meaning of these tags, otherwise obvious to a human reader, is lost on a computer unless it is properly instructed. And this is the role of the DTD. The combination of an XML document and the accompanying DTD gives the "information representation" part of the corresponding HTML document. It does not say, however, how the information should be displayed, because this is the role played by a style sheet.
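As a rough illustration of such an element, the sketch below parses a hypothetical employee record with Python's standard xml.etree.ElementTree module. Only the ID 0001 comes from the address list above; the name, e-mail and phone values are invented placeholders. The standard library parser does not validate against a DTD, so the tuple EXPECTED_CHILDREN is an assumed stand-in for the content model a DTD would declare.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: only the ID comes from the address list above,
# the other values are invented placeholders.
EMPLOYEE_XML = """
<EMPLOYEE>
  <ID>0001</ID>
  <NAME>Jane Example</NAME>
  <EMAIL>jane@example.com</EMAIL>
  <PHONE>+00 000 0000</PHONE>
</EMPLOYEE>
"""

# Stand-in for what a DTD would declare: which children an EMPLOYEE
# element must contain, and in which order. (Assumption: ElementTree
# does not validate DTDs, so the check is done by hand below.)
EXPECTED_CHILDREN = ("ID", "NAME", "EMAIL", "PHONE")


def read_employee(xml_text):
    """Parse one EMPLOYEE element into a {tag: value} dictionary."""
    root = ET.fromstring(xml_text.strip())
    if root.tag != "EMPLOYEE":
        raise ValueError(f"unexpected root element: {root.tag}")
    tags = tuple(child.tag for child in root)
    if tags != EXPECTED_CHILDREN:
        raise ValueError(f"unexpected children: {tags}")
    # The value of each element is the data between its start and end tags.
    return {child.tag: (child.text or "").strip() for child in root}


if __name__ == "__main__":
    for tag, value in read_employee(EMPLOYEE_XML).items():
        print(f"{tag}: {value}")
```

The parser happily accepts any tag names; it is only the hand-written check, playing the part of the DTD, that tells the program which names to expect.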
Style sheets can be written in a number of style languages, such as Cascading Style Sheets (CSS) or the Extensible Stylesheet Language (XSL). A style sheet might specify, in natural language terms, what a web browser should do to be able to display the document.

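The division of labour between the XML data and a style sheet can be sketched in a few lines of Python. This is not CSS or XSL: the STYLE dictionary is a hypothetical, much simplified stand-in for a style sheet that maps each tag to a display label, and render_employee is an invented name for the "browser" side that turns the data into an early-HTML list item.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical "style sheet": how each XML tag should be labelled on
# screen. A real style sheet would be written in CSS or XSL.
STYLE = {
    "ID": "Employee ID",
    "NAME": "Name",
    "EMAIL": "E-mail",
    "PHONE": "Phone",
}


def render_employee(xml_text, style=STYLE):
    """Render one employee element as an HTML list item."""
    root = ET.fromstring(xml_text)
    lines = []
    for child in root:
        label = style.get(child.tag)
        if label is None:
            continue  # no presentation rule for this tag: do not display it
        lines.append(f"{escape(label)}: {escape(child.text or '')}")
    return "<LI>" + "<BR>".join(lines) + "</LI>"


if __name__ == "__main__":
    # Placeholder data; only the ID 0001 appears in the text above.
    sample = "<EMPLOYEE><ID>0001</ID><NAME>Jane Example</NAME></EMPLOYEE>"
    print(render_employee(sample))
```

Changing the STYLE dictionary changes what is displayed without touching the XML data, which is the point of keeping the two apart.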
XML separates "information representation" from "use of the decoded information", while the two are bundled in HTML. In a sense this separation of the "representation" of information from its "presentation" is also a feature of MPEG-1 and MPEG-2, because decoders interpret the coded representation of audio and video streams, but the way those streams are used, i.e. presented, is outside of the standard and part of an external application. With XML the "external application" may very well be an HTML browser, as in the example above. The MPEG-4 Systems case is different in that "composition information" is carried by BIFS.

On the other hand, MPEG standards do not need the equivalent of the DTD. Indeed the equivalent of this information is shared by the encoder and the decoder because it is defined by the standard itself. It could hardly be otherwise, because XML is a very inefficient way of tagging information (in terms of number of bits used), while the video codecs and the multiplexes are designed to be extremely bit-thrifty.

XML is a W3C Recommendation. The work that eventually produced XML started in 1996 with the idea of defining a markup language with the power and extensibility of SGML but with the simplicity of HTML. Version 1.0 of the XML Recommendation was approved in 1998. The original goals were achieved, at least in terms of number of pages, because the text of the XML Recommendation was only 26 pages as opposed to the 500+ pages of the SGML standard. Even so, most of the useful things that could be done with SGML could also be done with XML.

W3C has exploited the description capabilities of XML and has built the Synchronized Multimedia Integration Language (SMIL). Like an HTML file, a SMIL file begins with a <smil> tag identifying it as a SMIL file, and contains <head> and <body> sections.
The <head> section contains information describing the appearance and layout of the presentation, while the <body> section contains the timing and content information. This is the functional equivalent of composition in MPEG-4 Systems.

XML inherited DTDs from SGML, but it has become apparent that some shortcomings were also inherited: the different syntaxes of XML and DTDs, which require different parsers; no possibility to specify datatypes and data formats that could be used to map automatically from and to programming languages; and no set of well-known basic elements to choose from. The XML Schema standard improves on the limitations of DTDs. It provides a method to specify XML documents in XML and includes standard pre-defined and user-specific data types. The purpose of a schema is to define a class of XML documents by applying particular constructs to constrain their structure. Schemas can be seen as providing additional constraints to DTDs or a superset of the capabilities of DTDs.
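Two of these points, that a schema is itself written in XML and that it carries datatypes, can be illustrated with a short sketch. The schema below is hypothetical (the choice of xs:positiveInteger for the ID is an assumption made for the example), and the code does not validate anything: Python's standard library has no XML Schema validator, so the sketch only shows that the same XML parser used for documents can also read the schema and recover the declared types.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema for the employee record; the element names follow
# the text above, the datatypes are assumptions made for illustration.
EMPLOYEE_XSD = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="EMPLOYEE">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="ID" type="xs:positiveInteger"/>
        <xs:element name="NAME" type="xs:string"/>
        <xs:element name="EMAIL" type="xs:string"/>
        <xs:element name="PHONE" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""

# Illustrative mapping from schema datatypes to Python types. The "xs:"
# prefix is compared as plain text; a real processor would resolve it
# through the namespace declaration.
PYTHON_TYPES = {"xs:positiveInteger": int, "xs:string": str}


def declared_types(xsd_text):
    """Return {element name: declared datatype} for typed elements."""
    root = ET.fromstring(xsd_text.strip())
    result = {}
    for element in root.iter(f"{{{XSD_NS}}}element"):
        if "type" in element.attrib:
            result[element.get("name")] = element.get("type")
    return result


if __name__ == "__main__":
    for name, xsd_type in declared_types(EMPLOYEE_XSD).items():
        print(name, xsd_type, PYTHON_TYPES[xsd_type].__name__)
```

With a DTD, none of this would be possible with the document parser alone, since DTD declarations use their own syntax and say nothing about whether an ID is a number or a string.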

Copyright © 2003 chiariglione.net