Paul W. Foster
Ipswich, IP5 3RE, United Kingdom
phone: +44 1473 648473
Large scale multimedia services have great need of content management systems. Metadata has a key role to play in allowing such systems to be built and automated. A management system composed of multiple distributed components is described. Each component implements a service such as content validation, transcoding, search or generation; and relies on metadata information to achieve success. Metadata specification, security, transport, and storage needs have been identified through the implementation of prototype components. Although the technology employed in each of the management components is often straightforward, the impact on organisational processes and interworking may be significant.
The world is in the midst of an information explosion. Multimedia information services are now delivered to consumers via interactive digital television and radio in the home, touch screen kiosks and payphones on the street corner, personal computers on the desktop, and web browsers built into the latest mobile phones. These consumers are demanding high quality services constructed from multimedia content that is timely, relevant, accurate, personalised and cheap. In the business environment the accuracy and timely arrival of information is often a vital ingredient of success and businesses demand access to this information for their employees while on the move, again from a variety of access terminals.
For those in the business of providing multimedia services, an efficient, scalable and automated content management system is essential in order to achieve high customer satisfaction and profitability. Content management promotes content repurposing and reuse leading to direct financial savings. It also provides a mechanism for ensuring the quality and consistency of content and a route to content personalisation. The separate markets of internet provision, entertainment in the form of movies, games and magazines, and broadcast, satellite or cable television are all converging to offer the consumer information on demand services. As an example, it is possible that soon everybody will be able to have their own personal television schedule that best suits their lifestyle and television channels will no longer offer a traditional fixed schedule.
This paper details work carried out at BT (British Telecommunications plc) into the provision of content management systems and explores the technical and organisational challenges that must be overcome. Metadata is a key tool used in the implementation and development of such systems. Therefore, this paper suggests several innovative ways in which metadata can be managed and manipulated and emphasises the practical use of metadata to solve the content management problem.
BT plays a vital role as an information service provider in today's multimedia service industry. With global reach and vast expertise in building very large scale communication systems, the company is expertly placed to be able to deliver multimedia content to the widest possible audience. The branding of these services is a very important issue for BT. While the company itself may not be the author of the content of a service, the end user of a BT multimedia service perceives all content delivered to be stamped with the BT brand. Therefore, the company must ensure all services are of the highest quality. Significant experience has been gained from the real life content management challenges posed by multimedia services as diverse as interactive television, multimedia kiosks, intranet television and personalised news delivery.
With two thousand paying subscribers and 2.2 terabytes of on-line material, BT’s interactive television trial remains the largest of its kind to date. In order to satisfy customers’ demands for constantly updated material, it took more than twenty people to manage the content, manually transferring material from video and data tapes onto the system. This process also proved error prone, relying on handwritten labels on tape boxes to ensure that material was placed correctly on the system. This level of manual intervention makes it impossible to scale up the service for profitable delivery nationwide.
The multimedia kiosk and video streaming projects have highlighted similar requirements for automated content management. The initial absence of any automated content management processes in the trial services greatly hampered efforts to quickly roll out robust, full scale services without significantly increasing manpower. For example, multimedia kiosks were initially placed in the London area and all served identical information about local services, news, weather, etc. If the kiosks are to be placed nationwide, then specific information must be targeted to each kiosk, or group of kiosks, in order to satisfy local customer needs. On the trial kiosks, most content was updated on a monthly basis. The trial results indicated that this churn rate would need to be increased to at least weekly to satisfy customer requirements, leading to impossible strains on manual content handling.
In order to be effective, a content management system must manage a piece of content from the moment of its creation to its final destruction. In between, the content can be included in various applications and service offerings, held in content archives or even modified, creating multiple versions of the content all of which need to be managed. Therefore, the system must be distributed across many different organisations - content providers, application builders, network service providers and even the end consumer. The system must be flexible enough to be implemented by both small and large multimedia services and easily scalable for service expansion. The system must also be straightforward to use, with little or no customisation needed for use by different services. These requirements lead naturally to a system composed of many different distributed management components running in a co-operative environment. Metadata is the key element that binds these components together.
Metadata is information associated with a piece of data. The data is often media, such as a text file or a video clip, but could also be a person's profile information, network status information, or quality of service information. The metadata usually takes the form of a set of attribute value pairs. Typically the value is thought of as text, but we are also using image or audio content as metadata values. For example, a still image is used as a value for the metadata attribute publicity poster associated with a piece of video content. We use the word 'content' to refer to the composite object made up of data plus metadata, or the data by itself if no metadata exists for that data.
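The composite content object described above - data plus a set of attribute value pairs, where a value need not be text - can be sketched as a simple data structure. The following Python fragment is purely illustrative; the class and attribute names are invented for this example and are not part of any BT system:

```python
from dataclasses import dataclass, field

@dataclass
class ContentItem:
    """A 'content' object: media data plus its attribute-value metadata."""
    media: bytes = b""
    metadata: dict = field(default_factory=dict)  # attribute -> value

    def has_metadata(self) -> bool:
        return bool(self.metadata)

# Values need not be text: here a still image serves as the value of
# the 'publicity poster' attribute of a piece of video content.
clip = ContentItem(
    media=b"<video bytes>",
    metadata={
        "title": "Evening News",
        "publicity poster": b"<still image bytes>",  # non-text value
    },
)
```

A content item with an empty metadata dictionary corresponds to the degenerate case where 'content' refers to the data alone.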
In order to achieve automatic content handling, two different aspects of the metadata must be standardised. First, there must be an agreed structural format for the metadata so that processes can access and decode each metadata attribute value pair. W3C's Extensible Markup Language (XML) has quickly become the leading contender and at BT we have successfully used the XML format in all our prototype developments. Second, the meaning of the various attribute terms must be agreed upon, so that content can be processed intelligently. For example, the author and creator attributes are often used equivalently, or the attribute date could mean creation date, expiry date, date of birth etc. The processing component must be able to interpret the metadata vocabulary correctly given its current context. There has been significant work done by various industry bodies to propose domain specific metadata vocabularies, e.g. IEEE Learning Objects Metadata for educational material, Information and Content Exchange (ICE) to promote e-commerce and information exchange between companies, and Synchronized Multimedia Integration Language (SMIL) for multimedia applications. The W3C's Resource Description Framework (RDF) allows these vocabularies to be specified, identified and referenced from an XML file.
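The first requirement - an agreed structural format - is what XML provides: any component can mechanically decode the attribute value pairs of a record without prior knowledge of its vocabulary. A minimal sketch in Python, using a hypothetical metadata record whose attribute names are invented for illustration rather than drawn from a published vocabulary:

```python
import xml.etree.ElementTree as ET

# A hypothetical XML metadata record; the element names are
# illustrative only, not an agreed vocabulary.
record = """<content>
  <title>Evening News</title>
  <creator>J. Smith</creator>
  <creationDate>1999-03-01</creationDate>
</content>"""

root = ET.fromstring(record)
# Any component can now extract each attribute-value pair uniformly,
# even without understanding what 'creationDate' means in context.
pairs = {child.tag: child.text for child in root}
```

The second requirement, vocabulary agreement, is exactly what this mechanical decoding does not solve: the component still has to know whether `creationDate` here means the date of authoring or of publication.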
During the development of the prototype content management system we have had to specify our own metadata vocabularies for describing content. From an early stage we saw metadata as being divided into categories, each associated with a different vocabulary. Each management component generally requires and manipulates only a subset of the available metadata, and needs to understand only this small part of the metadata. Therefore each component specifies a small vocabulary with which the content metadata must comply in order to make use of the service. As the system matures we expect to publish our metadata vocabularies for each of the content management components so that others may use them, and also to adapt our components to take advantage of the most widely accepted of the emerging standards.
We quickly learnt that in order to be effective, metadata must be extremely flexible. There are times when the storage and bandwidth restrictions of a multimedia service require that a piece of media is compressed as small as possible. Therefore, it must be possible to physically separate media from its metadata, temporarily or permanently. Conversely, quality in other parts of the service will suffer if, instead of handling one media file, it is necessary to handle both the media file and an associated file containing its metadata. In order to reduce the size of content items we have experimented with replacing metadata information with URLs that point to where the information is actually stored. This also allows copies of the media to share one physical instance of the metadata information and can provide excellent control for the maintenance and rights management of such information. Similarly we have merged media and metadata file formats so that one physical file represents both parts of the content item. We look to the developments in MPEG-7  to provide standardised long term solutions.
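The indirection experiment described above - replacing a metadata value with a URL so that copies of the media share one physical instance - can be sketched as follows. This is a toy illustration: the `ref:` scheme, the in-memory store, and the function names are all assumptions made for the example, standing in for a real URL and content store:

```python
# Sketch: a bulky metadata value is moved into a shared store and
# replaced by a reference, so that many copies of the media can share
# one physical instance of the information.
def externalise(metadata, attribute, ref, store):
    """Replace an inline metadata value with a reference into the store."""
    store[ref] = metadata[attribute]
    metadata[attribute] = ref

def resolve(metadata, store):
    """Dereference any 'ref:' values back to the stored instances."""
    return {attr: store[val] if isinstance(val, str) and val.startswith("ref:") else val
            for attr, val in metadata.items()}

meta = {"title": "Evening News", "poster": b"<still image bytes>"}
store = {}
externalise(meta, "poster", "ref:poster/evening-news", store)
restored = resolve(meta, store)
```

Because every copy of the media carries only the reference, updating or withdrawing the stored instance immediately affects all copies, which is the maintenance and rights management benefit noted above.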
In the future we are seeking to give content the ability to be 'self aware'. One illustration of this is adding to a content's metadata throughout its lifetime, whereby the content knows where it has been, which applications it has been used in, how often the content has been copied, who has paid for a copy, etc. This type of information can be invaluable when planning and evaluating new multimedia services. It is also important to provide methods for securing metadata information so that it cannot be tampered with, and in turn, metadata information can also be used to control access and copying privileges.
Currently, each multimedia service offered by BT is developed with its own individual system for managing content. This results in significant wasted effort in reinventing and rewriting management systems, as well as preventing the reuse of content between services. We are designing and building a content management system, shown in Figure 1, that can be deployed company wide and utilised by multiple multimedia services. The new management system will significantly reduce service start-up costs, as each new service will be able to take advantage of existing content management processes. Content sharing and reuse is greatly simplified through the use of common interfaces, and any improvements made to the content management processes are immediately available to all services.
The content management system itself is composed of a set of distributed components, each providing a particular function such as content tracking, metadata generation, content storage, or content validation. Each component has been designed to take advantage of as much automated processing as possible, since this further reduces the cost of providing services. The designers of a multimedia service can choose to utilise only the components that best meet their needs. In the future, competing components may be introduced in each category and many more specialist components will be created.
We have designed and built prototypes of the key components that make up our content management system. These are briefly outlined in the following sections. Particular emphasis has been placed on the strategic role that metadata plays in the functioning of each component. Other design and implementation issues have been addressed in more detail elsewhere.
The most important part of any content management system is the process by which metadata is associated with data items. We have developed a tool called MEX2 (Metadata EXtraction Version 2.0) that provides a user friendly interface for the semi-automatic creation of metadata. Automatic analysis of the media allows some metadata to be created automatically, for example the dimensions and colour characteristics of an image, or automatic text summaries. However, since the intelligent interpretation of media by machine is not yet capable of reliably answering common questions that users ask about media, the tool allows a human operator to enter additional information. Figure 2 shows the tool being used to add metadata to an image, and Figure 3 shows the use of an automatic audio analyser that distinguishes speech, music and noise. The tool was originally intended for use by a media author for the initial generation of metadata, however it was soon realised that it could also be used for the later addition and editing of metadata. The tool is also built in modular form to be easily extensible: new media formats and new media interpretation algorithms can be added. Since much of the media content used in BT’s multimedia services is supplied by external agencies, MEX2 can be deployed and maintained at suppliers' external sites. In this way, BT hopes to influence its suppliers to supply media already tagged with metadata information in a standard format.
MEX2 allows a user to store metadata in a variety of formats, including CMSL and XML. If XML is chosen, the user can also control the choice of XML DTD which serves to validate the metadata structure and ensure consistency. GESTALT is a European Community sponsored project looking at remote learning environments, providing services such as resource discovery, student profiles and asset management that are required for students and lecturers to work remotely over the internet. By associating metadata with the educational material, these services become easier and more efficient to implement. BT is involved in the design of an XML DTD to represent the metadata according to the IEEE LOM standard, and MEX2 can then function as an input tool for the environment.
Validation is the task of ensuring that the media content of a service is correct. There are two items needed for validation, one is an unknown piece of content and the other a set of criteria against which that piece of content can be compared. There is a wide variety of types of questions that a validation service might be expected to answer. The most important of these concern the subject matter of the content itself. For example, is this the right graphic for the accompanying text? Is this the right episode of the soap opera for broadcast tomorrow? A validation service can also be used to ensure that the content is complete and uncorrupted. For example, is this video of the right length and resolution for display on the current client? Other useful features can include checking the content for any breach of copyright or unauthorised alteration to the original.
In order to achieve a validation service that is functional in large scale multimedia services, validation must proceed with minimal manual intervention. It is not economically feasible for a human to check every piece of content placed on a BT service. The ultimate reference against which to judge the abilities of any computer based validation process is a person who inspects each piece of content manually. It is important to remember that a human faced with a new piece of content of which he or she has no prior knowledge cannot learn everything about it by examining the content itself. For example, a person may find it impossible to tell the location at which a photograph was shot simply by looking at it. Therefore, we use metadata information to describe both the content and the validation criteria.
Our validation service has four parts. First, syntactic validation checks the technical correctness of the content, for example that media is in the correct format, or that all links in a web page are valid. Second, semantic validation determines if the content matter is correct in its current use context. For example, is this the correct image to illustrate this magazine article? The answers to these questions are supplied by the content metadata, either generated automatically using the techniques first utilised in MEX2 or else supplied by a human user. Intelligent media processing will slowly replace the human user. For example, the sound analyser can quickly indicate that a track supposed to be music is in fact people talking. The third part adds additional metadata to the content to record the results of the syntactic and semantic validation. This is particularly important if a human user has contributed to validation, since the same questions can then be answered automatically in future. The final part is to secure the content for onward travel, using encryption and digital signature techniques. This means that if the content is transferred to another organisation or process, the metadata can be trusted and potentially the content used without any further need for validation. We suggest that validation is carried out after each stage in the content lifecycle, as shown in Figure 4. The 'validate early, validate often' approach allows a validation service to build upon and extend validation done earlier in the cycle, giving more efficient use of resources, faster validation, and minimal waste from revalidating content. An application programmer is often in the best place to validate and label the application content; the service provider then only has to test that the application runs correctly, without worrying about the individual content items inside the application.
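The four-part pipeline can be sketched as follows. This is an outline under stated assumptions, not the actual BT implementation: the individual checks are placeholders for real media analysis, and the shared signing key stands in for whatever encryption and digital signature scheme a deployment would actually use:

```python
import hashlib
import hmac
import json

# Assumption for this sketch: a key shared between trusted components,
# standing in for a real digital signature infrastructure.
SECRET = b"shared-validation-key"

def validate(content, context):
    """Four-stage validation sketch: syntactic, semantic, record, secure."""
    meta = content["metadata"]
    # 1. Syntactic: does the media match its declared format?
    #    (Toy check: PNG magic bytes versus the 'format' attribute.)
    syntactic_ok = content["media"].startswith(b"\x89PNG") == (meta.get("format") == "png")
    # 2. Semantic: is the content correct in its current use context?
    semantic_ok = meta.get("topic") == context.get("required_topic")
    # 3. Record the results as additional metadata, so the same
    #    questions can be answered automatically in future.
    meta["validated"] = syntactic_ok and semantic_ok
    # 4. Secure: sign the metadata so downstream components can trust
    #    it without revalidating.
    digest = hmac.new(SECRET, json.dumps(meta, sort_keys=True).encode(), hashlib.sha256)
    return meta["validated"], digest.hexdigest()
```

A downstream component that recomputes the signature over the received metadata and finds a match can skip stages one and two entirely, which is the saving that 'validate early, validate often' is aiming for.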
Universal content is a multimedia service designed to pre-empt the needs of each user and automatically supply the information the user actually wants without prompting, tailored to the user’s current environment. This is represented in Figure 5. A user on a low bandwidth mobile phone is not (yet) able to view a high resolution weather map image; instead a simple text message with ‘Rain in afternoon’ will usually suffice. The different requirements of each user create a problem for the authors and distributors of information. Typically, the author must either limit the types of user and client technology that can access the information, or else create significantly different versions of the information targeted to different user groups. This requires extensive manual effort that is impossible to provide in large scale systems. Even authors of a simple web page need to remember that the page must be able to be viewed by users on web-tv, personal organisers, etc. Currently the problem is being solved for these clients by the clients themselves, with special translation functionality built into web-tv systems, and specialised HTML to WML (wireless markup language) translation servers. This approach is simply not scalable.
We have implemented a prototype universal content service that modifies web pages based on user preference and client capability. The service is composed of three parts. The first is responsible for identifying the characteristics of the client environment - environmental metadata. Information about a user’s personal preferences, location, client type, and network connection are kept in a set of flexible profiles, built using XML and ideas from the Open Profiling Standard.
The second part is a metadata schema that allows content to be tagged with additional information and media, so that alternative representations of the content can be built up. The prototype implementation we have built gathers this metadata information in two ways, first by prompting the author to include additional information during the authoring stage, and second by automatically analysing the content of a page and providing alternatives. For example, a video may be automatically replaced by a text summary of its audio track. Using only the latter approach, it is possible to provide real time universal content service for all existing web pages. However, information added at the authoring stage significantly improves the quality of service that can be provided. An example of tagged content is shown in Figure 6.
The remaining component is the universal content engine, responsible for examining both the user environment profile and the requested content, and then deciding what content selection should actually be transmitted to the user. This may involve choosing between content alternatives specified in the metadata information, or of requesting a new item of derived content to be created automatically. The engine operates as a proxy server in the network, filtering information flow between provider and user.
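The engine's central decision - choosing between the content alternatives specified in the metadata, given the environment profile - can be sketched in a few lines. The profile fields, representation names, and bandwidth figures below are all invented for illustration; they are not drawn from the Open Profiling Standard or the actual prototype:

```python
def select_representation(alternatives, profile):
    """Pick the richest representation the client environment can carry.

    'alternatives' maps a representation name to its bandwidth cost in
    kbit/s; the profile field is a hypothetical environmental metadata
    attribute, not a standardised one.
    """
    feasible = {name: cost for name, cost in alternatives.items()
                if cost <= profile["bandwidth_kbps"]}
    # Prefer the most expensive (richest) representation that fits.
    return max(feasible, key=feasible.get) if feasible else None

# The weather example from the text: a high resolution map image with
# a text summary as its tagged alternative.
weather = {"map_image": 512, "text_summary": 1}
```

On a low bandwidth mobile connection the engine would return the text summary; on a broadband connection the map image wins, without the author having produced separate pages for each client type.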
The most important asset of any content management system is its ability to locate content in response to a user query. For example, if a designer of a new service wishes to use an image of the BT logo, then all qualifying images must be returned by the system in answer to a straightforward query containing the phrase ‘BT logo’. In order to achieve content repurposing and reuse, all content must be made available to other multimedia services. However, it is also important that each system is able to retain some autonomy and flexibility in providing content storage since this is often chosen to reflect quality of service requirements. Therefore, each service may choose to store its content differently, perhaps in a database, directly in a flat file system, or on tape archives for content accessed less frequently. The content storage and retrieval component serves as a directory to locate content stores and provides a common, user friendly query interface.
As yet, there is no general agreement on standards for how media and its associated metadata should be stored in a database or file system. At least in the short term, it looks likely that each data store will need to publish information on how to translate stored content into the common metadata format understood by the entire management system. This is the approach we have taken in our design, where the content retrieval component relies on each data store publishing instructions on how to access and interpret the content store as a collection of records in XML format. We anticipate that, in the near future, we will be able to adopt the W3C’s standard XML query language as the common query interface for our system. Each data store will need to provide an interface to this query language only if native XML support is not available.
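The adapter idea above - each data store publishing instructions for presenting its holdings as a collection of XML records in the common format - might look like the following in outline. The store, its rows, and the element names are assumptions made for this sketch, standing in for a real flat file system or database:

```python
import xml.etree.ElementTree as ET

class FlatFileStore:
    """A toy data store that publishes its content as XML records in
    the common format expected by the retrieval component."""
    def __init__(self, rows):
        self.rows = rows  # e.g. attribute-value rows parsed from flat files

    def as_xml_records(self):
        """Translate each stored row into one <content> record."""
        for row in self.rows:
            rec = ET.Element("content")
            for attr, value in row.items():
                ET.SubElement(rec, attr).text = str(value)
            yield ET.tostring(rec, encoding="unicode")

store = FlatFileStore([{"title": "BT logo", "format": "gif"}])
records = list(store.as_xml_records())
```

A database-backed or tape-archive store would implement the same `as_xml_records` contract over its own storage, leaving the retrieval component free to query every store uniformly.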
Research has shown that the most common queries asked by users of retrieval systems contain proper names, e.g. Get me an image of the Eiffel Tower, or Get me an audio clip of President Clinton speaking. These queries remain impossible to answer accurately without manual intervention unless the media has metadata associated with it. Therefore, our retrieval engine will not initially attempt to incorporate many intelligent media retrieval techniques, such as image matching, since they are of only limited benefit. Of more importance will be incorporating knowledge management techniques to interpret user behaviour and build up user profiles and preferences in order to increase retrieval accuracy.
The provision of content lifecycle control mechanisms is fundamental in allowing all the components of the management system to inter work. The content control component authorises the movement of a piece of content between components and ensures that content is in the right place at the right time. BT’s experimental content control system is known as CATS (Content Automation Tool Set) and it has been very successful in demonstrating the automatic ordering, delivery, launch and deletion of content for a multimedia service. A schematic of the control process is shown in Figure 7.
The designer of a multimedia service begins by creating empty content items, effectively metadata templates, that represent the media required by the service. A content item can itself be a grouping of other content items, perhaps representing an entire application. Metadata information, such as a description of the required media, is added to the content item. It might also contain constraint information, for example any image should be less than a certain size, or information on authoring deadlines, as well as content launch and expiry dates. This content item is entered in the system and automatically sent to the agency responsible for media creation, where the content is filled with the media and extra metadata information such as the price, origin, or detailed description. The content is then returned to BT’s content management system where it undergoes automatic content validation and is routed towards the content store for a particular service. The content lifecycle control component is responsible for automatically notifying the service designer if the content fails validation, or fails to arrive at BT on time. When all content items required for a service have arrived, they can automatically be added to the service, and then removed after a certain length of time as specified in the metadata. The control component also allows for manual processes to be inserted into the lifecycle, such as a manual check of the application before it is released.
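The empty content item - a metadata template carrying a description, constraints, and lifecycle dates - and the automatic check applied when the filled item returns from the media creation agency can be sketched as follows. All field names, limits, and dates here are invented for the example and do not reflect the CATS data model:

```python
from datetime import date

# Sketch of an 'empty content item': a metadata template describing the
# media a service requires, with constraint and lifecycle information.
template = {
    "description": "Weather map image for the London region",
    "constraints": {"max_bytes": 50_000, "format": "jpeg"},
    "deadline": date(1999, 6, 1),   # authoring deadline
    "launch": date(1999, 6, 7),     # date to add to the service
    "expiry": date(1999, 6, 14),    # date to remove from the service
}

def accept_delivery(template, media, metadata, today):
    """Check a filled content item against its template before routing
    it on towards validation and the service content store."""
    if today > template["deadline"]:
        return False, "missed authoring deadline"
    if len(media) > template["constraints"]["max_bytes"]:
        return False, "media too large"
    if metadata.get("format") != template["constraints"]["format"]:
        return False, "wrong format"
    return True, "accepted"
```

A rejection here is exactly the event on which the lifecycle control component would automatically notify the service designer.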
The control of the content lifecycle is the component most likely to need customisation for use by individual multimedia services. Therefore it has been the most difficult to design. Our efforts to date have focused on automating the publication of web sites, where it has been relatively straightforward to obtain the media requirements of the site, write these requirements in the form of metadata for a set of content items, and automatically supervise the delivery of content and then the bulk download of the web site material for publication.
As the demand for new multimedia services grows, and the variety of multimedia clients in use increases, a content management system must be implemented for all large scale multimedia services. Service providers need to be able to reuse both the content and the content management system. It is not economically viable to manually control content flow from creator to end user, and besides, manual intervention does not necessarily mean better quality. Better content management also means that the content can be found and used again, or traded for commercial gain. The enabling mechanism for automating and improving content management systems is metadata.
XML has emerged as a leading contender for a standard metadata format, but XML only addresses the physical format in which the metadata is stored. It enables applications to use standard methods to access metadata values, but by itself, does not aid automatic understanding of content. It is not possible to implement content management processes such as workflow control, universal content, or validation without also standardising on metadata vocabularies. All the content processes described in this paper depend on this standardisation happening, at least on a company wide basis. Therefore, we have developed our own metadata vocabularies and functionality for use with our content management components. In the future, these components will have to be altered to work with other vocabularies. A metadata translation and interpretation component could fulfil this requirement.
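The metadata translation component suggested above can be outlined as a mapping table that rewrites one vocabulary's attribute names into another's. The vocabulary names and mappings below are invented for illustration, though the author/creator and date ambiguities are the ones discussed earlier in this paper:

```python
# Sketch of a metadata translation component: a table of rewrites from
# one vocabulary to another. Vocabulary names are hypothetical.
MAPPING = {
    ("supplier_v1", "house"): {
        "author": "creator",      # 'author' and 'creator' used equivalently
        "date": "creation_date",  # disambiguate the bare 'date' attribute
    },
}

def translate(metadata, source, target):
    """Rewrite attribute names from the source vocabulary into the
    target vocabulary, passing unknown attributes through unchanged."""
    table = MAPPING[(source, target)]
    return {table.get(attr, attr): value for attr, value in metadata.items()}
```

Pure renaming covers only the simplest mismatches; attributes whose meaning depends on context (the date example) would need interpretation rules as well as a rename, which is why the text describes this as a translation and interpretation component.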
At BT, we are very aware of the time and financial pressures under which multimedia services are developed and launched. Content management can seem unimportant for trial services, and it is only when services are prepared for full deployment that the content handling crisis rears its head. Therefore, as well as research into content management systems, we have spent significant amounts of time talking to service designers about the need to consider content management from the beginning of the design process. Adding content management often means changing work processes, with more attention paid to data before it is launched on a service. It also means changing relationships with content suppliers, who will be expected to supply media together with its associated metadata, and to interwork with automatic workflow control processes. We cannot realistically expect every service to immediately begin using a fully automatic content management system. Instead we are pursuing a twofold approach to achieve this end goal. First, we have implemented prototype components that can be provided on a common platform shared by all services. This has allowed us to explore the metadata requirements for content handling. The component design has already proved successful - metadata generation and client personalisation services could be quickly added to an electronic news application by reusing MEX2 and adding the universal content proxy service to the application. Second, we are encouraging all services to add metadata to their content immediately, and as much metadata as possible. Where possible, content should be supplied by the creator with metadata already in place. This is one reason why we have emphasised the importance of a metadata generation tool. The presence of basic metadata, even in non standard formats, makes it much easier to reuse content, to include automatic content management as service development requires, and adds immediate value to the content.
One priority for the future is to ensure that our recommendations for the use of metadata evolve in line with emerging standards.
 G.W. Kerr. A Review of BT's Interactive TV Trials. IEE Colloquium on Interactive Television, London, October 1995.
 A. Grace, R. Jacobs, J. Cox and A. Barrow. Intranet-TV - Video Streaming for the World Wide Web. BT Technology Journal, Vol.17, no.1, January 1999.
 W3C Recommendation. Extensible Markup Language (XML) Specification 1.0. URL http://www.w3.org/TR/REC-xml, February 1998.
 IEEE Learning Objects Metadata. URL http://grouper.ieee.org/groups/ltsc/wg-c.htm.
 W3C Submission. The Information and Content Exchange (ICE) Protocol. URL http://www.w3c.org/Submission/1998/18/, October 1998.
 W3C Recommendation. Synchronized Multimedia Integration Language (SMIL) Specification 1.0. URL http://www.w3.org/TR/REC-smil, June 1998.
 W3C Proposed Recommendation. Resource Description Framework (RDF) Model and Syntax Specification. URL http://www.w3.org/TR/PR-rdf-syntax, January 1999.
 ISO/IEC JTC1/SC29/WG11. MPEG-7 Requirements Document Version 7. URL http://drogo.cselt.it/mpeg/public/w2461.html, October 1998.
 K. Curtis and O. Draper. Multimedia Content Management - Provision of Validation and Personalisation Services. To appear in proceedings of IEEE Multimedia Systems '99, Florence, Italy, June 1999.
 P. Foster and R. Gepp. Content Metadata Specification Language Version 3.0. BT Internal Report, URL http://www.labs.bt.com/people/fosterpw/CatsIntro/Default.htm, September 1997.
 GESTALT: Getting Education Systems Talking Across Leading-edge Technologies. URL http://www.infowin.org/ACTS/RUS/PROJECTS/ac367.htm.
 W3C Note. Proposal for an Open Profiling Standard. URL http://www.w3.org/TR/NOTE-OPS-FrameWork.html, June 1997.
 M. Markkula and E. Sormunen. Searching photos - journalists' practice in pictorial IR. The Challenge of Image Retrieval Workshop, Newcastle, United Kingdom, February 1998.
 P. Foster, S. Banthorpe and R. Gepp. Automating the Multimedia Content Production Lifecycle. Multimedia Applications, Services and Techniques - ECMAST '98, May 1998.
Copyright 1999 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.