Creating a Content Inventory

Abstract

This document describes the purpose and general method of creating a content inventory, the use of tools in the process, and some examples.

Intended Audience

This document is intended to inform leaders, information architects and senior management embarking on a content management project.

Summary

A content inventory is a structured database of content resources containing information that will be analysed to inform decisions made about the management of content within an organisation.

A content inventory should be designed to provide the specific information required to satisfy the clearly stated analysis objectives of the project. A well-executed content inventory, and analysis based on the information it contains, will give decision makers an overview of the state of content within the organisation.

Objective

There are many reasons for creating an inventory of content within an organisational group. The purpose of the inventory will influence the contents and extent of the inventory. As with any analysis project, a clear definition of parameters at the outset will lead to more efficient creation of the inventory and more useful inventory deliverables. Some possible objectives of a content inventory are listed below.

  1. To understand current content management requirements
  2. To project future content management requirements
  3. To more effectively exploit existing content resources
  4. To audit content security within an organisation
  5. To audit the accessibility of content to its intended audiences

General Method

The process of creating an individual content inventory is designed to achieve the stated objectives of the project. The following process is a framework only and it is unlikely that any actual project would follow this process completely or that the deliverables would contain all of the information described.

There are three key stages to the creation of the inventory. The output of the first stage is the clear communication of the parameters of the project to the project team.

Parameters of the Project

  1. Stated objective (e.g. To understand the volume, nature and distribution of content within our organisation so that a content management system may be procured to facilitate its more effective use.)
  2. Expected deliverables
  3. Schedule
    1. Commencement date
    2. Duration
  4. Budget
    1. Internal resource
    2. External resource

High Level Content Survey

The initial output from the second stage of the project is a high level description of the modules of content within the organisation containing enough detail to satisfy the objectives of the project or to facilitate a detailed audit. Additional outputs from this stage may include a mapping or summarisation of the content according to the surveyed attributes.

  1. Enumerate content modules
  2. Categorise content modules
    1. Unique identifier
    2. Subject
    3. Function
    4. Number of content items
    5. Owner/Maintainer
    6. Current location
    7. Storage requirements (e.g. size on disk)
    8. Rate of content creation
    9. Condition of existing metadata
    10. Notes
  3. Content mapping
    1. By function
    2. By owner/maintainer
    3. By location
    4. By rate of creation
  4. Content survey summary
    1. Quantity of content
    2. Diversity of content
    3. Distribution of content
    4. Rate of creation of content
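
The survey attributes, mappings and summary listed above could be held in a simple record structure. The following Python sketch is one possible arrangement and not a prescribed schema: the class name `ContentModule`, its field names, the helper functions and all sample values are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ContentModule:
    """One row of the high level survey (field names are illustrative)."""
    identifier: str           # unique identifier, e.g. "M00001"
    subject: str
    function: str
    item_count: int           # number of content items
    owner: str                # owner/maintainer
    location: str             # current location
    size_mb: float            # storage requirements
    items_per_month: float    # rate of content creation
    metadata_condition: str   # condition of existing metadata
    notes: str = ""


def map_by(modules, attribute):
    """Group module identifiers by one surveyed attribute (a content mapping)."""
    mapping = defaultdict(list)
    for module in modules:
        mapping[getattr(module, attribute)].append(module.identifier)
    return dict(mapping)


def summarise(modules):
    """Totals that feed the content survey summary."""
    return {
        "modules": len(modules),
        "items": sum(m.item_count for m in modules),
        "size_mb": sum(m.size_mb for m in modules),
        "items_per_month": sum(m.items_per_month for m in modules),
    }


# Hypothetical sample data, not taken from a real survey.
modules = [
    ContentModule(identifier="M00001", subject="Products", function="Sales",
                  item_count=1200, owner="Marketing", location="Intranet",
                  size_mb=850.0, items_per_month=40.0, metadata_condition="Partial"),
    ContentModule(identifier="M00002", subject="Legal", function="Compliance",
                  item_count=15, owner="Legal", location="Web site",
                  size_mb=2.5, items_per_month=0.5, metadata_condition="Good"),
    ContentModule(identifier="M00003", subject="Brand", function="Sales",
                  item_count=300, owner="Marketing", location="File server",
                  size_mb=4100.0, items_per_month=10.0, metadata_condition="None"),
]

by_owner = map_by(modules, "owner")
by_function = map_by(modules, "function")
summary = summarise(modules)
```

The same `map_by` helper serves every mapping in the outline (by function, owner/maintainer, location), which keeps the survey data in one place and the views derived from it.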

Detailed Content Audit

The third stage of the content inventory is onerous when compared to its predecessors; depending upon the objectives of the project this sub-process may be selectively applied to modules identified in the previous stage or omitted entirely. Tools may be useful, especially when auditing large numbers of content items.

The output of this stage is a database of categorised content items. Attributes recorded during the audit process will be selected according to the intended use of the content inventory. The design of the inventory database should accommodate all of the intended audiences for the inventory, e.g. content copyright information for Legal and content storage requirements for IT.

  1. Enumerate content items within each module
  2. Audit content items
    1. Unique identifier
    2. Title
    3. Description
      1. Textual description
      2. Content specific metadata (e.g. category or keywords)
      3. Audience
    4. Sensitivity (e.g. Confidential or Public domain)
    5. Function
    6. Author
    7. Owner/Maintainer
    8. Creation date
    9. Last modification date
    10. Modification frequency
    11. Copyright information
    12. Document status (e.g. Draft or Final)
    13. ROT (the content is redundant, out of date or trivial)
    14. Document type
      1. Type specific metadata (e.g. number of pages for a word processor document or colour depth for an image)
    15. Links to other content items
    16. Storage requirements
    17. Current location
    18. Notes
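
A subset of the audit attributes above might be recorded and exported as follows. This is a minimal Python sketch under stated assumptions: the class `ContentItem`, its field names, the sample items and the audit date are hypothetical, and the two-year staleness threshold is an assumed heuristic for the "out of date" part of ROT, not a rule from this document.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date


@dataclass
class ContentItem:
    """One audited content item (a subset of the attributes listed above)."""
    identifier: str
    title: str
    sensitivity: str      # e.g. "Confidential" or "Public domain"
    owner: str
    created: date
    last_modified: date
    status: str           # e.g. "Draft" or "Final"
    document_type: str
    location: str
    rot: bool = False     # redundant, out of date or trivial
    notes: str = ""


def flag_stale(item, today, max_age_days=730):
    """Mark an item as a ROT candidate if it has not been modified recently.

    The threshold is an assumption; someone who knows the content should
    confirm every flag.
    """
    item.rot = (today - item.last_modified).days > max_age_days
    return item


def to_csv(items):
    """Serialise items to CSV text, one column per attribute."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(ContentItem)])
    writer.writeheader()
    for item in items:
        writer.writerow(asdict(item))
    return buffer.getvalue()


# Hypothetical sample items and audit date.
items = [
    ContentItem("I000001", "Price list", "Public domain", "Sales",
                date(2001, 3, 14), date(2001, 6, 1), "Final",
                "Spreadsheet", "intranet/sales/price-list.xls"),
    ContentItem("I000002", "Brand guidelines", "Confidential", "Marketing",
                date(2003, 9, 2), date(2003, 11, 20), "Draft",
                "Word processor document", "fileserver/brand/guidelines.doc"),
]
audited = [flag_stale(item, today=date(2004, 1, 1)) for item in items]
csv_text = to_csv(audited)
```

A flat CSV export of this kind can be opened in a spreadsheet application or loaded into a database, so each audience for the inventory can filter on the columns it cares about.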

Analysis

Content inventory analysis is outside the subject area of this document. The deliverables of the content inventory project are likely to be driven by the input requirements of a subsequent analysis project.

Tools

Tools may help to streamline four types of activity within a content inventory project. Listed below are the four areas and some examples of the tools that may be useful in each area. Unfortunately, tools cannot usually replace the knowledge of individuals who are familiar with the content being inventoried or the experience of information architects.

  1. Information management
    1. Spreadsheet applications
    2. Databases
  2. Visualisation
    1. Spreadsheet applications
    2. Graphing packages
  3. Automated categorisation
    1. Metadata extraction tools
    2. Usage analysis tools
  4. Coverage tools
    1. Crawler applications
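
As an illustration of a coverage tool combined with basic metadata extraction, the following Python sketch walks a directory tree and records a few attributes of every file it finds. It assumes the content lives on a file system; the function name `enumerate_items` and the record keys are hypothetical. Only attributes the operating system already knows are captured, so subject, audience and similar attributes would still need human review.

```python
import os
from datetime import datetime, timezone


def enumerate_items(root):
    """Walk a directory tree and record basic attributes of every file."""
    records = []
    for folder, _subfolders, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(folder, name)
            info = os.stat(path)
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            records.append({
                # Location relative to the root of the module being audited.
                "location": os.path.relpath(path, root),
                # File extension as a rough stand-in for document type.
                "document_type": os.path.splitext(name)[1].lower().lstrip("."),
                "size_bytes": info.st_size,
                "last_modified": modified.date().isoformat(),
            })
    return records
```

Calling `enumerate_items` on the root folder of a content module returns one dictionary per file, which can then be loaded into the inventory spreadsheet or database.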

Examples

Audit of the accessibility of content published on the Internet to individuals with disabilities to ensure and demonstrate compliance with the Disability Discrimination Act.

This example illustrates how the process may be designed to create an inventory for a narrowly focused audit of content. A complete content audit is not required because, for some content modules, the information needed to satisfy the auditor that the content is accessible to disabled people is already available.

Content Module             Unique Identifier  Additional Attributes  Publicly Accessible  Accessible to Disabled People
Product Information        M00001             -                      Yes                  No
Web Site Legal Statements  M00002             -                      Yes                  Yes
Brand Marketing Materials  M00003             -                      Yes                  Unknown

An audit of modules M00001 (Product Information) and M00003 (Brand Marketing Materials) would be required to satisfy the objective of the project.
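
The selection of modules for detailed audit can be expressed as a simple filter over the survey results. The Python sketch below uses the three modules from the table above; the dictionary keys and the rule in `needs_audit` (publicly accessible, and accessibility to disabled people not confirmed as "Yes") are an interpretation made for illustration.

```python
# Survey results for the three modules in the example.
modules = [
    {"module": "Product Information", "id": "M00001",
     "public": "Yes", "accessible": "No"},
    {"module": "Web Site Legal Statements", "id": "M00002",
     "public": "Yes", "accessible": "Yes"},
    {"module": "Brand Marketing Materials", "id": "M00003",
     "public": "Yes", "accessible": "Unknown"},
]


def needs_audit(module):
    """A public module needs auditing unless accessibility is confirmed."""
    return module["public"] == "Yes" and module["accessible"] != "Yes"


to_audit = [m["id"] for m in modules if needs_audit(m)]
```

Treating "Unknown" the same as "No" is the cautious choice here, since the objective is to demonstrate compliance and not merely to assume it.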

Analysis of organisation-wide content management requirements for the next five years to inform the selection of a comprehensive content management strategy and supporting infrastructure requirements.

The objective stated in this example is broad enough to warrant describing the expected deliverables in more detail. Listed below are some examples of useful deliverables for this project.

  1. A summary of projected storage requirements for the management of content within the organisation.
  2. Rights management requirements for content within the organisation.
  3. A mapping of content by content maintainer and maintenance tasks.
  4. A mapping of content by frequency of access and modification.
  5. A mapping of content by ROT state.
  6. A summary of the condition of content metadata and implications for the effectiveness of content use.

The content inventory created must provide all of the information required to perform each of the deliverable analyses. The omission of a required attribute at the content inventory stage may lengthen the schedule and increase the cost of the project.
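
As an example of how a deliverable depends on inventory attributes, projected storage requirements could be estimated from two surveyed values: current storage and rate of content creation. The Python sketch below assumes constant (linear) growth, which is a simplification; the function name and the figures are hypothetical.

```python
def project_storage_mb(current_mb, growth_mb_per_month, years):
    """Estimate future storage assuming a constant monthly growth rate."""
    return current_mb + growth_mb_per_month * 12 * years


# Hypothetical module: 850 MB today, growing by 30 MB per month.
five_year_estimate_mb = project_storage_mb(850.0, 30.0, years=5)
```

If either the current size or the rate of creation were omitted from the inventory, this estimate could not be produced without returning to the survey stage.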

References

  1. Taking a Content Inventory, Janice Crotty Fraser
  2. Doing a Content Inventory (Or, A Mind-Numbingly Detailed Odyssey Through Your Web Site), Jeffrey Veen
  3. http://www.disability.gov.uk/