Metadata: A Primer
Useful information for searching, authenticating, and tracking data.
Although metadata exists for all types of information, it is especially important for digital information, where it can make large volumes of data much easier to use. This is particularly true in litigation, because metadata is the core of most digital forensics and e-discovery processes.
There are countless types of metadata. Some are common and useful in nearly every case; others are obscure but critical in certain circumstances. Most types fall into one of three main categories.
Descriptive Metadata
This is the metadata most commonly used in litigation. It includes Title, Author, Subject, Date, and similar information about the data or document. It is convenient because it is frequently human-readable (unlike some types of metadata, which are meaningful only to a computer). Standardized fields are tracked for the most common types of data. For example, email metadata includes Date, Subject, Sender, and Recipient, and may include many other fields, such as Bcc Recipients or Read Date.
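As a rough illustration of how these standardized email fields can be read programmatically, the sketch below parses a made-up message with Python's standard email module. The sample message, the addresses, and the function name extract_email_metadata are assumptions for illustration only; real collections are normally processed with dedicated e-discovery tools.

```python
from email import message_from_string
from email.utils import getaddresses, parsedate_to_datetime

# Hypothetical sample message; not taken from any real matter.
RAW = """\
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.com>, carol@example.com
Subject: Q3 contract draft
Date: Tue, 05 Mar 2019 14:30:00 -0500

Please see the attached draft.
"""

def extract_email_metadata(raw):
    """Return the basic descriptive fields of an RFC 5322 style message."""
    msg = message_from_string(raw)
    return {
        "date": parsedate_to_datetime(msg["Date"]),       # timezone-aware datetime
        "subject": msg["Subject"],
        "sender": getaddresses([msg["From"]])[0][1],       # address part only
        "recipients": [addr for _, addr in getaddresses(msg.get_all("To", []))],
    }

meta = extract_email_metadata(RAW)
print(meta["sender"], meta["recipients"], meta["date"].isoformat())
```

Fields such as Bcc or a read date are often stored by the mail system rather than in the message headers, so a sketch like this would not necessarily see them.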
Descriptive metadata is essential for culling a large set of documents down to a manageable number for keyword searches or review, based on the relevant dates and custodians. Some descriptive metadata, such as hash values, can also be used to authenticate or compare documents.
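A minimal sketch of both ideas, culling by custodian and date range and comparing documents by hash value, might look like the following. The document records, the custodian names, and the helper names cull and sha256_hex are hypothetical; SHA-256 is used here only as one common choice of hash function.

```python
import hashlib
from datetime import date

# Hypothetical document set: each record carries descriptive metadata plus raw content.
documents = [
    {"id": 1, "custodian": "alice", "date": date(2019, 3, 5), "content": b"Draft contract v1"},
    {"id": 2, "custodian": "bob", "date": date(2018, 1, 10), "content": b"Lunch plans"},
    {"id": 3, "custodian": "alice", "date": date(2019, 4, 1), "content": b"Draft contract v1"},
    {"id": 4, "custodian": "dave", "date": date(2019, 3, 20), "content": b"Unrelated memo"},
]

def cull(docs, custodians, start, end):
    """Keep only documents from the given custodians within the inclusive date range."""
    return [d for d in docs if d["custodian"] in custodians and start <= d["date"] <= end]

def sha256_hex(data):
    """Return the SHA-256 hash of the content as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

kept = cull(documents, {"alice", "bob"}, date(2019, 1, 1), date(2019, 12, 31))
kept_ids = [d["id"] for d in kept]

# Identical content yields identical hash values, regardless of other metadata.
same = sha256_hex(documents[0]["content"]) == sha256_hex(documents[2]["content"])
print(kept_ids, same)
```

Here documents 1 and 3 survive the cull, and their matching hashes show that the content is byte-for-byte identical even though their dates differ.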
Administrative Metadata
Administrative metadata includes information useful for managing data, such as file system details. A common type of administrative metadata is MAC times: timestamps that show when a file was modified, accessed, or created. (On Unix-like systems, the "C" instead records when the file's own metadata last changed.) These timestamps can be valuable in an investigation because they reveal details about the history of a file, which can help establish or challenge its authenticity or determine a particular user’s activities.
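For illustration, the file system timestamps described here can be read with Python's os.stat. This is only a sketch: the function name file_timestamps is hypothetical, the meaning of st_ctime depends on the platform, and forensic work is normally done on a verified copy with purpose-built tools rather than on the live file.

```python
import os
import tempfile
from datetime import datetime, timezone

def file_timestamps(path):
    """Return a file's MAC-style timestamps as timezone-aware UTC datetimes."""
    st = os.stat(path)

    def to_utc(ts):
        return datetime.fromtimestamp(ts, tz=timezone.utc)

    return {
        "modified": to_utc(st.st_mtime),
        "accessed": to_utc(st.st_atime),
        # Platform-dependent: metadata-change time on Unix-like systems,
        # historically the creation time on Windows.
        "ctime": to_utc(st.st_ctime),
    }

# Usage on a throwaway file created just for this example.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example")
    sample_path = tmp.name

stamps = file_timestamps(sample_path)
for name, value in stamps.items():
    print(name, value.isoformat())
os.remove(sample_path)
```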
Structural Metadata
Structural metadata describes how discrete units of data relate to each other. Bates numbers are a common example found in litigation. These numbers do not describe the content of a document; instead, they show how pages and documents relate to one another, such as the order of the pages and which documents belong to which production.
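One way to picture this is a sketch that assigns sequential Bates ranges to a production. The prefix "ABC", the six-digit width, the file names, and the function name assign_bates are all assumptions for illustration; actual numbering schemes vary by matter and by party, and every document is assumed to have at least one page.

```python
def assign_bates(documents, prefix, start=1, width=6):
    """Assign one sequential Bates number per page.

    documents: list of (name, page_count) tuples, in production order.
    Returns one record per document with its first and last Bates number.
    """
    result = []
    next_number = start
    for name, pages in documents:
        begin = f"{prefix}{next_number:0{width}d}"
        end = f"{prefix}{next_number + pages - 1:0{width}d}"
        result.append({"document": name, "begin": begin, "end": end})
        next_number += pages  # the next document starts right after this one
    return result

production = assign_bates(
    [("contract.pdf", 3), ("email.msg", 1), ("report.pdf", 12)],
    prefix="ABC",
)
for record in production:
    print(record["document"], record["begin"], record["end"])
```

The ranges say nothing about what each document contains, only where it sits in the production: the three-page contract occupies ABC000001 through ABC000003, and the report that follows the email ends at ABC000016.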