HTML4 has several mechanisms for describing meta data, the best known being the <meta> element. <meta> is often used to denote the author of a document:
<meta name="author" lang="tr" content="Tantek Çelik" />
Briefly explained:
Each META element specifies a property/value pair. The name attribute identifies the property and the content attribute specifies the property's value.
HTML4 mentions a half dozen or so example properties, which are fairly self-explanatory: author, keywords, copyright, date, identifier. But in the definition of the 'name' attribute of the <meta> element, the HTML4 specification takes great care to state that "This specification does not list legal values for this attribute."
Instead, HTML4 provides a mechanism (the 'profile' attribute of the <head> element) to point to a meta data profile that defines properties and values. And again, the specification states explicitly that it "does not define formats for profiles".
This proposal seeks to define a meta data profile format using principles of simplicity, reuse, and minimalism. The constraints, direction and hints for the format are derived from the HTML4 specification, and the building blocks of the format are taken from XHTML 1.0. Since the format is a subset of XHTML and therefore a profile itself, the format is called the XHTML Meta Data Profile, abbreviated as XMDP.
The format should be simple, both to write and to read. The format should be easy for humans to author by hand.
The format should attempt to reuse elements of already established formats whenever possible, and avoid inventing new terms if at all possible.
The format should be as small as possible, and no smaller. It should satisfy the constraints, direction, and needs implied by HTML4, and no more.
The following passages are quoted from the HTML4 specification. Strong emphasis added for clarity.
Authors may wish to define additional link types not described in this specification. If they do so, they should use a profile to cite the conventions used to define the link types. Please see the profile attribute of the HEAD element for more details.
profile %URI; #IMPLIED -- named dictionary of meta info --
This attribute specifies the location of one or more meta data profiles, separated by white space. For future extensions, user agents should consider the value to be a list even though this specification only considers the first URI to be significant. Profiles are discussed below in the section on meta data.
This specification does not define a set of legal meta data properties. The meaning of a property and the set of legal values for that property should be defined in a reference lexicon called a profile. For example, a profile designed to help search engines index documents might define properties such as "author", "copyright", "keywords", etc.
Referring to a profile where the property and its legal values are defined.
User agents may dereference the URI and perform some activity based on the actual definitions within the profile (e.g., authorize the usage of the profile within the current HTML document). This specification does not define formats for profiles.
From these hints, the profile as a whole is represented as a definition list, <dl>.
The set of legal values for each property is represented as a nested definition list <dl> of values for each property.
A definition list is also a natural way to represent a lexicon. A reasonable way to "call" that lexicon a "profile" is to annotate the respective <dl> with a 'class' attribute with the value "profile".
The outer definition list's definition terms <dt> contain the property names, and their respective definitions <dd> contain a description of the property as well as the aforementioned nested definition list of its valid values. The nested definition list's definition terms <dt> contain the value names, and their respective definitions <dd> contain their respective descriptions.
The XHTML profile format consists of a definition list of properties as definition terms, and as their definitions, an optional brief description, and then, if applicable, one or more definition list(s) of values.
First the profile definition list, recognizable by its class:
<dl class="profile">
Note that the HTML4 'class' attribute is a space separated set of values. All that is required is for the value "profile" to be in that set.
Next a definition term and definition for a property:
<dt id='property1'>property1</dt>
<dd>
The property name is given an 'id' attribute so pages can reference the property in particular with a URL with the appropriate fragment identifier. The 'id' attribute need not be the same as the name of the property, but probably should be for the sake of simplicity.
Any amount of valid optional markup (except for definition lists of course) may be used to provide a prose description and/or references for the property.
<p>Authors may use property1 to describe
some particular details.
</p>
One or more nested definition list(s) for the values and their definitions. If the values do not form a discrete set, or if that set should be too large to practically enumerate, a simple prose description of the set of legal values and any type constraints will suffice.
<dl>
<dt id='value1'>value1</dt>
<dd>definition of value1</dd>
<dt id='value2'>value2</dt>
<dd>definition of value2</dd>
...
</dl>
...
</dd>
Again 'id' attributes are used so pages can reference a specific value using a URL to the profile with the fragment identifier for the value. And again the 'id' attribute need not be the same as the name of the value.
Perhaps another property:
<dt id='property2'>property2</dt>
And its values description instead:
<dd>
Property2 contains a space separated set of values,
each of which is a date in the ISO8601 date format.
</dd>
Etc., and finally closure of the outer definition list:
...
</dl>
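As a small illustration of how the 'id' attributes make individual terms addressable, the following Python sketch builds the URL for a property or value from a profile URI and an 'id'. The function name `term_uri` is hypothetical and not part of XMDP; it assumes the 'id' is a plain HTML4 name token that needs no escaping.

```python
from urllib.parse import urldefrag


def term_uri(profile_uri, term_id):
    """Return a URL pointing at one property or value inside a profile.

    Hypothetical helper: any fragment already present on profile_uri is
    dropped, and term_id is assumed to need no percent-encoding.
    """
    base, _old_fragment = urldefrag(profile_uri)
    return base + "#" + term_id


print(term_uri("http://gmpg.org/xmdp/samplehtmlprofile.html", "author"))
```

With the sample profile described in this document, the 'author' property would be addressed as http://gmpg.org/xmdp/samplehtmlprofile.html#author.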
The format may be embedded anywhere an HTML4 definition list may be embedded. Being well formed XML, the profile format may also be embedded in any XML document that permits embedding of XHTML.
A self-standing profile document can be simply constructed by wrapping the profile format with the minimal XHTML necessary for a valid XHTML document.
XMDP profile documents are typically HTML or XHTML documents (or both), and should be sent with the respective MIME type, i.e. 'text/html' for HTML or HTML-compatible XHTML 1.0, or 'application/xhtml+xml' for XHTML.
The various meta properties used informatively in HTML4 could be defined by the following profile document (also available online: samplehtmlprofile.html):
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head><title>sample HTML profile</title></head>
<body>
<dl class="profile">
<dt id='author'>author</dt>
<dd>A person who wrote (at least part of) the document.</dd>
<dt id='keywords'>keywords</dt>
<dd>A comma and/or space separated list of the
keywords or keyphrases of the document.</dd>
<dt id='copyright'>copyright</dt>
<dd>The name (or names) of the copyright holder(s)
for this document, and/or a complete statement of copyright.</dd>
<dt id='date'>date</dt>
<dd>The last updated date of the document, in ISO8601 date format.</dd>
<dt id='identifier'>identifier</dt>
<dd>The normative URI for the document.</dd>
<dt id='rel'>rel</dt>
<dd>
<dl>
<dt id='script'>script</dt>
<dd>A reference to a client-side script. When used with the
LINK element, the script is evaluated as the document loads and
may modify the contents of the document dynamically.</dd>
</dl>
</dd>
</dl>
</body>
</html>
For document authors, HTML4.01 describes the 'profile' attribute for referring to profiles. In short, to refer to a profile from any (X)HTML document, simply add a 'profile' attribute to the document's head element, <head>, e.g. to reference the above samplehtmlprofile.html profile:
<head profile='http://gmpg.org/xmdp/samplehtmlprofile.html'>
Similarly, tools can load and cache one or more XMDP profiles by reading the 'profile' attribute and treating it as a space separated set of URIs. (The above example demonstrates the simple case of one profile, but the attribute may reference several.) Tools then retrieve the profiles addressed by those URIs and construct a dictionary of properties and values by parsing the definition lists <dl>, terms <dt>, and definitions <dd> as specified in the XMDP Format Description above.
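The dictionary-building step might be sketched as follows, using only Python's standard `html.parser` module. This is a minimal sketch under stated assumptions, not a reference implementation: the names `XMDPParser`, `parse_xmdp`, and `SAMPLE` are hypothetical, the profile text is assumed to be already retrieved, and only property names, value names, and value descriptions are recorded (prose descriptions of properties are skipped).

```python
from html.parser import HTMLParser


class XMDPParser(HTMLParser):
    """Hypothetical helper: collect {property: {value: description}}."""

    def __init__(self):
        super().__init__()
        self.profile = {}     # property name -> {value name: description}
        self.depth = 0        # <dl> nesting depth inside a profile list
        self.open_tag = None  # "dt" or "dd" while its text is collected
        self.text = []
        self.prop = None      # property currently being defined
        self.value = None     # value currently being defined

    def handle_starttag(self, tag, attrs):
        if tag == "dl":
            classes = (dict(attrs).get("class") or "").split()
            # Enter on class "profile"; count nested lists once inside.
            if self.depth or "profile" in classes:
                self.depth += 1
        elif tag in ("dt", "dd") and self.depth:
            self.open_tag, self.text = tag, []

    def handle_data(self, data):
        if self.open_tag:
            self.text.append(data)

    def handle_endtag(self, tag):
        if not self.depth:
            return
        if tag == "dl":
            self.depth -= 1
            self.open_tag = None
        elif tag == self.open_tag:
            content = " ".join("".join(self.text).split())
            if tag == "dt" and self.depth == 1:      # property name
                self.prop, self.value = content, None
                self.profile.setdefault(content, {})
            elif tag == "dt" and self.depth == 2:    # value name
                self.value = content
            elif tag == "dd" and self.depth == 2 and self.prop and self.value:
                self.profile[self.prop][self.value] = content
            self.open_tag = None


def parse_xmdp(html_text):
    """Parse profile markup and return the collected dictionary."""
    parser = XMDPParser()
    parser.feed(html_text)
    parser.close()
    return parser.profile


# Abbreviated fragment modeled on the sample profile in this document.
SAMPLE = """<dl class="profile">
 <dt id='author'>author</dt>
 <dd>A person who wrote (at least part of) the document.</dd>
 <dt id='rel'>rel</dt>
 <dd>
  <dl>
   <dt id='script'>script</dt>
   <dd>A reference to a client-side script.</dd>
  </dl>
 </dd>
</dl>"""

print(parse_xmdp(SAMPLE))
```

Properties whose values are described only in prose, such as 'author' here, end up with an empty value dictionary.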
HTML4.01 states that "one or more meta data profiles, [are] separated by white space". The term "white space" is used interchangeably with "white space characters" in HTML4.01, which defines white space in terms of a set of characters. Note that HTML4.01 contains a few other space separated attributes, such as the 'class' attribute and the 'rel' attribute, and thus the treatment of space-separated values is fairly well understood. To keep it simple, authors should use a single space character to delimit more than one profile URI in the 'profile' attribute, e.g.:
<head profile='http://example.org/p1 http://example.org/p2'>
Tools, however, should expect any amount/sequence of white space between URIs, where such white space consists of one or more occurrences of the white space characters as defined in HTML4.01.
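A tolerant splitter along these lines could look like the Python sketch below. The function name `split_profile_attribute` is hypothetical. The character set is an assumption based on the white space characters listed in HTML4.01 (space, tab, form feed, zero-width space) together with the line break characters CR and LF.

```python
import re

# Assumed HTML4.01 white space: space, tab, form feed, zero-width space,
# plus the line break characters CR and LF.
_PROFILE_SEPARATOR = re.compile(r"[ \t\f\r\n\u200b]+")


def split_profile_attribute(value):
    """Split a 'profile' attribute value into a list of URIs, in order."""
    return [uri for uri in _PROFILE_SEPARATOR.split(value) if uri]


print(split_profile_attribute(
    "http://example.org/p1 \t\n http://example.org/p2"
))
```

An explicit character class is used rather than Python's argument-less `str.split()`, because the latter splits on a different set of characters (for example, it does not treat the zero-width space as a separator).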
HTML 4.01 says that "this specification only considers the first URI to be significant". To reference and use multiple profiles, this portion of the spec must be extended slightly to allow all URIs in the 'profile' attribute to have some meaning.
However, clearly HTML4.01 shows a bias towards the first URI rather than later URIs. Thus, consistent with that bias, the URIs in the 'profile' attribute are to be treated most significant (first) to least significant (last). Such relative significance only makes a difference when profiles attempt to define the same property(s) and/or value(s). Thus if two or more profiles define the same term, the earliest (first) of those profiles wins, and its definition for that term is used.
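The precedence rule can be sketched in Python as follows. This is an illustration only: `merge_profiles` is a hypothetical name, and each profile is modeled as a flat dictionary mapping a term (property or value name) to its definition, which is a simplifying assumption rather than a structure defined by XMDP.

```python
def merge_profiles(profiles):
    """Merge profiles given in 'profile' attribute order.

    profiles: list of dicts mapping term -> definition, most significant
    (first URI) to least significant (last URI).
    """
    merged = {}
    for profile in profiles:
        for term, definition in profile.items():
            # setdefault keeps an existing entry, so the earliest
            # profile that defines a term wins.
            merged.setdefault(term, definition)
    return merged


p1 = {"author": "definition from p1"}
p2 = {"author": "definition from p2", "date": "definition from p2"}
print(merge_profiles([p1, p2]))
```

Here 'author' takes its definition from the first profile, while 'date', which only the second profile defines, is still available.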