Capt. Horatio T.P. Webb | The DTD Parks -- FALL 2000 Last Updated 11AM 8/29/2000
An XML file contains nested tags (the markup) and the content (the data between the tags). An XML file may also contain an optional DTD (Document Type Definition) that specifies requirements for the tags and the XML structure. When a DTD is used, its definitions appear before the content, like this:
<?xml version="1.0"?>
<!DOCTYPE name-of-document-type [
.
.
. Element definitions fit inside these square brackets for this type of document
.
.
]>
Following the DTD are the XML tags and data. The first and last tags are:
<name-of-document-type>
.
. the detailed XML tags and data are contained within this outside tag pair
.
</name-of-document-type>
These two "outside" tags thus contain the whole collection of XML tags and data, just as the <BODY></BODY> tag pair encloses the content of an HTML page.
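Putting the two pieces together, a complete file might look like the following minimal sketch. The document type name greeting and its text are made up for illustration, and the single ELEMENT line simply says that a greeting holds plain text.

```xml
<?xml version="1.0"?>
<!-- "greeting" is a hypothetical document type used only for this sketch -->
<!DOCTYPE greeting [
<!-- the DTD: one definition, saying a greeting holds plain text -->
<!ELEMENT greeting (#PCDATA) >
]>
<!-- the content: the outside tag pair matches the DOCTYPE name -->
<greeting>Hello, world</greeting>
```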
Tag names ARE CASE SENSITIVE. <FRED> and <fred> are different tags.
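As a small illustration of case sensitivity (the tag names here are hypothetical), the first two elements below are well formed and distinct, while the commented-out line shows a mismatch that a parser should reject.

```xml
<?xml version="1.0"?>
<names>
  <FRED>opened and closed as FRED</FRED>
  <fred>a different element, opened and closed as fred</fred>
  <!-- not well formed, because the closing tag does not match the opening tag:
       <Fred>mismatched</fred>
  -->
</names>
```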
The nested element definitions then appear in order between the square brackets of the DTD. For each element of the name-of-document-type, its definition in the DTD will appear as:
<!ELEMENT name-of-element ( list-of-elements or data definition) >
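The list-of-elements names the elements (i.e., tags) that may appear inside the ELEMENT, separated by commas in the order required. Each name may be followed by a suffix that specifies how often it occurs: no suffix (exactly one), + (one or more), * (zero or more), or ? (zero or one).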
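The following sketch uses each occurrence suffix once in a single definition. The memo document type and all of its element names are hypothetical; the comment inside the DTD notes what each suffix requires.

```xml
<?xml version="1.0"?>
<!DOCTYPE memo [
<!-- to+     : one or more
     cc*     : zero or more
     subject : exactly one
     note?   : zero or one -->
<!ELEMENT memo (to+, cc*, subject, note?) >
<!ELEMENT to (#PCDATA) >
<!ELEMENT cc (#PCDATA) >
<!ELEMENT subject (#PCDATA) >
<!ELEMENT note (#PCDATA) >
]>
<memo>
<to>Purchasing</to>
<to>Receiving</to>
<subject>Glove order</subject>
</memo>
```

The sample content leaves out cc and note entirely, which the * and ? suffixes permit.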
For example, a purchase-order might contain:
The DTD would then begin like this:
<?xml version="1.0"?>
<!DOCTYPE purchase-order [
<!ELEMENT purchase-order (buyer-name, address+, city, state, zip, order-line+) >
.
.
.
]>
Thus the purchase-order document has a purchase-order as its root. The purchase-order has:
one buyer-name tag pair, one or more address tag pairs, one each of the city, state and zip tag pairs, and one or more order-line tag pairs. Each order-line has a set of product, quantity and price tags.
(#PCDATA) means that the content of the ELEMENT (i.e., the value between the tag pairs) is parsed character data. PCDATA cannot contain a literal "<" or "&" (and ">" should be escaped as well). To include these characters as data use "&lt;" for <, "&gt;" for >, and "&amp;" for &. You can also enclose data in a CDATA section, <![CDATA[ ... ]]>, which is unparsed character data where the characters <, >, and & are allowed.
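A brief sketch of both ways to carry the reserved characters follows; the remarks and remark elements are hypothetical. The first remark uses the entity references, and the second wraps the same text in a CDATA section.

```xml
<?xml version="1.0"?>
<!DOCTYPE remarks [
<!ELEMENT remarks (remark+) >
<!ELEMENT remark (#PCDATA) >
]>
<remarks>
<!-- entity references stand in for the reserved characters -->
<remark>price &lt; 50 &amp; quantity &gt; 1</remark>
<!-- inside a CDATA section the same characters may appear as-is -->
<remark><![CDATA[price < 50 & quantity > 1]]></remark>
</remarks>
```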
The full DTD would be:
Then adding some sample content data, the full xml file would be:
If we wished to have multiple purchase orders, we could simply modify the DTD to be:
Now a stack-of-purchase-orders is just one or more (+) purchase-order tag pairs.
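For illustration, a file built on the stack-of-purchase-orders definition might look like the sketch below. The DTD is included so the file is complete; the second buyer and order are invented sample data.

```xml
<?xml version="1.0"?>
<!DOCTYPE stack-of-purchase-orders [
<!ELEMENT stack-of-purchase-orders (purchase-order+) >
<!ELEMENT purchase-order (buyer-name, address+, city, state, zip, order-line+) >
<!ELEMENT buyer-name (#PCDATA) >
<!ELEMENT address (#PCDATA) >
<!ELEMENT city (#PCDATA) >
<!ELEMENT state (#PCDATA) >
<!ELEMENT zip (#PCDATA) >
<!ELEMENT order-line (product, quantity, price) >
<!ELEMENT product (#PCDATA) >
<!ELEMENT quantity (#PCDATA) >
<!ELEMENT price (#PCDATA) >
]>
<stack-of-purchase-orders>
<purchase-order>
<buyer-name>Michael S. Parks</buyer-name>
<address>4099 Bayview Street</address>
<address>Apartment 5</address>
<city>Houston</city>
<state>TX</state>
<zip>77001</zip>
<order-line>
<product>Gloves</product>
<quantity>1</quantity>
<price>19.95</price>
</order-line>
</purchase-order>
<!-- second order: hypothetical buyer, added only to show repetition -->
<purchase-order>
<buyer-name>Jane Q. Example</buyer-name>
<address>100 Main Street</address>
<city>Austin</city>
<state>TX</state>
<zip>78701</zip>
<order-line>
<product>Wool Sweater</product>
<quantity>2</quantity>
<price>49.95</price>
</order-line>
</purchase-order>
</stack-of-purchase-orders>
```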
Any ELEMENT may also have attributes. These are parameters that describe the ELEMENT; an element need not have any. Many HTML tags use attributes. The <BODY> tag, for example, can have attributes like BGCOLOR, LINK, VLINK, ALINK, etc. Attributes are set inside the beginning tag. The general format for the attribute definition in the DTD is:
#REQUIRED means the attribute must always be present
For example, if the product element always needs a buyer-size attribute of either "S", "M", "L", or "XL", the product ELEMENT in the DTD would be:
<!ELEMENT product (#PCDATA)>
A typical content tag for product might be:
<product buyer-size="M">gloves</product>
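To see the three default keywords side by side, here is a sketch of a product element with one attribute of each kind. Only buyer-size comes from this page's example; the color and units attributes, and the cut-down order-line definition, are assumptions added for illustration.

```xml
<?xml version="1.0"?>
<!DOCTYPE order-line [
<!ELEMENT order-line (product+) >
<!ELEMENT product (#PCDATA) >
<!-- buyer-size must be present and must be one of the listed values -->
<!ATTLIST product buyer-size ( S | M | L | XL ) #REQUIRED >
<!-- color (hypothetical) is free text and may be left out -->
<!ATTLIST product color CDATA #IMPLIED >
<!-- units (hypothetical) always has the value "each" -->
<!ATTLIST product units CDATA #FIXED "each" >
]>
<order-line>
<product buyer-size="M" color="gray">gloves</product>
<product buyer-size="XL">wool sweater</product>
</order-line>
```

A validating parser should reject a product tag that omits buyer-size or gives it a value outside the list, while color can be dropped freely.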
You can check your XML syntax with the online XML validator hosted at the University of Edinburgh:
http://www.ltg.ed.ac.uk/~richard/xml-check.html
Be sure to check "validate" on this page. Several other XML validators are also shown.
<?xml version="1.0"?>
<!DOCTYPE purchase-order [
<!ELEMENT purchase-order (buyer-name, address+, city, state, zip, order-line+) >
<!ELEMENT buyer-name (#PCDATA) >
<!ELEMENT address (#PCDATA) >
<!ELEMENT city (#PCDATA) >
<!ELEMENT state (#PCDATA) >
<!ELEMENT zip (#PCDATA) >
<!ELEMENT order-line ( product, quantity, price) >
<!ELEMENT product (#PCDATA) >
<!ELEMENT quantity (#PCDATA) >
<!ELEMENT price (#PCDATA) >
]>
<?xml version="1.0"?>
<!DOCTYPE purchase-order [
<!ELEMENT purchase-order (buyer-name, address+, city, state, zip, order-line+) >
<!ELEMENT buyer-name (#PCDATA) >
<!ELEMENT address (#PCDATA) >
<!ELEMENT city (#PCDATA) >
<!ELEMENT state (#PCDATA) >
<!ELEMENT zip (#PCDATA) >
<!ELEMENT order-line ( product, quantity, price) >
<!ELEMENT product (#PCDATA) >
<!ELEMENT quantity (#PCDATA) >
<!ELEMENT price (#PCDATA) >
]>
<purchase-order>
<buyer-name>Michael S. Parks</buyer-name>
<address>4099 Bayview Street</address>
<address>Apartment 5</address>
<city>Houston</city>
<state>TX</state>
<zip>77001</zip>
<order-line>
<product>Wool Sweater</product>
<quantity>2</quantity>
<price>49.95</price>
</order-line>
<order-line>
<product>Gloves</product>
<quantity>1</quantity>
<price>19.95</price>
</order-line>
</purchase-order>
<!DOCTYPE stack-of-purchase-orders [
<!ELEMENT stack-of-purchase-orders (purchase-order+) >
<!ELEMENT purchase-order (buyer-name, address+, city, state, zip, order-line+) >
<!ELEMENT buyer-name (#PCDATA) >
<!ELEMENT address (#PCDATA) >
<!ELEMENT city (#PCDATA) >
<!ELEMENT state (#PCDATA) >
<!ELEMENT zip (#PCDATA) >
<!ELEMENT order-line ( product, quantity, price) >
<!ELEMENT product (#PCDATA) >
<!ELEMENT quantity (#PCDATA) >
<!ELEMENT price (#PCDATA) >
]>
<!ATTLIST name-of-element name-of-attribute
    CDATA  or  ( list-of-attribute-values separated by |'s )
    #REQUIRED  or  #IMPLIED  or  #FIXED "default value" >
#IMPLIED means that the attribute has no default value and is NOT required
#FIXED means the default value cannot be replaced
<!ATTLIST product buyer-size ( S | M | L | XL ) #REQUIRED >