9 XML SCHEMA (PART I)

Dr M. Vijayalakshmi

XML Schemas

“Schema” is a general term used in many contexts, and DTDs are themselves one form of XML schema. According to the dictionary, a schema is “a structured framework or plan”. When we say “XML Schemas,” we usually mean the W3C XML Schema Language, also known as the “XML Schema Definition” language, or XSD. An XML Schema consists of components such as type definitions and element declarations that can be used to assess the validity of well-formed element and attribute information items in an XML document.

Why do we need XML Schemas?

Some of the reasons why we need XML schemas are listed below. DTDs provide a very weak specification language: we cannot put any restrictions on text content, and we have very little control over mixed content (text plus elements). Moreover, we have little control over the ordering of elements.

DTDs are written in their own (non-XML) syntax, so separate parsers are needed for DTDs and for XML. The XML Schema Definition (XSD) language solves these problems by giving us much more control over structure and content, and XSD itself is written in XML.

Why not XML schemas?

Even though XML Schemas support many features that DTDs lack, DTDs are still more widely used. They have been around longer than XSD, and more tools support them. XSD is very verbose, even by XML standards, and the more advanced XML Schema constructs can be non-intuitive and confusing. Nevertheless, XSD is not likely to go away soon, for the reasons explained earlier.

Referring to a schema

To refer to a DTD in an XML document, the reference goes before the root element, as shown below:

<?xml version="1.0"?>
<!DOCTYPE rootElement SYSTEM "url">

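To refer to an XML Schema in an XML document, the reference goes in the root element, as shown below:

<?xml version="1.0"?>
<rootElement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:noNamespaceSchemaLocation="url.xsd">
(This is where the XML Schema definition can be found) …
</rootElement>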

The XSD document

The XSD is written in XML and the file extension is .xsd. The root element is <schema>.

The XSD starts like this:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

<schema>

The root element <schema> may have attributes. One of them is the namespace declaration, which is necessary in order to use the predefined schema elements defined in that namespace.

xmlns:xs="http://www.w3.org/2001/XMLSchema"

This declaration is necessary to specify where all our XSD tags are defined.

elementFormDefault="qualified"

This means that all XML elements must be qualified, that is, they must belong to a namespace. It is highly desirable to qualify all elements; otherwise problems will arise when another schema is added.

Sample Schema header

The <schema> element may contain some more attributes. A schema declaration often looks something like this:

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
… …
</xs:schema>

The following fragment,

xmlns:xs="http://www.w3.org/2001/XMLSchema"

indicates that the elements and data types used in the schema (i.e., schema, element, complexType, sequence, string, boolean, etc.) come from the “http://www.w3.org/2001/XMLSchema” namespace. It also specifies that the elements and data types that come from the “http://www.w3.org/2001/XMLSchema” namespace should be prefixed with ‘xs:’.

The following fragment,

targetNamespace="http://www.w3schools.com"

indicates that the elements defined by this schema (note, to, from, heading, body) come from the “http://www.w3schools.com” namespace.

The following fragment,

xmlns="http://www.w3schools.com"

indicates that the default namespace is “http://www.w3schools.com”.

The following fragment,

elementFormDefault="qualified"

indicates that any elements used by the XML instance document which were declared in this schema must be namespace qualified.
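To see how these four attributes work together, here is a minimal sketch of an instance document that could refer to such a schema. The file name note.xsd, the nesting of to, from, heading and body inside note, and the text values are assumptions made for illustration; the source gives only the element names.

```xml
<?xml version="1.0"?>
<!-- Hypothetical instance document; note.xsd is an assumed file name. -->
<!-- The default namespace matches the schema's targetNamespace, so every
     unprefixed element below is namespace qualified, as required by
     elementFormDefault="qualified". -->
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Reader</to>
  <from>Author</from>
  <heading>Reminder</heading>
  <body>Placeholder text for illustration only.</body>
</note>
```

Note that xsi:schemaLocation takes a pair of values, the namespace followed by the location of the schema, whereas xsi:noNamespaceSchemaLocation is used when the schema has no target namespace.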

An example XML instance document is given below:

<book>
  <title> … </title>
  <author> … </author>
  <qualification> XXXXXX </qualification>
</book>

The schema for the above XML file could be written as follows:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="qualification" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>

Let us learn how to write XML schemas in the forthcoming sections.

What’s in a Schema?

A schema is an XML document, whereas a DTD is not. Because a schema is an XML document, it must have a root element, and for any XSD file that root element is <schema>. Within the root element there can first be any number and combination of inclusions, imports, redefinitions and annotations. These are followed by any number and combination of simple and complex type definitions, element and attribute declarations, model group definitions and annotations.

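Structure of a Schema

<schema>
  <!-- any number of the following: inclusions, imports, redefinitions, annotations -->
  <!-- any number of the following definitions: simple and complex types, elements, attributes, model groups, annotations -->
</schema>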
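The two-part layout described above can be sketched as a small but complete schema. The type and element names (isbnType, bookType, book) and the commented-out file name common-types.xsd are assumptions chosen for illustration, not part of the original material.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Part 1: inclusions, imports, redefinitions and annotations -->
  <xs:annotation>
    <xs:documentation>Illustrative schema skeleton.</xs:documentation>
  </xs:annotation>
  <!-- An inclusion such as
       <xs:include schemaLocation="common-types.xsd"/>
       would also go here; the file name is hypothetical. -->

  <!-- Part 2: type definitions and element/attribute declarations -->
  <xs:simpleType name="isbnType">
    <xs:restriction base="xs:string"/>
  </xs:simpleType>

  <xs:complexType name="bookType">
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="isbn" type="isbnType"/>
  </xs:complexType>

  <xs:element name="book" type="bookType"/>

</xs:schema>
```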

XML Schema Components

Abstract Data Model
Simple and Complex Type Definitions
Declarations
Relationship among Schema Components

The W3C defined an abstract data model to describe XML schemas, and it is composed of schema components. We can write XML schemas using these components as building blocks.

XML Abstract Data Model

The XML Abstract Data Model is composed of schema components. This abstract model is used to describe XML Schemas.

Schema Component

It is the generic term used for the building blocks that make up the abstract data model of the schema. There are 13 (thirteen) kinds of schema component defined in the abstract model, and they are classified into three groups: primary, secondary and helper. The complete list is given below.

We will then present how to compose schemas from these components.

Primary Group

1. Simple type definitions

2. Complex type definitions

3. Attribute declarations

4. Element declarations

Secondary Group

5. Attribute group definitions

6. Identity-constraint definitions

7. Model group definitions

8. Notation declarations

Helper Group

9. Annotations

10. Model groups

11. Particles

12. Wildcards

13. Attribute Uses

XML document & XML Schema

Figure 9.1 describes the differences between an XML document and the XML schema created for it. An XML document is saved as a .xml file and contains the information items themselves. The XML schema for that document is saved as a .xsd file and is composed of the declarations and type definitions of the elements and attributes.

There are two kinds of activity when writing schemas: declaration and definition. All items in an XML document must conform to, and are validated against, the declarations in the schema. The most important declarations are element declarations and attribute declarations, so elements and attributes must be declared in the schema. To declare them, their types must be defined, much as classes are defined in OOP.

Declaration & Definition

Declaration components are associated by (qualified) name with the information items being validated. Declaring them is like declaring objects in OOP.

Definition components define internal schema components that can be used in other schema components. A type definition is like a class definition in OOP.

Declarations

Declarations are done for the following information items.

• Element Declaration

• Attribute Declaration

• Notation Declaration

There are three kinds of declaration: element, attribute and notation. All of them are components in the abstract data model. In an attribute declaration, we can use attribute uses to indicate whether the attribute is required, whether its value is fixed, whether it holds a default value, and so on. Attribute use is an optional component in the abstract data model.

An example of an element declaration is given below. Here we use <element> to declare ‘book’, and its type is defined inline as a complex type.

<xs:element name="book">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="title" type="xs:string"/>
      … …
    </xs:sequence>
    <xs:attribute name="isbn" type="xs:string"/>
  </xs:complexType>
</xs:element>
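The attribute uses mentioned earlier (required, default and fixed values) can be sketched on the same ‘book’ element. The attributes language, format and edition are hypothetical additions for illustration; only isbn appears in the original example.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
      </xs:sequence>
      <!-- must be present in every book element -->
      <xs:attribute name="isbn" type="xs:string" use="required"/>
      <!-- optional; treated as "en" when omitted (assumed attribute) -->
      <xs:attribute name="language" type="xs:string" default="en"/>
      <!-- optional; if present, its value must be "paperback" (assumed attribute) -->
      <xs:attribute name="format" type="xs:string" fixed="paperback"/>
      <!-- explicitly optional, with no default (assumed attribute) -->
      <xs:attribute name="edition" type="xs:positiveInteger" use="optional"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Under this sketch, an element such as `<book isbn="123"><title>T</title></book>` would be valid, while a book element without an isbn attribute would not.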

Type Definitions

Type definitions allow applications to perform type checking, and they can also be reused.

There are two kinds of type definition: simple and complex.

Simple Type Definition VS. Complex Type Definition

The simple type definition can be used to declare attributes and elements without element children, while the complex type definition can only be used to declare elements. Figure 9.2 pictorially represents the use of simple and complex type definitions.

<xs:complexType name="bookType">
  <xs:sequence>
    <xs:element name="title" type="nameType"/>
    … …
  </xs:sequence>
  <xs:attribute name="isbn" type="isbnType" use="required"/>
</xs:complexType>

In this example, we define a complex type named “bookType”. Note that we use <complexType> to define the type. We can then declare an element ‘book’ based on bookType somewhere in the file:

<xs:element name="book" type="bookType"/>
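The bookType example refers to two further types, nameType and isbnType, which the text does not define. A complete schema might supply them along the following lines; the facets chosen here (a length limit and a 13-digit pattern) are purely assumptions for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Assumed definition: a non-empty name of at most 100 characters -->
  <xs:simpleType name="nameType">
    <xs:restriction base="xs:string">
      <xs:minLength value="1"/>
      <xs:maxLength value="100"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Assumed definition: exactly 13 digits -->
  <xs:simpleType name="isbnType">
    <xs:restriction base="xs:string">
      <xs:pattern value="[0-9]{13}"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Definition: a reusable named complex type -->
  <xs:complexType name="bookType">
    <xs:sequence>
      <xs:element name="title" type="nameType"/>
    </xs:sequence>
    <xs:attribute name="isbn" type="isbnType" use="required"/>
  </xs:complexType>

  <!-- Declaration: an element that uses the type -->
  <xs:element name="book" type="bookType"/>

</xs:schema>
```

This mirrors the OOP analogy used above: the three type definitions play the role of classes, and the single element declaration plays the role of an object of that class.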

Simple Types

What is an element with simple type?

A simple element is an XML element that can contain only text. It cannot contain any other elements or attributes. We can also add restrictions (facets) to a data type in order to limit its content, and we can require the data to match a defined pattern.
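As a minimal sketch of these ideas, the schema below declares two simple elements with built-in types and a third whose content is limited by a pattern facet. The element names and the six-digit pattern are hypothetical choices, not taken from the original material.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Simple elements: text only, no child elements or attributes -->
  <xs:element name="lastname" type="xs:string"/>
  <xs:element name="age" type="xs:integer"/>

  <!-- A simple element whose text must match a pattern (assumed: six digits) -->
  <xs:element name="postcode">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[0-9]{6}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

</xs:schema>
```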

Simple Type Definition

A simple type definition can be a restriction of some other simple type, a list or union of simple type definitions, or a built-in primitive datatype.

Example

<xs:simpleType name="farenheitWaterTemp">
  <xs:restriction base="xs:decimal">
    <xs:fractionDigits value="2"/>
    <xs:minExclusive value="0.00"/>
    <xs:maxExclusive value="100.00"/>
  </xs:restriction>
</xs:simpleType>

Here is an example of a simple type definition that describes water temperature. We define it by restricting the base type decimal, which is a built-in type. To restrict the base type, we use the restriction element, in which we specify the base type and further restrictions (facets) such as the minimum and maximum values.
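The other two ways of deriving a simple type, list and union, can be sketched in the same style. The type names temperatureList, unknownMarker and temperatureOrUnknown are hypothetical and serve only to illustrate the two constructs.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- List: a whitespace-separated sequence of decimals, e.g. "20.5 21 19.75" -->
  <xs:simpleType name="temperatureList">
    <xs:list itemType="xs:decimal"/>
  </xs:simpleType>

  <!-- A helper type that allows only the literal word "unknown" -->
  <xs:simpleType name="unknownMarker">
    <xs:restriction base="xs:string">
      <xs:enumeration value="unknown"/>
    </xs:restriction>
  </xs:simpleType>

  <!-- Union: a value is valid if it is valid for any one member type -->
  <xs:simpleType name="temperatureOrUnknown">
    <xs:union memberTypes="xs:decimal unknownMarker"/>
  </xs:simpleType>

</xs:schema>
```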

Complex Types

Complex types are used for describing complex elements.

Complex Elements

A complex element is an XML element that contains other elements and/or attributes.

There are four kinds of complex elements:

empty elements
elements that contain only other elements
elements that contain only text
elements that contain both other elements and text

Note: Each of these elements may (or must) contain attributes as well.

Examples of Complex XML Elements

Some examples of complex elements are given below.

A complex XML element, “product”, which is empty:

<product/>

A complex XML element, “employee”, which contains only other elements:

<employee>
  <lastname>Smith</lastname>
</employee>

A complex XML element, “food”, which contains only text:

<food type="dessert">Ice cream</food>

A complex XML element, “description”, which contains both elements and text:

<description>It happened on <date lang="norwegian">03.03.99</date>..</description>
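One possible way to declare the four kinds of complex element in a schema is sketched below. The pid attribute on product is an assumption, as is the choice of xs:string for every text value; the source shows only the instance fragments.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- 1. Empty element: no content model, only an (assumed) optional attribute -->
  <xs:element name="product">
    <xs:complexType>
      <xs:attribute name="pid" type="xs:positiveInteger"/>
    </xs:complexType>
  </xs:element>

  <!-- 2. Element that contains only other elements -->
  <xs:element name="employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="lastname" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- 3. Element that contains only text, plus an attribute:
          simple content extended with an attribute -->
  <xs:element name="food">
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base="xs:string">
          <xs:attribute name="type" type="xs:string"/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

  <!-- 4. Element that contains both text and elements: mixed content -->
  <xs:element name="description">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="date">
          <xs:complexType>
            <xs:simpleContent>
              <xs:extension base="xs:string">
                <xs:attribute name="lang" type="xs:string"/>
              </xs:extension>
            </xs:simpleContent>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

The text-only case shows why such an element still needs a complex type: a simple type cannot carry attributes, so the string content is wrapped in simpleContent and extended with the attribute.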

Complex Type Definition

In object-oriented terms, there are two mechanisms by which we can define a complex type: inheritance and composition. With inheritance, we restrict or extend a base type. Composition is achieved by using the model group component of the abstract model.

Inheritance:

Restriction: We restrict base type definitions.

Extension: We add something new.

Composition:

We use model groups to achieve composition.
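Composition through a named model group can be sketched as follows. The group nameGroup and the types authorType and editorType are hypothetical names chosen for illustration; the point is that one group of elements is defined once and reused inside several complex types.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- A named model group: defined once, reusable by reference -->
  <xs:group name="nameGroup">
    <xs:sequence>
      <xs:element name="firstname" type="xs:string"/>
      <xs:element name="lastname" type="xs:string"/>
    </xs:sequence>
  </xs:group>

  <!-- First complex type composed from the group plus one more element -->
  <xs:complexType name="authorType">
    <xs:sequence>
      <xs:group ref="nameGroup"/>
      <xs:element name="qualification" type="xs:string" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Second complex type reusing the same group -->
  <xs:complexType name="editorType">
    <xs:sequence>
      <xs:group ref="nameGroup"/>
      <xs:element name="publisher" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

</xs:schema>
```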

Each complex type definition is either a restriction of a complex type definition, an extension of a simple or complex type definition, or a restriction of the ur-type definition. The ur-type definition is a special type definition that can function as both a simple and a complex type definition.

Example

<xs:complexType name="personName">
  <xs:sequence>
    <xs:element name="title" minOccurs="0"/>
    … …
  </xs:sequence>
</xs:complexType>

<xs:complexType name="extendedName">
  <xs:complexContent>
    <xs:extension base="personName">
      <xs:sequence>
        <xs:element name="generation" minOccurs="0"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

We have already seen an example of restriction in the simple type definition, so this example illustrates inheritance by extension. Based on a complex type called personName, we define a new complex type using the extension element, adding a new element called generation.
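For comparison, inheritance by restriction of a complex type might look like the sketch below. Because the original personName example elides part of its content, the forename and surname elements and their xs:string types are assumptions, and simpleName is a hypothetical derived type.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Base type; forename and surname are assumed for illustration -->
  <xs:complexType name="personName">
    <xs:sequence>
      <xs:element name="title" type="xs:string" minOccurs="0"/>
      <xs:element name="forename" type="xs:string" maxOccurs="unbounded"/>
      <xs:element name="surname" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Restriction: the content model is restated in a narrower form.
       The optional title is dropped and exactly one forename is allowed. -->
  <xs:complexType name="simpleName">
    <xs:complexContent>
      <xs:restriction base="personName">
        <xs:sequence>
          <xs:element name="forename" type="xs:string" minOccurs="1" maxOccurs="1"/>
          <xs:element name="surname" type="xs:string"/>
        </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

</xs:schema>
```

Unlike extension, where only the new parts are written, a restriction must repeat the whole content model, and every instance valid against the restricted type must also be valid against the base type.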

Summary

This module explains that XML Schema is a flexible and powerful schema language whose syntax is XML itself. It introduces simple and complex elements and simple and complex type definitions, and it explores a variety of data types and the ability to extend the type system.

Web Links

www.w3schools.com/xml
cs.lamar.edu/faculty/osborne/5369/sp04/XMLschema.ppt
www.tutorialspoint.com/xml
www.cis.upenn.edu/~matuszek/cit597-2003/Lectures/30-schemas.ppt
https://www.w3.org/TR/xmlschema-1/#components