10 XML SCHEMA (PART II)

Dr M. Vijayalakshmi

XML Data Components

 

An XML file contains two kinds of data items: elements and attributes.

 

Elements:

 

<article mdate="2002-01-03" …>
  <editor>XXX</editor>
</article>

 

Elements form a hierarchical structure of matching open and close tag pairs. An element may contain nested elements, and it may carry attributes inside its open tag. Several elements may share the same name, but the order in which elements appear in the file matters.

 

 

Attributes:

 

mdate="2002-01-03"

Attributes are named values and are not hierarchical. Only one attribute with a given name can appear within an element, and the order of the attributes within the element does not matter.

“Simple” and “complex” elements

 

A “simple” element is one that contains text and nothing else. A simple element cannot have attributes and cannot contain other elements. Whether it may be empty depends on its type: xs:string accepts empty text, for example, while xs:integer does not.

The text can, however, be of many different types and may have various restrictions applied to it. An element that is not simple is called a “complex” element. A complex element may have attributes. A complex element may be empty, or it may contain text, other elements, or both text and other elements.

 

Simple element

 

A simple element is defined as

<xs:element name="name" type="type" />

where,

  • name is the name of the element
  • the most common values for type are xs:boolean, xs:integer, xs:date, xs:string, xs:decimal and xs:time

 

Other attributes a simple element declaration may have are “default” and “fixed”.

  • default="value" – the given value is used if no other value is specified
  • fixed="value" – the element always has the given value; no other value may be specified
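As a minimal sketch of these two attributes, the schema below declares two simple elements. The element names ‘color’ and ‘country’ and their values are assumptions chosen only for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical element: an empty <color/> in the instance is read as "red" -->
  <xs:element name="color" type="xs:string" default="red"/>

  <!-- Hypothetical element: the only text <country> may carry is "India" -->
  <xs:element name="country" type="xs:string" fixed="India"/>

</xs:schema>
```

A declaration may carry either default or fixed, but not both at the same time.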

    Defining an attribute

 

Attributes themselves are always declared as simple types.

An attribute is defined as

<xs:attribute name="name" type="type" />

where:

  • name and type are the same as for xs:element

Other attributes an attribute declaration may have are “default”, “fixed” and “use”.

  • default="value" – the given value is used if no other value is specified
  • fixed="value" – the attribute always has the given value; no other value may be specified
  • use="optional" – the attribute is not required (this is the default)
  • use="required" – the attribute must be present
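The sketch below shows how these options might look for the ‘article’ element from the start of this module. The attribute ‘lang’ and the choice of xs:date for ‘mdate’ are assumptions made for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="article">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="editor" type="xs:string"/>
      </xs:sequence>
      <!-- Must be present on every <article> -->
      <xs:attribute name="mdate" type="xs:date" use="required"/>
      <!-- Hypothetical attribute: optional, read as "en" when omitted -->
      <xs:attribute name="lang" type="xs:string" default="en"/>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Attribute declarations are placed after the model group (here xs:sequence) inside the complex type.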

    Restrictions, or “facets”

 

Restrictions are used to define acceptable values for XML elements or attributes. Restrictions on XML elements are called facets. The restriction element defines restrictions on a simpleType, simpleContent, or complexContent definition.

 

A restriction on a text value is written with the schema element <xs:restriction>, placed inside an <xs:simpleType>.

The general syntax is given below (the same form applies to xs:attribute):

<xs:element name="name">
  <xs:simpleType>
    <xs:restriction base="type">
      … the restrictions …
    </xs:restriction>
  </xs:simpleType>
</xs:element>

 

An example:

 

<xs:element name="age">
  <xs:simpleType>
    <xs:restriction base="xs:integer">
      <xs:minInclusive value="0"/>
      <xs:maxInclusive value="140"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

 

Restrictions on numbers

 

The following are some of the restrictions that can be applied on numbers.

  • minInclusive – the number must be ≥ the given value
  • minExclusive – the number must be > the given value
  • maxInclusive – the number must be ≤ the given value
  • maxExclusive – the number must be < the given value
  • totalDigits – the number must have no more than value digits in total
  • fractionDigits – the number must have no more than value digits after the decimal point
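A short sketch combining several numeric facets is given below. The element name ‘price’ and the chosen limits are assumptions for illustration only.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical element: a positive amount with at most two decimals -->
  <xs:element name="price">
    <xs:simpleType>
      <xs:restriction base="xs:decimal">
        <xs:minExclusive value="0"/>        <!-- 0 itself is rejected -->
        <xs:maxInclusive value="9999.99"/>  <!-- 10000 is rejected -->
        <xs:totalDigits value="6"/>         <!-- at most six digits in all -->
        <xs:fractionDigits value="2"/>      <!-- 12.345 is rejected -->
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

</xs:schema>
```

With these facets, values such as 12.5 and 9999.99 are accepted.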

     Restrictions on strings

 

The following are the restrictions that can be used with strings.

  • length – the string must contain exactly value characters
  • minLength – the string must contain at least value characters
  • maxLength – the string must contain no more than value characters
  • pattern – the value is a regular expression that the string must match
  • whiteSpace – not really a “restriction”; it tells the processor what to do with whitespace:
      – value="preserve" keeps all whitespace
      – value="replace" changes every whitespace character to a space
      – value="collapse" removes leading and trailing whitespace and replaces each sequence of whitespace with a single space
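The string facets can be sketched as follows. The element names ‘zip’ and ‘password’, the six-digit pattern and the length limits are assumptions for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical element: exactly six digits -->
  <xs:element name="zip">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:whiteSpace value="collapse"/>
        <xs:pattern value="[0-9]{6}"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

  <!-- Hypothetical element: between 8 and 20 characters long -->
  <xs:element name="password">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:minLength value="8"/>
        <xs:maxLength value="20"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

</xs:schema>
```

A pattern always has to match the whole value, so no ^ or $ anchors are needed.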

    Enumeration

 

An enumeration constraint restricts the content of an XML element (or the value of an attribute) to one of a fixed set of values.

 

Example:

 

<xs:element name="season">
  <xs:simpleType>
    <xs:restriction base="xs:string">
      <xs:enumeration value="Spring"/>
      <xs:enumeration value="Summer"/>
      <xs:enumeration value="Autumn"/>
      <xs:enumeration value="Fall"/>
      <xs:enumeration value="Winter"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>

 

In this example, the element ‘season’ may take only one of the values ‘Spring’, ‘Summer’, ‘Autumn’, ‘Fall’ and ‘Winter’. The values are case-sensitive.

    Complex elements

 

A complex element is defined with the schema element <xs:complexType>. The syntax is given below:

<xs:element name="name">
  <xs:complexType>
    … information about the complex type …
  </xs:complexType>
</xs:element>

 

Example:

 

In this example, we have defined a complex element ‘person’ to have two simple elements called ‘firstName’ and ‘lastName’.

 

<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" />
      <xs:element name="lastName" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

 

The element <xs:sequence> says that elements must occur in this order.

Note: Remember that attributes are always simple types.

 

Global and local definitions

 

The elements that are declared at the “top level” of a <schema> are available for use throughout the schema. They are considered as global definitions.

 

The elements declared within a <xs:complexType> are local to that type and are considered as local definitions.

 

Thus, in the definition of the element ‘person’ the elements ‘firstName’ and ‘lastName’ are only locally declared.

 

<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" />
      <xs:element name="lastName" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

 

Note, however, that the order of declarations at the “top level” of a <schema> does not determine the order of elements in the XML data document.

 

Declaration and Use

 

So far we have been talking about how to declare types but not about how to use them.

 

To use a type we have declared, give its name as the value of type="…". Here the value names a user-defined type that has already been declared with a name, for example with <xs:complexType name="person">, rather than one of the predefined types.

Examples:

<xs:element name="student" type="person"/>
<xs:element name="professor" type="person"/>

The example says that the elements ‘student’ and ‘professor’ are both of type ‘person’, which must already be declared as a named type at the top level of the schema.

Note: Scope is important here. We cannot use a type that is local to some other type.
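For type="person" to resolve, ‘person’ has to exist as a named type at the top level of the schema. A minimal sketch of such a schema, reusing the ‘firstName’ and ‘lastName’ elements from the earlier examples, could look like this:

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Named type at the top level: usable throughout the schema -->
  <xs:complexType name="person">
    <xs:sequence>
      <xs:element name="firstName" type="xs:string"/>
      <xs:element name="lastName" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Two different elements sharing the same type -->
  <xs:element name="student" type="person"/>
  <xs:element name="professor" type="person"/>

</xs:schema>
```

Because the type is declared once and only named in the element declarations, a change to ‘person’ applies to both elements.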

 

Model Groups

 

Model groups describe the child elements of a complex type, whether its content is element-only or mixed (elements and text). A model group is defined with one of the keywords ‘all’, ‘choice’ and ‘sequence’.

  • all – all of the elements specified must be there, but they may appear in any order
  • choice – exactly one of the elements specified must appear (with the default occurrence settings)
  • sequence – all of the elements specified must appear in the specified order
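With the default occurrence settings, an xs:choice group admits exactly one of its members. The sketch below illustrates this; the element names ‘contact’, ‘email’ and ‘phone’ are assumptions for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical element: holds either an <email> or a <phone>, not both -->
  <xs:element name="contact">
    <xs:complexType>
      <xs:choice>
        <xs:element name="email" type="xs:string"/>
        <xs:element name="phone" type="xs:string"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Adding minOccurs="0" to the xs:choice would make the whole group optional, and maxOccurs="unbounded" would allow any mix of the two elements.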

 

xs:sequence

 

We have already seen an example of a complex type whose elements must occur in a specific order. In this example the model group ‘sequence’ specifies that, within the ‘person’ element, ‘firstName’ and ‘lastName’ must appear in that order.

<xs:element name="person">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="firstName" type="xs:string" />
      <xs:element name="lastName" type="xs:string" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

 

xs:all

 

xs:all allows the elements to appear in any order, but by default every element must appear.

<xs:element name="person">
  <xs:complexType>
    <xs:all>
      <xs:element name="firstName" type="xs:string" />
      <xs:element name="lastName" type="xs:string" />
    </xs:all>
  </xs:complexType>
</xs:element>

Despite the name, a member of an xs:all group can be made optional, so that it occurs once or not at all.

We can use minOccurs="0" to specify that an element is optional (the default value is 1). In this context, in XML Schema 1.0, maxOccurs is always 1.
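An optional member of an xs:all group can be sketched as follows. The element ‘middleName’ is an assumption added for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="person">
    <xs:complexType>
      <xs:all>
        <xs:element name="firstName" type="xs:string"/>
        <!-- Hypothetical optional member: may be left out of the instance -->
        <xs:element name="middleName" type="xs:string" minOccurs="0"/>
        <xs:element name="lastName" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

An instance may list the children in any order, with or without ‘middleName’.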

 

Referencing

 

Once an element or attribute has been declared at the top level (with name="…"), we can refer to it from elsewhere in the schema with the attribute ref="…".

Below is an example to explain this.

<xs:element name="person">
  <xs:complexType>
    <xs:all>
      <xs:element name="firstName" type="xs:string" />
      <xs:element name="lastName" type="xs:string" />
    </xs:all>
  </xs:complexType>
</xs:element>

Inside the content model of another complex type, the element is then referenced with:

<xs:element ref="person"/>

In this example, the global element <person> is declared once and reused by reference wherever it is needed. Note that name and ref cannot appear together on the same xs:element. To give a differently named element such as <student> the same structure, declare a named type and use type="…" instead.
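A reference always points at a global element declaration and takes the place of name and type in the referring xs:element. The sketch below shows one way this might be used; the element ‘team’ is an assumption for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Global declaration: can be the target of ref -->
  <xs:element name="person">
    <xs:complexType>
      <xs:all>
        <xs:element name="firstName" type="xs:string"/>
        <xs:element name="lastName" type="xs:string"/>
      </xs:all>
    </xs:complexType>
  </xs:element>

  <!-- Hypothetical element: a team is one or more <person> elements -->
  <xs:element name="team">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="person" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```

Occurrence constraints such as maxOccurs are written on the referring element, not on the global declaration.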

 

Text element with attributes

 

If a text element has attributes, it is no longer a simple type but a complex type. Here, we use the schema element <xs:simpleContent> to define a text element that has attributes.

<xs:element name="population">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:integer">
        <xs:attribute name="year" type="xs:integer"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

In the above example, the element ‘population’ holds an integer as its text and has an attribute ‘year’, which is added through <xs:extension> inside <xs:simpleContent>.

 

Empty elements

 

Empty elements are defined as complex elements, so that they can carry attributes while having no content. An example of defining the type of an empty element is given below.

<xs:complexType name="counter">
  <xs:complexContent>
    <xs:restriction base="xs:anyType">
      <xs:attribute name="count" type="xs:integer"/>
    </xs:restriction>
  </xs:complexContent>
</xs:complexType>

Here, we define the empty type ‘counter’ with the schema element <xs:complexContent>. Restricting xs:anyType without listing any child elements leaves only the attribute ‘count’.

 

Mixed elements

 

Mixed elements may contain both text and elements. To define a mixed element, we add the attribute mixed="true" to the <xs:complexType> element. The text itself is not declared in the schema; it may appear anywhere between the child elements and is not validated.

<xs:complexType name="paragraph" mixed="true">
  <xs:sequence>
    <xs:element name="someName" type="xs:anyType"/>
  </xs:sequence>
</xs:complexType>

Here, the type ‘paragraph’ has been defined as a mixed complex type whose content is text together with the element ‘someName’.
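To see what mixed content looks like in a document, the sketch below declares an element of type ‘paragraph’ and shows a conforming instance in a comment. The element name ‘note’ and the sample text are assumptions for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:complexType name="paragraph" mixed="true">
    <xs:sequence>
      <xs:element name="someName" type="xs:anyType"/>
    </xs:sequence>
  </xs:complexType>

  <!-- Hypothetical element that uses the mixed type -->
  <xs:element name="note" type="paragraph"/>

  <!-- A conforming instance: text may surround the child element.
       <note>Please call <someName>Alice</someName> before noon.</note>
  -->

</xs:schema>
```

The child element ‘someName’ is still checked against the sequence; only the surrounding text is left unconstrained.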

 

Extensions

 

We can base a complex type on another complex type. The new type extends the other one by naming it in the attribute base="otherType".

<xs:complexType name="newType">
  <xs:complexContent>
    <xs:extension base="otherType">
      … new stuff …
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

 

Type Extensions

 

A third way of creating a complex type is to extend another complex type, much like inheritance in object-oriented programming.

In this example, we define the element ‘Employee’ to be of type ‘PersonInfoType’. ‘PersonInfoType’ is a complex type that extends ‘PersonNameType’, so it contains the name elements followed by the address elements.

<xs:element name="Employee" type="PersonInfoType"/>

<xs:complexType name="PersonNameType">
  <xs:sequence>
    <xs:element name="FirstName" type="xs:string"/>
    <xs:element name="LastName" type="xs:string"/>
  </xs:sequence>
</xs:complexType>

<xs:complexType name="PersonInfoType">
  <xs:complexContent>
    <xs:extension base="PersonNameType">
      <xs:sequence>
        <xs:element name="Address" type="xs:string"/>
        <xs:element name="City" type="xs:string"/>
        <xs:element name="Country" type="xs:string"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>

 

Type Extensions (use)

 

The instance document below demonstrates the use of a type that is an extension of another. The element is written as though all of its children were defined in a single type.

<Employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="TypeExtension.xsd">
  <FirstName>King</FirstName>
  <LastName>Arthur</LastName>
  <Address>Round Table</Address>
  <City>Camelot</City>
  <Country>England</Country>
</Employee>

    Simple Content in Complex Type

 

If a type contains only simple content (text and attributes), a <simpleContent> element can be put inside the <complexType>.

A <simpleContent> must contain either an <extension> or a <restriction>.

 

<xs:element name="dialog">
  <xs:complexType>
    <xs:simpleContent>
      <xs:extension base="xs:string">
        <xs:attribute name="speaker" type="xs:string" use="required"/>
      </xs:extension>
    </xs:simpleContent>
  </xs:complexType>
</xs:element>

 

Predefined string types:

 

Recall that a simple element is defined as:

 

<xs:element name="name" type="type" />

A few of the possible string types are given below.

  • xs:string – a string
  • xs:normalizedString – a string that doesn’t contain tabs, newlines, or carriage returns
  • xs:token – a string that doesn’t contain any whitespace other than single spaces between words

Allowable restrictions on strings are enumeration, length, maxLength, minLength, pattern and whiteSpace.

 

Predefined date and time types:

 

  • xs:date – a date in the format CCYY-MM-DD, for example 2002-11-05
  • xs:time – a time in the format hh:mm:ss (hours, minutes, seconds)
  • xs:dateTime – a date and time in the format CCYY-MM-DDThh:mm:ss
      – The T is part of the syntax.

Allowable restrictions on dates and times are enumeration, minInclusive, minExclusive, maxInclusive, maxExclusive, pattern and whiteSpace.
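The date and time types can be used directly or narrowed with the range facets, as in the sketch below. The element names ‘birthday’, ‘opensAt’ and ‘created’ and the date limits are assumptions for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Hypothetical element: a date from 1900-01-01 up to, but excluding, 2100-01-01 -->
  <xs:element name="birthday">
    <xs:simpleType>
      <xs:restriction base="xs:date">
        <xs:minInclusive value="1900-01-01"/>
        <xs:maxExclusive value="2100-01-01"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>

  <!-- Hypothetical element: a value such as 09:30:00 -->
  <xs:element name="opensAt" type="xs:time"/>

  <!-- Hypothetical element: a value such as 2002-11-05T09:30:00 -->
  <xs:element name="created" type="xs:dateTime"/>

</xs:schema>
```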

   Predefined numeric types:

   Some of the predefined numeric types are given below,

  • xs:decimal
  • xs:byte
  • xs:short
  • xs:int
  • xs:long
  • xs:positiveInteger
  • xs:negativeInteger
  • xs:nonPositiveInteger
  • xs:nonNegativeInteger

 

Allowable restrictions on numeric types are enumeration, minInclusive, minExclusive, maxInclusive, maxExclusive, fractionDigits, totalDigits, pattern and whiteSpace.

 

Types

 

Both elements and attributes have types, which are defined in the Schema. We can reuse types by giving them names.

 

For example, we define a schema Addr.xsd

 

Addr.xsd:

<xsd:element name="Address">
  <xsd:complexType>
    <xsd:sequence>
      <xsd:element name="Street" type="xsd:string"/>
      <xsd:element name="Apartment" type="xsd:string"/>
      <xsd:element name="Zip" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:element>

 

(or)

 

<xsd:complexType name="AddrType">
  <xsd:sequence>
    <xsd:element name="Street" type="xsd:string"/>
    <xsd:element name="Apartment" type="xsd:string"/>
    <xsd:element name="Zip" type="xsd:string"/>
  </xsd:sequence>
</xsd:complexType>

<xsd:element name="ShipAddress" type="AddrType"/>
<xsd:element name="BillAddress" type="AddrType"/>

   Complex Type Definition

    Composition

 

A model group is composed of a compositor (sequence, choice or all) and a list of particles, each of which can be an element declaration, a wildcard or another model group.

The model group schema component has the following properties:

{compositor}

One of all, choice or sequence.

{particles}

A list of particles

{annotation}

Optional. An annotation.
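The three kinds of particle can be seen together in a sketch like the one below, where a sequence contains an element declaration, a nested choice group and a wildcard. All element names here (‘order’, ‘id’, ‘pickup’, ‘delivery’) are assumptions for illustration.

```xml
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="order">
    <xs:complexType>
      <!-- Compositor: sequence -->
      <xs:sequence>
        <!-- Optional annotation of the model group -->
        <xs:annotation>
          <xs:documentation>An id, then a pickup or a delivery</xs:documentation>
        </xs:annotation>
        <!-- Particle 1: an element declaration -->
        <xs:element name="id" type="xs:integer"/>
        <!-- Particle 2: a nested model group -->
        <xs:choice>
          <xs:element name="pickup" type="xs:string"/>
          <xs:element name="delivery" type="xs:string"/>
        </xs:choice>
        <!-- Particle 3: a wildcard admitting elements from other namespaces -->
        <xs:any namespace="##other" processContents="lax"
                minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

</xs:schema>
```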

 

Annotations

 

Annotations provide human-targeted and machine-targeted documentation of schema components. They are, in effect, metadata.

 

The annotation schema component has the following properties:

{application information}

A sequence of element information items.

{user information}

A sequence of element information items.

{attributes}

A sequence of attribute information items.

 

The XML representation for an annotation schema component is an <annotation> element information item.

 

<xs:simpleType fn:note="special">
  <xs:annotation>
    <xs:documentation>A type for experts only</xs:documentation>
    <xs:appinfo>
      <fn:specialHandling>checkForPrimes</fn:specialHandling>
    </xs:appinfo>
  </xs:annotation>
  …
</xs:simpleType>

    Declarations

 

Example

 

<xs:element name="PurchaseOrder" type="PurchaseOrderType"/>

<xs:element name="gift">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="birthday" type="xs:date"/>
      <xs:element ref="PurchaseOrder"/>
    </xs:sequence>
  </xs:complexType>
</xs:element>

<xs:attribute name="age" type="xs:positiveInteger" use="required"/>

<xs:notation name="jpeg" public="image/jpeg" system="viewer.exe"/>

 

The example shows four separate declarations: the element ‘PurchaseOrder’ of type ‘PurchaseOrderType’, the complex element ‘gift’ (which contains a ‘birthday’ element and a reference to ‘PurchaseOrder’), the attribute ‘age’ and the notation ‘jpeg’.

 

Example Schema

 

An example to illustrate the concepts we have seen so far is given below.

 

<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="book" type="BookType"/>
  <xsd:complexType name="BookType">
    <xsd:sequence>
      <xsd:element name="title" type="xsd:string"/>
      <xsd:element name="author" type="PersonType"
                   minOccurs="1" maxOccurs="unbounded"/>
      <xsd:element name="publisher" type="xsd:anyType"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="PersonType">
    <xsd:sequence> … </xsd:sequence>
  </xsd:complexType>
</xsd:schema>

   Summary

 

This module has explained how to define simple elements and complex elements. It has also shown that a variety of data “facets” and “patterns” exist to impose domain constraints, and that types can be derived from one another in a manner similar to inheritance. Moreover, the module explains how to define advanced constraints such as “primary key” and “referential integrity”.

 
