Welcome to XML Session

Date of presentation : 16th September 2006

Primary Audience : Academy project folks.

Intended Audience : Developers who are beginning with XML and want to reach an intermediate level
of expertise.

1
Takeaways from the presentation.

• Evolution of XML.
• Ability to create a well-formed and valid XML document.
• Work with XSLT and XPath.
• Work with namespaces.
• Understand and work with SAX and DOM parsers.
• Understand and build schemas.
• Patterns for XML schemas.

2
Introduction

• What is XML? (Definition from the gurus)
• Extensible Markup Language (XML) is the latest buzzword in the industry. It is a rapidly
maturing technology with powerful real-world applications, particularly for the
management, display and organisation of data. It works together with related
technologies such as XSL (for display) and DOM (for programmatic access). Today XML is
an essential technology for anyone using markup languages on the web or internally.

• What is XML? Is it a language?
• Not really. It is a metalanguage used to define other markup languages.

• How do we understand XML?
• Through parsers and transformers.

• How do business applications start to use XML? Why XML?
• XML can first be adopted as a data format. It is universally understood and
context independent.

• Life of XML.
• XML is here to stay. It is backed by the World Wide Web Consortium (W3C).

3
Introduction Continued…

• What does an XML document look like?
• Sample shown.

• What makes up XML?
• XML version: simply 1.0.
• XML schemas.
• Namespaces.
• XPath.
• XSLT.
• Parsers.

• Practical applications.
• Data transfer objects (DTOs), client interface development (XHTML/SVG), process
definitions, deployment descriptors, and RPC, which evolved into SOAP.

4
Well-formed

• What makes up an XML document?
• Tags, text (mostly PCDATA) and elements.

• Rules for well-formedness.
– Every start-tag must have a matching end-tag.
– Tags must not overlap.
– An XML document has exactly one root element.
– Naming conventions must be followed.
– Names are case-sensitive.
– Unlike HTML, whitespace is preserved in XML.

• Attribute
• Attributes are simple name/value pairs associated with an element.

• Rules for valid attributes.
– Attributes must have a value, even if it is only an empty string (“”).
– Naming conventions must be followed.
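
The well-formedness rules above can be checked mechanically by any XML parser. The following is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The helper name is_well_formed and the sample documents are illustrative assumptions, not part of the original session.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents, one per rule from the slide.
samples = {
    "matching tags": "<note><to>Ann</to></note>",
    "missing end-tag": "<note><to>Ann</note>",
    "overlapping tags": "<b><i>text</b></i>",
    "two root elements": "<a/><b/>",
    "case mismatch": "<Note></note>",
    "attribute without value": "<input disabled/>",
    "empty attribute value": '<input disabled=""/>',
}

results = {label: is_well_formed(xml) for label, xml in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'not well-formed'}")
```

Only the first and last samples are accepted. Every other sample breaks exactly one rule, which makes it easy to see which rule the parser is enforcing.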

5
Well-formed continued…

• Why use attributes? Why not elements?
• Attributes should carry metadata that most applications may not need and that has no
meaning apart from the data it describes.

• Comments.
• <!-- do we need to do this every time? -->

• Empty elements.
• Do they make sense? An empty element such as <marker/> carries meaning through its presence or its attributes.

• XML declaration.
• The complete XML declaration is <?xml version='1.0' encoding='UTF-8' standalone='yes'?>

• What are processing instructions?
• An example is <?myprocessor sleep for a while?>

• Escape characters.
• Escaping is required to keep a document well-formed: characters such as < and & must be written as &lt; and &amp; in text. This is not as easy as it looks.

• CDATA section.
• <![CDATA[This is a life saver :->>>]]>

• Finally, errors.
– Errors
– Fatal errors
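
The escaping and CDATA points above can be sketched with Python's standard library. In this sketch the sample text, the element name code and the helper parses are assumptions made for illustration. xml.sax.saxutils.escape replaces &, < and > with their entity references.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = "if a < b && b > c"  # text that contains markup characters


def parses(text: str) -> bool:
    """Return True if the text is accepted as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Putting the raw text straight into an element breaks well-formedness.
unescaped_ok = parses(f"<code>{raw}</code>")

# Option 1: escape the markup characters as entity references.
escaped = escape(raw)
via_entities = ET.fromstring(f"<code>{escaped}</code>").text

# Option 2: wrap the raw text in a CDATA section.
via_cdata = ET.fromstring(f"<code><![CDATA[{raw}]]></code>").text

print(unescaped_ok)
print(escaped)
print(via_entities == raw, via_cdata == raw)
```

Both options hand the same text back to the application. CDATA is convenient for long runs of markup characters, but the text inside it must not contain the sequence ]]>.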

6
XSLT and XPath

• What do we need to perform a transformation?
– XML source document.
– Transformation engine.
– XSLT stylesheet.

• What is XSLT?
• An XSLT stylesheet is a set of templates that define the transformation rules and the target
format. XSLT is part of the XSL family of languages.

• How does XSLT work?
• XSLT defines stylesheet templates. For example
<xsl:template match='myElement'>MyTransformedElement</xsl:template>.

• Let's make it a little more intelligent.
• <xsl:template match='myElement'><xsl:value-of select='.'/></xsl:template>

• Associating a stylesheet with an XML document.
• This can be done using <?xml-stylesheet type='text/xsl' href='stylesheet.xml'?>

• Why use XSLT?
• It is declarative programming: you describe the result rather than the steps that produce it.

7
XPath

• What's a node?
• XPath uses the term “node” to refer to any part of a document. It may be
an element, an attribute or another item such as text or a comment. A “node-set” is a collection of nodes.

• Basics of XPath.
– The document root is not the root element. It is the root of the document hierarchy.
– XPath location paths can be relative or absolute.
– The @ symbol is used to address attributes in the XML.
– // selects descendants at any depth. For example, //SECT returns every SECT element in the
document, regardless of the current context node.
– Location paths can be filtered with predicates in []. For example, //PARA[SECT] selects PARA
elements that have a SECT child, and //PARA[@sensitive = 'high'] selects by attribute value.

• Complex example :
select the title of each document where article/sect/para/sensitivity is high and
article/sect/sect/para/sensitivity is high.
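
The XPath basics above can be tried out with Python's xml.etree.ElementTree, which implements only a subset of XPath. In that subset, paths are written relative to an element (starting with . rather than /) and attributes are read with get() rather than selected with @. The sample document below is a hypothetical one, built to match the SECT and PARA names used on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical document using the element names from the slide.
doc = ET.fromstring("""
<ARTICLE>
  <TITLE>Quarterly report</TITLE>
  <SECT>
    <PARA sensitive="high">Salary data</PARA>
    <PARA sensitive="low">Office hours</PARA>
    <SECT>
      <PARA sensitive="high">Merger plans</PARA>
    </SECT>
  </SECT>
</ARTICLE>
""")

# Recursive search: every SECT element at any depth.
all_sects = doc.findall(".//SECT")

# Predicate on an attribute value.
high = [p.text for p in doc.findall(".//PARA[@sensitive='high']")]

# Predicate on the existence of a child element: SECT elements containing a SECT.
nested_parents = doc.findall(".//SECT[SECT]")

# Relative path: only the PARA children of the first top-level SECT.
outer = doc.find("SECT")
levels = [p.get("sensitive") for p in outer.findall("PARA")]

print(len(all_sects), high, len(nested_parents), levels)
```

A full XPath 1.0 engine would also accept absolute paths such as //PARA[@sensitive = 'high'] exactly as written on the slide.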

8
XPath Functions

• Node functions and node tests

– name()
– node()
– processing-instruction()
– comment()
– text()

9
XPath Functions

• Positional functions

– position()
– last()
– count()

• Numeric functions

– number()
– sum()

• Boolean functions

– boolean()
– not()
– true()
– false()

10
XPath Functions

• String functions

– string()
– string-length()
– concat()
– contains()
– starts-with()
– substring()
– substring-after()
– substring-before()
– translate()

11
XSLT - Detail

• Conditional template matching with multiple templates.
– <xsl:template match="/order"> applies when the root element is order.
– <xsl:template match="/"> applies to the document root of any document.

• Default (built-in) templates.
– <xsl:template match="*|/"> handles the root and any element for which no template is defined; it applies templates to the child nodes.
– <xsl:template match="text()|@*"> copies the value of text and attribute nodes to the output.

• What's a context node?
• Whichever node a template matches becomes the context node within that template.

12
XSLT Elements

• <xsl:stylesheet> is the root element of the stylesheet and is used like this:
– <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

• <xsl:template> defines the templates that make up an XSLT stylesheet. Each template is the
entry point for a specific pattern, or for the root.

• <xsl:apply-templates> is used from within a template to call other templates.

• <xsl:value-of> evaluates the XPath expression in its select attribute relative to the context
node and inserts the resulting value into the result tree.

• <xsl:output> specifies the output method and gives finer control over how the output
is written.
– <xsl:output method="xml" version="1.0" encoding="UTF-8" standalone="yes"/>

• <xsl:element> allows us to create an element dynamically.
– <xsl:element name="myNewElement">Our value</xsl:element>

13
XSLT Elements

• <xsl:attribute> can be used to add an attribute dynamically.
– <xsl:attribute name="salutation">Mr.</xsl:attribute>

• <xsl:text> can be used to insert text into the result tree.

• Conditional XSL elements.

• <xsl:if> evaluates the expression in the test attribute and, if it is true, the
contents of the element are processed.
– Example : <xsl:if test="age[. &lt; 20]">Boy</xsl:if>

• <xsl:choose> selects one of several alternatives using <xsl:when> and <xsl:otherwise>.

• <xsl:for-each> processes its contents once for each node in the selected node-set.

14
Namespaces

• Why do we need namespaces?
– Namespaces are analogous to packages in a programming language: they prevent name clashes between vocabularies.

• Could this be solved using a simple prefix?
– Is this feasible? Does it really solve the problem?

• How do namespaces solve the problem?
– A namespace splits the name of a node into two parts: the local name and the
namespace name, a globally unique URI.

• Adding namespaces to the document.
– <stu:student xmlns:stu="http://www.mycollege/student">

• Default namespace
– <student xmlns="http://www.mycollege/student">

• Attributes and namespaces.
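
The namespace points above can be sketched with Python's xml.etree.ElementTree, which reports every element name as {namespace URI}local-name. The id attribute, the name child element and the s prefix used for searching are assumptions added for illustration; the namespace URI is the one from the slide.

```python
import xml.etree.ElementTree as ET

NS = "http://www.mycollege/student"

# The same document written with a prefix and with a default namespace.
prefixed = ET.fromstring(
    f'<stu:student xmlns:stu="{NS}" id="7"><stu:name>Ann</stu:name></stu:student>'
)
defaulted = ET.fromstring(
    f'<student xmlns="{NS}" id="7"><name>Ann</name></student>'
)

# The prefix is only shorthand: both parse to the same expanded name.
same_name = prefixed.tag == defaulted.tag == f"{{{NS}}}student"

# Unprefixed attributes are in no namespace, even under a default namespace.
attribute_names = list(defaulted.attrib)

# When searching, any prefix may be mapped to the namespace URI.
search_ns = {"s": NS}
found = [doc.find("s:name", search_ns).text for doc in (prefixed, defaulted)]

print(prefixed.tag, same_name, attribute_names, found)
```

This shows why the prefix alone does not solve the naming problem: the parser compares namespace URIs, and the prefix can differ from document to document.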

15
Schemas

• Properties of Schema.
– Syntax: XML
– Tools: any XML tools
– DOM support: yes
– Content models: strong; supports mixed content and can specify exact occurrences
– Datatypes: strong; supports all the commonly used datatypes
– Name scope: global and local name scope
– Inheritance: yes
– Extensibility: yes
– Multiple vocabularies: yes, supported with multiple schemas and namespaces
– Dynamic schemas: yes

16
Schemas

• Datatypes
– Primitive datatypes
– Derived datatypes

• Primitive types can be used for element or attribute values; they don’t have
child elements or attributes. Primitive types are built in.

• A derived datatype is one that is defined in terms of existing datatypes.
Derived datatypes can have attributes, child elements or mixed content.

17
Schemas

• List of primitive types.

– string
– boolean
– float
– double
– decimal
– timeDuration
– recurringDuration
– binary
– uriReference
– ID
– IDREF
– ENTITY
– NOTATION
– QName

18
Schemas

• Built-in derived datatypes

– language (for example en-US, en-GB)
– Name (a legal XML 1.0 name)
– NCName
– integer
– negativeInteger
– nonNegativeInteger
– nonPositiveInteger
– byte
– short
– int
– long
– unsignedByte
– unsignedShort
– unsignedLong
– year
– month
– century
– date
– recurringDate
– recurringDay
– time
– timeInstant
– timePeriod
– IDREFS
– NMTOKEN
– NMTOKENS
– ENTITIES

19
Schemas

• Datatypes fall into one of two categories
– Atomic datatype
– List datatype

• Atomic datatype.
– An atomic type has a value that cannot be divided any further.
• Example : <specialDay>10-10-2001</specialDay>
<account>10103453</account>

• List datatype
– A list type has a value that consists of a finite-length sequence of atomic values.
• Example : <simpleType name="height" base="decimal" derivedBy="list"/>
<StudentHeights>4 4.5 5 5.5 6 6.5</StudentHeights>
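
As a rough illustration of the list datatype above, the sketch below reads the example element and splits its content into atomic decimal values. This is plain Python parsing, not schema validation. The element name StudentHeights and its values follow the slide; the use of decimal.Decimal for the atomic type is an assumption.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

element = ET.fromstring("<StudentHeights>4 4.5 5 5.5 6 6.5</StudentHeights>")

# A list value is a whitespace-separated sequence of atomic literals.
heights = [Decimal(item) for item in element.text.split()]

print(len(heights), max(heights), sum(heights))
```

Each item is a complete decimal value on its own, which is what distinguishes a list type from an atomic type such as the account number in the earlier example.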

20
Aspects of Datatype

• All datatypes consist of three parts:
– A lexical space: the set of literal strings that represent values.
– A value space: the set of distinct values, where each value is denoted by one or
more literals in the datatype's lexical space.
– A set of facets: properties of the value space and of its individual values.

• Example : In the <Title> element the lexical and value representations are the
same.
• In the case of a number the lexical and value representations are different. The
lexical representation is the string “12345” and the value it denotes is
equal to 12345.00.
• Facets are either fundamental facets, such as ordering (A > B), or
constraining facets, such as minInclusive = 3.
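
The difference between lexical space and value space can be sketched in Python with decimal.Decimal standing in for the schema decimal type. This is an analogy rather than a schema processor, and the helper satisfies_min_inclusive is a hypothetical name that mimics the minInclusive facet.

```python
from decimal import Decimal

# Three different literals in the lexical space...
literals = ["12345", "12345.00", "012345.0"]

# ...that all denote the same single value in the value space.
values = {Decimal(literal) for literal in literals}


def satisfies_min_inclusive(literal: str, minimum: Decimal) -> bool:
    """Mimic the minInclusive facet: compare values, not strings."""
    return Decimal(literal) >= minimum


checks = {lit: satisfies_min_inclusive(lit, Decimal(3)) for lit in ["3.0", "2.99", "10"]}

print(len(set(literals)), len(values), checks)
```

The comparison has to happen in the value space: as strings, "10" sorts before "3.0", but as values 10 satisfies minInclusive = 3.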

21
• Constraining facets.
– length, minLength, maxLength
– pattern
– enumeration
– minInclusive, minExclusive, maxInclusive, maxExclusive
– precision, scale
– encoding
– duration, period

22
Schemas

• Associating schemas with XML documents
– xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:noNamespaceSchemaLocation="http://mysite/myschema.XSD"

• Multiple schemas within one XML document.
– xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" with an xsi:schemaLocation attribute.
Its value is a whitespace-separated list of pairs, each a namespace URI followed by the location
of its schema, such as http://mysite/myschema.XSD and http://mysite/myotherschema.XSD.

• Declaring a schema
– <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

23
Datatype Definitions

• Simple type definitions
– Simple type definitions are how derived datatypes are created, including those that are built
into the schema specification.

• Complex type definitions
– Complex type definitions are primarily used to describe content models.

24
Schemas

• Simple type definitions
– A simple type definition is a set of constraints on the value space and the lexical space
of a datatype. Constraints are either restrictions or the specification of a list type
that is constrained by some other simple type.

• Syntax
– name
– base – optional
– abstract – boolean – optional
– derivedBy – (list | restriction) – optional (default is restriction)

• Example
<simpleType name="course" base="xsd:string">
  <minLength value="3"/>
  <maxLength value="25"/>
</simpleType>
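
To make the effect of the two facets in the course example concrete, here is a small hand-written check in Python. It is only a sketch of what a schema validator would enforce for this one type; the function name is_valid_course is hypothetical, and the limits 3 and 25 are taken from the example.

```python
def is_valid_course(value: str, min_length: int = 3, max_length: int = 25) -> bool:
    """Apply the minLength and maxLength facets of the 'course' type."""
    return min_length <= len(value) <= max_length


# Hypothetical candidate values for a <course> element.
candidates = ["DB", "XML", "Advanced XML Schema Design", "x" * 25]
verdicts = {name: is_valid_course(name) for name in candidates}

for name, ok in verdicts.items():
    print(f"{name!r}: {'valid' if ok else 'invalid'}")
```

In practice the schema processor performs this check, so the constraint lives in the schema rather than in every application that reads the document.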

25
Schema

• Example of derivation by list
– <simpleType name="ListofNOS" base="xsd:integer" derivedBy="list"/>

– <ListofNOS>1 5 10 15 20 25</ListofNOS>

26
Schema

• Element declaration
– name
– ref
– type
– minOccurs
– maxOccurs
– default
– fixed
– id
– abstract
– block
– equivClass
– final
– form
– nullable

• Example
– <xsd:complexType name="MyElement" base="xsd:string" derivedBy="restriction"/>

• Attribute declaration
– name
– ref
– type
– use : [default / fixed / optional / required / prohibited]
– value
– id
– form = [qualified or unqualified]

• Attribute groups

• Annotation
– appInfo
– documentation (not a comment)

27
Schema

• Complex type definition

• Syntax
– name – optional
– base – optional
– abstract – boolean – optional
– derivedBy – (extension | restriction)
– content – (elementOnly | empty | mixed | textOnly)
– block – “” or (#all | (extension | restriction)) – optional
– final – “” or (#all | (extension | restriction)) – optional
• Cardinality
– minOccurs, maxOccurs
• <choice> and <sequence>
• <all> specifies which child elements may be present, in no particular
order.
• <any> allows any content from a given namespace, such as other XML
or HTML.

28
Schemas

• XSD Patterns

• Catch-All Element
– Users need to be able to insert marked-up text into the document that the document
designer cannot foresee. For example, it is often necessary to have some
presentation-specific markup inside a document. If this unexpected markup is spread throughout
the document, then processors might have a hard time dealing with it.

• Collection Element
– Multiple elements of the same element type need to appear in the document as siblings
of each other. Often metadata about the container needs to be expressed as well. The
Collection Element is a logical place to put this metadata. Sometimes the elements
need to be grouped into different categories. Multiple Collection Elements can appear in
the document, with each container holding elements from one of the subcategories.

• Domain Element
– Every document has a domain, and every domain has unique concepts.

29
Schemas

• Extensible Content Model


– A mechanism which allows additional elements to be added into existing content
models. The designer of the document can add a mechanism to allow the author of a
document instance to extend an element definition from the document type.

30
