www.krengeltech.com

Intro to XPath


What is XPath?


The term XPath describes the location of an entity in an XML document. You can think of an entity as being one of the things that make up an XML document (element tags, element content, attributes, and the like).

XPath’s notation looks similar to an RPG qualified data structure or to the path of a document on your desktop's C:\ drive. Take the following XML document:

  <PostAdr residential="true">
    <name title="Mr.">
      <first>Aaron</first>
      <last>Bartell</last>
    </name>
    <street>123 Center Rd</street>
    <cty>Mankato</cty>
    <state>MN</state>
    <zip>56001</zip>
    <phone>123-123-1234</phone>
    <phone>321-321-4321</phone>
  </PostAdr>


In order to verbally describe the location of the <first> element in a qualified manner, we could say, “<first> is a child of <name> and <name> is a child of <PostAdr>.” For use in a program, the XPath describing that relationship would be:


  /PostAdr/name/first


The XPath /PostAdr/zip refers to the <zip> tag. An XPath of /PostAdr/name@title refers to the attribute title in <name>. More on the usage of the @ symbol shortly. An RPG programmer will use XPaths to tell an XML parser which events should generate notifications as the parser crawls through the document.

Defining a DOM XPath

XPath values must take the following format:


  /element-name[/element-name...][/|@attribute-name]


In other words, XPath values must follow these rules:

  1. An XPath must start with a slash
  2. A slash must appear between element names
  3. An XPath must not contain any whitespace
  4. An XPath must end with a slash (if specifying an element) OR an @-symbol followed by an attribute name

Examples of valid XPath names are:


  /PostAdr/
  /PostAdr/name/first/
  /PostAdr/name@title


Examples of invalid XPath names are:


  /PostAdr              (no trailing slash)
  /PostAdr / name/first (no trailing slash and whitespace exists)
  /PostAdr/name/@title  (slash before @ symbol)
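
The four rules above can be checked mechanically. The following is a minimal Python sketch, not part of the RPG toolkit: the function name is_valid_xpath and the regular expression are assumptions made for illustration, derived only from the rules and examples listed above.

```python
import re

# Hypothetical helper (not an RXS API): encodes the four rules above.
#   (/name)+      one or more element names, each preceded by a slash
#   (/|@attr)     then either a trailing slash or @ plus an attribute name
# Names may not contain whitespace, slashes, or @ symbols.
_XPATH_RE = re.compile(r"(/[^\s/@]+)+(/|@[^\s/@]+)")

def is_valid_xpath(xpath: str) -> bool:
    """Return True if xpath follows the format described above."""
    return _XPATH_RE.fullmatch(xpath) is not None

for candidate in ["/PostAdr/", "/PostAdr/name@title", "/PostAdr", "/PostAdr/name/@title"]:
    print(candidate, is_valid_xpath(candidate))
```

The first two candidates are accepted and the last two are rejected, matching the valid and invalid examples above.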


Additionally, we may refer to the components of an XPath as XPath 'segments' – these may consist of a single node, or they may consist of multiple nodes. For instance, the following XPath:


  /PostAdr/phone@type


consists of several nodes:


  /PostAdr/
  /phone/
  @type
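
As an illustration of how an XPath breaks down into single-node segments, here is a small Python sketch. The helper name split_segments is hypothetical and is not something the RPG toolkit provides; it simply wraps each element name in slashes and keeps a final attribute with its leading @ symbol.

```python
def split_segments(xpath: str) -> list[str]:
    """Split an XPath such as '/PostAdr/phone@type' into single-node segments."""
    # Separate the optional attribute part from the element path.
    path, at_sign, attribute = xpath.partition("@")
    # Each non-empty element name becomes its own '/name/' segment.
    segments = ["/" + name + "/" for name in path.split("/") if name]
    if at_sign:
        segments.append("@" + attribute)
    return segments

print(split_segments("/PostAdr/phone@type"))
# prints ['/PostAdr/', '/phone/', '@type']
```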


When calling RXS_DOMGetData or RXS_DOMGetDataCount, for instance, each alternating character parameter (parameters 1, 3, 5 etc.) would consist of subsequent XPath segments, each including their own leading and trailing slashes (except that a final attribute segment would include only a leading @ symbol).

Therefore, we may process the above 3-node XPath in several ways:


     RXS_DOMgetData( '/PostAdr/phone/' : 1 : '@type' )
     RXS_DOMgetData( '/PostAdr/' : 1 : '/phone/' : 2 : '@type' )
     RXS_DOMgetData( '/PostAdr/phone@type' )


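For example, the RXS_DOMgetData procedure could be used as follows. Note that the name lookups below assume a variant of the XML at the top of this document that contains two <name> elements (the first with <first>Mickey</first>, the second with <first>Minnie</first> and title="Ms."); against the sample exactly as printed, the first <first> element would return 'Aaron' and there is no second <name> element: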


     RXS_DOMgetData( '/PostAdr@residential' )             returns 'true'
     RXS_DOMgetData( '/PostAdr/name/' : 1 : '/first/' )   returns 'Mickey'
     RXS_DOMgetData( '/PostAdr/name/' : 2 : '/first/' )   returns 'Minnie'
     RXS_DOMgetData( '/PostAdr/name/' : 2 : '@title' )    returns 'Ms.'
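
To make the alternating segment/index convention concrete, the sketch below emulates it in Python with the standard xml.etree.ElementTree module. The function get_data is a hypothetical stand-in written for this illustration; it is not the RXS_DOMgetData implementation and makes no claim about how that procedure behaves on errors. It is run against the sample XML exactly as printed at the top of this document.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<PostAdr residential="true">
  <name title="Mr."><first>Aaron</first><last>Bartell</last></name>
  <street>123 Center Rd</street>
  <phone>123-123-1234</phone>
  <phone>321-321-4321</phone>
</PostAdr>"""

def get_data(root, *args):
    """Hypothetical emulation: args alternate XPath segments (str) and 1-based indexes (int)."""
    current = None   # element selected so far (None means "above the root")
    matches = None   # elements matching the latest name, awaiting an index
    attr = None      # attribute name, if the final segment is '@name'

    def pick(index=1):
        # Select one of the pending matches; index 1 is the default.
        nonlocal current, matches
        current, matches = matches[index - 1], None

    for arg in args:
        if isinstance(arg, int):
            pick(arg)
            continue
        path, _, attr_name = arg.partition("@")
        for name in filter(None, path.split("/")):
            if matches is not None:
                pick()
            if current is None:
                matches = [root] if root.tag == name else []
            else:
                matches = current.findall(name)
        attr = attr_name or attr
    if matches is not None:
        pick()
    return current.get(attr) if attr else (current.text or "")

root = ET.fromstring(SAMPLE)
print(get_data(root, "/PostAdr@residential"))           # true
print(get_data(root, "/PostAdr/name/", 1, "/first/"))   # Aaron
print(get_data(root, "/PostAdr/phone/", 2))             # 321-321-4321
print(get_data(root, "/PostAdr/name/", 1, "@title"))    # Mr.
```

In this sketch a segment with no explicit index defaults to the first match, and an index beyond the available matches raises an IndexError.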


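The following code could be used to list all phone numbers. It assumes that each <phone> element carries a type attribute (type="phone" and type="fax" in this example), which the sample XML at the top of this document does not include: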


     nbrPhones = RXS_DOMgetDataCount('/PostAdr/phone/');
     for idx = 1 to nbrPhones;
       // Build "<number> (<type>)" for the idx-th <phone> element
       string = RXS_DOMgetData( '/PostAdr/phone/' : idx ) +
                ' (' +
                RXS_DOMgetData( '/PostAdr/phone/' : idx : '@type' ) +
                ')';
       dsply string;
     endfor;


which would DSPLY the following:


  '123-123-1234 (phone)'
  '321-321-4321 (fax)'
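
For readers who want to try the same count-then-loop pattern outside of RPG, here is a rough Python equivalent using the standard xml.etree.ElementTree module. It is only a sketch: the XML below is an assumed variant of the sample document in which each <phone> element has a type attribute, and none of the names used are RXS APIs.

```python
import xml.etree.ElementTree as ET

# Assumed variant of the sample document: the type attributes are
# NOT present in the XML shown at the top of this page.
XML = """<PostAdr residential="true">
  <phone type="phone">123-123-1234</phone>
  <phone type="fax">321-321-4321</phone>
</PostAdr>"""

root = ET.fromstring(XML)
phones = root.findall("phone")   # all <phone> children of <PostAdr>
nbr_phones = len(phones)         # plays the role of the count call

lines = []
for idx in range(1, nbr_phones + 1):   # 1-based, like the RPG loop
    phone = phones[idx - 1]
    lines.append(f"{phone.text} ({phone.get('type')})")

for line in lines:
    print(line)
```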