version 11.3 (Modified)
4D includes a set of commands used for parsing objects containing XML (eXtensible Markup Language) data.
About the XML language
XML is a data exchange standard. It is based on the use of tags and enables a precise description of both the data exchanged and their structure. XML files are plain text files; their content is parsed by the applications importing the data. Many applications now support this format.
For more information about XML, refer, for instance, to the http://xml.org and http://www.w3.org sites.
For XML support, 4D uses a library named Xerces.dll developed by the Apache Software Foundation. 4D supports XML version 1.0.
Note: 4D allows direct importing and exporting of data in XML format using the import/export editor.
DOM and SAX
The commands of this theme are prefixed DOM. 4D offers two separate sets of XML commands, prefixed DOM and SAX. DOM (Document Object Model) and SAX (Simple API for XML) are two different parsing modes for XML documents.
The DOM mode parses an XML source and builds its structure (its "tree") in memory. Because of this, access to each element of the source is extremely fast. However, since the entire tree structure is stored in memory, the processing of large XML documents may lead to the memory capacity being exceeded and thus provoke errors.
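The DOM behavior described above can be illustrated outside 4D. The following is a minimal sketch using Python's standard xml.dom.minidom module, not the 4D DOM commands; the sample source and the variable names are assumptions made for illustration only.

```python
from xml.dom.minidom import parseString

# Hypothetical sample source, used only for this illustration.
SOURCE = (
    "<RootElement><Elem1>"
    "<Elem2>aaa</Elem2><Elem2>bbb</Elem2>"
    "</Elem1></RootElement>"
)

# Parsing builds the complete tree in memory before any element is read.
document = parseString(SOURCE)
root = document.documentElement
root_name = root.tagName

# Once the tree exists, any node can be reached directly, in any order.
elem2_nodes = root.getElementsByTagName("Elem2")
values = [node.firstChild.data for node in elem2_nodes]
last_parent = elem2_nodes[-1].parentNode.tagName

print(root_name, values, last_parent)

# The whole tree stays in memory until it is released explicitly.
document.unlink()
```

Because every node is held in memory at once, the memory footprint of this approach grows with the size of the document.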
The SAX mode does not build a tree structure in memory. In this mode, "events" (such as the start and end of an element) are generated when parsing the source. This mode lets you parse XML documents of any size, regardless of the amount of memory available. The SAX commands are grouped together in the "XML SAX" theme. For more information, please refer to the Overview of XML SAX Commands section.
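The event-based behavior of the SAX mode can be sketched as follows. This example uses Python's standard xml.sax module rather than the 4D SAX commands; the EventRecorder class and the sample source are hypothetical names introduced for illustration.

```python
import xml.sax

# Hypothetical sample source, used only for this illustration.
SOURCE = (
    b"<RootElement><Elem1>"
    b"<Elem2>aaa</Elem2><Elem2>bbb</Elem2>"
    b"</Elem1></RootElement>"
)


class EventRecorder(xml.sax.ContentHandler):
    """Records each parsing event instead of building a tree."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # A parser may deliver the text of one element in several chunks.
        if content.strip():
            self.events.append(("text", content))


recorder = EventRecorder()
xml.sax.parseString(SOURCE, recorder)

for event in recorder.events:
    print(event)
```

The handler only sees one event at a time, so nothing obliges the program to keep earlier parts of the document in memory.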
For more information on XML standards, consult the following sites: http://www.saxproject.org/?selected=event and http://www.w3schools.com/xml/.
Creating, opening and closing XML documents via DOM
Objects created, modified or parsed by the 4D DOM commands can be text, URLs, documents or BLOBs. The DOM commands used for opening XML objects in 4D are DOM Parse XML source and DOM Parse XML variable.
Many commands then let you read, parse and write the elements and attributes. Errors are retrieved using the GET XML ERROR command (common to both the DOM and SAX modes).
Finally, the DOM CLOSE XML command closes the source.
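The open, read, check for errors, then close sequence described above follows a pattern common to DOM libraries. The sketch below shows that pattern with Python's standard xml.dom.minidom module; it is an analogy, not the 4D API, and read_root_name is a hypothetical helper introduced for illustration.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError


def read_root_name(source):
    """Return (root tag name, None) on success or (None, error message)."""
    document = None
    try:
        document = parseString(source)  # open: parse the source into a tree
        return document.documentElement.tagName, None  # read from the tree
    except ExpatError as error:
        return None, str(error)  # retrieve the parsing error
    finally:
        if document is not None:
            document.unlink()  # close: release the tree in every case


good_name, good_error = read_root_name("<RootElement><Elem1/></RootElement>")
bad_name, bad_error = read_root_name("<RootElement><Elem1></RootElement>")

print(good_name, good_error)
print(bad_name, bad_error)
```

Releasing the tree in a finally clause mirrors the recommendation to always close the source once processing is finished, whether or not an error occurred.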
Use of XPath notation (DOM)
Three XML DOM commands (DOM Create XML element, DOM Find XML element and DOM SET XML ELEMENT VALUE) accept XPath notation for accessing XML elements.
XPath notation comes from the XPath language, designed to navigate within XML structures. It lets you designate elements directly within an XML structure using a "pathname" type syntax, without necessarily having to indicate the complete pathname in order to reach them. For example, given the following structure:
<RootElement>
   <Elem1>
      <Elem2>
         <Elem3 Font="Verdana" Size="10"> </Elem3>
      </Elem2>
   </Elem1>
</RootElement>
XPath notation allows you to access element 3 using the /RootElement/Elem1/Elem2/Elem3 syntax.
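As a point of comparison, the same path lookup can be sketched with Python's standard xml.etree.ElementTree module. This is not the 4D DOM Find XML element command. Two assumptions apply: ElementTree implements only a subset of XPath and resolves paths relative to the root element, so the leading /RootElement step is omitted, and the attribute values are quoted, as well-formed XML requires.

```python
import xml.etree.ElementTree as ET

SOURCE = """
<RootElement>
   <Elem1>
      <Elem2>
         <Elem3 Font="Verdana" Size="10"> </Elem3>
      </Elem2>
   </Elem1>
</RootElement>
""".strip()

root = ET.fromstring(SOURCE)

# Equivalent of /RootElement/Elem1/Elem2/Elem3, expressed relative to the root.
elem3 = root.find("Elem1/Elem2/Elem3")

print(elem3.tag, elem3.attrib)
```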
4D also accepts indexed XPath elements using the Element[ElementNum] syntax. For example, given the following structure:
<RootElement>
   <Elem1>
      <Elem2>aaa</Elem2>
      <Elem2>bbb</Elem2>
      <Elem2>ccc</Elem2>
   </Elem1>
</RootElement>
XPath notation allows you to access the "ccc" value using the /RootElement/Elem1/Elem2[3] syntax.
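The indexed form can be tried in the same way. The sketch below again uses Python's standard xml.etree.ElementTree module as a stand-in for the 4D commands; as in XPath itself, positions are counted from 1, and the path is written relative to the root element.

```python
import xml.etree.ElementTree as ET

SOURCE = (
    "<RootElement><Elem1>"
    "<Elem2>aaa</Elem2><Elem2>bbb</Elem2><Elem2>ccc</Elem2>"
    "</Elem1></RootElement>"
)

root = ET.fromstring(SOURCE)

# Equivalent of /RootElement/Elem1/Elem2[3]: the third Elem2 child of Elem1.
third = root.find("Elem1/Elem2[3]")
first = root.find("Elem1/Elem2[1]")

print(first.text, third.text)
```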
For an illustration of XPath notation, please refer to the examples in the DOM Create XML element and DOM Find XML element commands.
Terminology
The XML language uses a number of specific terms and acronyms. This non-exhaustive list details the main XML concepts used by the commands and functions of 4D.
Attribute: an XML sub-tag associated with an element. An attribute always contains a name and a value (see diagram below).
Child: In an XML structure, an element in a level directly below another.
DTD: Document Type Definition. The DTD records the set of specific rules and properties that the XML must follow. In particular, these rules define the name and content of each tag as well as its context. This formalization of the elements can be used to check whether an XML document complies with them (in which case, it is declared "valid").
The DTD may be included in the XML document (internal DTD) or in a separate document (external DTD). Note that the DTD is not mandatory.
Element: an XML tag. An element always contains a name and a value. Optionally, an element may contain attributes (see diagram).
ElementRef: XML reference used by the 4D XML commands to specify an XML structure. This reference is made up of 8 bytes coded in hexadecimal form, which means it consists of 16 characters.
Parent: In an XML structure, an element in a level directly above another.
Parsing, parser: The act of analyzing the contents of a structured object in order to extract useful information. The commands of the "XML" theme are used to parse the contents of any XML objects.
Root: An element located at the first level of an XML structure.
Sibling: In an XML structure, an element at the same level as another.
Structure XML: structured XML object. This object can be a document, a variable, or an element.
Validation: An XML document is "validated" by the parser when it is "well-formed" and in compliance with the DTD specifications. See also Well-formed.
Well-formed: An XML document is declared "well-formed" by the parser when it complies with the generic XML specifications. See also Validation.
XML: eXtensible Markup Language. A computerized data exchange standard enabling the transfer of data as well as their structure. The XML language is based on the use of tags and a specific syntax, similar to the HTML language. However, unlike HTML, the XML language allows the definition of customized tags.
XSL: eXtensible Stylesheet Language. A language permitting the definition of style sheets used to process and display the contents of an XML document.
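The distinction between "well-formed" and "valid" in the list above can be made concrete with a small check. The sketch below uses Python's standard xml.etree.ElementTree module and tests well-formedness only; it does not validate against a DTD. is_well_formed is a hypothetical helper introduced for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(source):
    """Return True if source complies with the generic XML syntax rules."""
    try:
        ET.fromstring(source)
    except ET.ParseError:
        return False
    return True


nested_ok = is_well_formed("<a><b>text</b></a>")
crossed_tags = is_well_formed("<a><b>text</a></b>")
unquoted_attribute = is_well_formed("<a Size=10></a>")

print(nested_ok, crossed_tags, unquoted_attribute)
```

A document that passes this check is well-formed, but it is only "valid" if it also complies with the rules of its DTD, which requires a validating parser.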