XML Class
The XML class can be used to both parse and create XML documents.
Class Name
IPWorks_XML
Procedural Interface
ipworks_xml_open();
ipworks_xml_close($res);
ipworks_xml_register_callback($res, $id, $function);
ipworks_xml_get_last_error($res);
ipworks_xml_get_last_error_code($res);
ipworks_xml_set($res, $id, $index, $value);
ipworks_xml_get($res, $id, $index);
ipworks_xml_do_config($res, $configurationstring);
ipworks_xml_do_endelement($res);
ipworks_xml_do_flush($res);
ipworks_xml_do_getattr($res, $attrname);
ipworks_xml_do_hasxpath($res, $xpath);
ipworks_xml_do_loaddom($res, $filename);
ipworks_xml_do_loadschema($res, $schema);
ipworks_xml_do_parse($res);
ipworks_xml_do_putattr($res, $name, $namespaceuri, $value);
ipworks_xml_do_putcdata($res, $text);
ipworks_xml_do_putcomment($res, $text);
ipworks_xml_do_putelement($res, $name, $namespaceuri, $value);
ipworks_xml_do_putraw($res, $text);
ipworks_xml_do_putstring($res, $text);
ipworks_xml_do_removeattr($res, $attrname);
ipworks_xml_do_removechildren($res);
ipworks_xml_do_removeelement($res);
ipworks_xml_do_reset($res);
ipworks_xml_do_save($res);
ipworks_xml_do_savedom($res, $filename);
ipworks_xml_do_startelement($res, $name, $namespaceuri);
ipworks_xml_do_tryxpath($res, $xpath);
Remarks
The XML class can operate as either a parser or a writer of XML.
Parsing XML
The XML class parses XML documents and verifies that they are well-formed. The results are provided through a set of events complying with the SAX2 specification.
In addition, the document structure may be queried through an XPath mechanism that supports a subset of the XPath specification.
The parser is optimized for read applications, with a very fast engine that builds internal DOM structures with close to zero heap allocations. Additionally, BuildDOM can be set to False, which removes the overhead of creating the DOM and offers a fast, forward-only parsing implementation that fires events to provide the parsed data.
When parsing a document, events fire to provide information about the parsed data. After Parse returns, the document may be navigated by setting XPath if BuildDOM is True (the default). If BuildDOM is False, parsed data is accessible only through the events.
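The trade-off between building a DOM and forward-only parsing can be illustrated outside the class. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, not the IPWorks API; the function names and the sample document are assumptions made for illustration. The first function streams through the document and discards each element once it has been seen, and the second builds the whole tree before querying it.

```python
import io
import xml.etree.ElementTree as ET


def count_forward_only(document: bytes, tag: str) -> int:
    """Stream through the document and count elements without keeping their content."""
    count = 0
    for _event, elem in ET.iterparse(io.BytesIO(document), events=("end",)):
        if elem.tag == tag:
            count += 1
        elem.clear()  # drop the element's text, attributes and children
    return count


def count_with_tree(document: bytes, tag: str) -> int:
    """Build the complete tree first, then query it."""
    root = ET.fromstring(document)
    return sum(1 for _ in root.iter(tag))


if __name__ == "__main__":
    sample = b"<log><entry/><entry/><other/></log>"
    print(count_forward_only(sample, "entry"), count_with_tree(sample, "entry"))
```

Both functions return the same count. The streaming variant can only act on data as it passes, which mirrors the restriction that parsed data is available only through events when no DOM is built.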
Events are fired only when qualifying conditions are met (for example, the beginning of a new element). In the meantime, text is buffered internally. The following events will fire during parsing:
- Characters
- Comment
- EndElement
- EndPrefixMapping
- EvalEntity
- IgnorableWhitespace
- Meta
- PI
- SpecialSection
- StartElement
- StartPrefixMapping
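The event-driven flow and the internal text buffering described above can be sketched with any SAX-style parser. The example below uses Python's standard xml.sax module as a stand-in, not the IPWorks class; the EventLog handler, the parse_events helper, and the sample document are hypothetical names introduced for illustration. Text is collected in a buffer and reported as a single Characters entry only when the next element boundary or processing instruction is reached.

```python
import xml.sax


class EventLog(xml.sax.ContentHandler):
    """Records SAX callbacks, buffering text until the next qualifying event."""

    def __init__(self):
        super().__init__()
        self.events = []
        self._text = []

    def _flush_text(self):
        # Report buffered text (if any) as one Characters entry.
        text = "".join(self._text).strip()
        self._text.clear()
        if text:
            self.events.append(("Characters", text))

    def startElement(self, name, attrs):
        self._flush_text()
        self.events.append(("StartElement", name, dict(attrs)))

    def endElement(self, name):
        self._flush_text()
        self.events.append(("EndElement", name))

    def characters(self, content):
        # The parser may deliver text in several chunks; just buffer it.
        self._text.append(content)

    def processingInstruction(self, target, data):
        self._flush_text()
        self.events.append(("PI", target, data))


def parse_events(document: str):
    """Parse the document and return the recorded events in order."""
    handler = EventLog()
    xml.sax.parseString(document.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    for event in parse_events('<root><item id="1">first</item><?app note?></root>'):
        print(event)
```

The entry names in the log were chosen to match the event names listed above; the callback names themselves (startElement, characters, and so on) belong to xml.sax.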
If BuildDOM is True (the default), XPath may be set after Parse returns. XPath implements a subset of the XML XPath specification, allowing you to point to specific elements in the XML document.
The path is a series of one or more element accessors separated by '/'. The path can be absolute (starting with '/') or relative to the current XPath location.
The following are possible values for an element accessor:
name | A particular element name |
name[i] | The i-th subelement of the current element with the given name |
[i] | The i-th subelement of the current element |
[last()] | The last subelement of the current element |
[last()-i] | The subelement located at the last location minus i in the current element |
name[@attrname="attrvalue"] | The subelement containing a particular value for a given attribute (both single and double quotes are supported) |
.. | The parent of the current element |
BuildDOM must be set to True prior to parsing the document for the XPath functionality to be available.
Example (Setting XPath):
Document root | XML.XPath = "/" |
Specific Element | XML.XPath = "/root/SubElement1/SubElement2/" |
i-th Child | XML.XPath = "/root/SubElement1[i]" |
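To make the accessor rules concrete, the following is a minimal sketch of how such a path subset could be resolved against an in-memory tree. It is written with Python's standard xml.etree.ElementTree module and is not the class's implementation. The Navigator class, its try_xpath method, 1-based indexing, and the synthetic "#document" node above the root element are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

_STEP = re.compile(r"^(?P<name>[^\[\]]*)(?:\[(?P<pred>.+)\])?$")
_ATTR = re.compile(r"""^@(?P<attr>[^=]+)=(?P<q>['"])(?P<value>.*)(?P=q)$""")


def _select(children, step):
    """Pick one child according to a single element accessor, or None."""
    match = _STEP.match(step)
    if match is None:
        raise ValueError(f"unsupported accessor: {step!r}")
    name, pred = match.group("name"), match.group("pred")
    candidates = [c for c in children if not name or c.tag == name]
    if pred is None:
        index = 0                                   # name
    elif pred.isdigit():
        index = int(pred) - 1                       # name[i] or [i], assumed 1-based
    elif pred == "last()":
        index = len(candidates) - 1                 # [last()]
    elif pred.startswith("last()-") and pred[7:].isdigit():
        index = len(candidates) - 1 - int(pred[7:])  # [last()-i]
    else:
        attr = _ATTR.match(pred)                    # name[@attrname="attrvalue"]
        if attr is None:
            raise ValueError(f"unsupported predicate: {pred!r}")
        candidates = [c for c in candidates
                      if c.get(attr.group("attr")) == attr.group("value")]
        index = 0
    return candidates[index] if 0 <= index < len(candidates) else None


class Navigator:
    """Tracks a current element and moves it along absolute or relative paths."""

    def __init__(self, xml_text):
        self.document = ET.Element("#document")  # synthetic node above the root
        self.document.append(ET.fromstring(xml_text))
        self._parent = {c: p for p in self.document.iter() for c in p}
        self.current = self.document

    def try_xpath(self, path):
        """Move to `path` if it exists; otherwise stay put and return False."""
        node = self.document if path.startswith("/") else self.current
        # Simplification: attribute values containing '/' are not handled.
        for step in filter(None, path.split("/")):
            if step == "..":
                node = self._parent.get(node)
            else:
                node = _select(list(node), step)
            if node is None:
                return False
        self.current = node
        return True


if __name__ == "__main__":
    nav = Navigator('<root><item id="a">one</item><item id="b">two</item></root>')
    if nav.try_xpath("/root/item[2]"):
        print(nav.current.text)
```

Leaving the current position unchanged when a path does not resolve is a design choice of this sketch.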
Input Properties
The class will determine the source of the input based on which properties are set.
The order in which the input properties are checked is as follows:
When a valid source is found, the search stops. If parsing multiple documents, call Reset between documents to reset the parser.
An additional "relaxed" mode allows for lexical parsing of non-XML documents (e.g. HTML). This is enabled by setting Validate to False. In this case, events will be fired as elements, entities, etc. are encountered, but the structure of the document will not be checked for "well-formedness", and the internal DOM structure will not be built.
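The idea of purely lexical parsing can be sketched with Python's standard html.parser module, which also reports tags and text as they are encountered without checking that the document is well-formed. This is an analogy rather than the class's relaxed mode; the LexicalEvents class and the lexical_events helper are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser


class LexicalEvents(HTMLParser):
    """Logs tags and text as they are encountered, with no structure checks."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("StartElement", tag))

    def handle_endtag(self, tag):
        self.events.append(("EndElement", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("Characters", data.strip()))


def lexical_events(markup: str):
    """Feed the markup to the parser and return the logged events."""
    parser = LexicalEvents()
    parser.feed(markup)
    parser.close()
    return parser.events


if __name__ == "__main__":
    # <br> is never closed and </div> was never opened; both are still reported.
    for event in lexical_events("<p>one<br>two</p></div>"):
        print(event)
```

Because nothing verifies nesting, the unclosed and unmatched tags in the sample are simply reported in the order they appear, and no tree is built.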
Writing XML
To use the class, first decide whether to write to a file or to OutputData.
Output Properties
The class will determine the destination of the output based on which properties are set.
The order in which the output properties are checked is as follows:
- OutputFile
- OutputData: The output data is written to this property if no other destination is specified.
To begin writing the XML document, optionally set the XMLDeclaration properties (XMLDeclarationVersion, XMLDeclarationEncoding, and XMLDeclarationStandalone) first. If these are not set, the class uses a default XML declaration at the beginning of the document.
Next, begin adding elements to your document. Calling StartElement opens an element with the specified name. To create a nested structure, continue calling StartElement to open more child elements. To write a value within an element, call PutString. To close the element that was last opened, call EndElement. Each time EndElement is called, the element at the current level is closed. Alternatively, calling PutElement writes the specified element with the specified value and also closes the element.
To write an attribute on the current element, call PutAttr after calling StartElement. Call PutAttr multiple times to add multiple attributes.
Writing comments or CDATA can be done at any time with the PutComment and PutCData methods.
To close your XML document, call Save. You can call Save at any point, and it will close any remaining open elements automatically.
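The writing sequence above can be modeled in a few lines. The following is a minimal stand-alone sketch in Python, not the class's implementation: the XmlWriterSketch class, its method names, the omission of namespace URIs, and the default declaration string are all assumptions made for illustration. It keeps a stack of open elements so that save can close whatever is still open.

```python
from xml.sax.saxutils import escape, quoteattr


class XmlWriterSketch:
    """Toy model of a StartElement / PutAttr / PutString / EndElement / Save flow."""

    def __init__(self, declaration='<?xml version="1.0"?>'):
        # The default declaration here is an assumption of this sketch.
        self._parts = [declaration] if declaration else []
        self._open = []             # stack of currently open element names
        self._tag_pending = False   # True while the start tag can still take attributes

    def _close_start_tag(self):
        if self._tag_pending:
            self._parts.append(">")
            self._tag_pending = False

    def start_element(self, name):
        self._close_start_tag()
        self._parts.append(f"<{name}")
        self._open.append(name)
        self._tag_pending = True

    def put_attr(self, name, value):
        if not self._tag_pending:
            raise RuntimeError("put_attr must directly follow start_element")
        self._parts.append(f" {name}={quoteattr(value)}")

    def put_string(self, text):
        self._close_start_tag()
        self._parts.append(escape(text))

    def end_element(self):
        self._close_start_tag()
        self._parts.append(f"</{self._open.pop()}>")

    def put_element(self, name, value):
        # Opens, fills and closes the element in one call.
        self.start_element(name)
        self.put_string(value)
        self.end_element()

    def save(self):
        # Close any elements that are still open, innermost first.
        while self._open:
            self.end_element()
        return "".join(self._parts)


def build_example():
    writer = XmlWriterSketch()
    writer.start_element("order")
    writer.put_attr("id", "42")
    writer.put_element("item", "tea & milk")
    writer.start_element("note")
    writer.put_string("rush")
    return writer.save()  # "note" and "order" are closed here


if __name__ == "__main__":
    print(build_example())
```

In build_example neither "note" nor "order" is closed explicitly; the call to save unwinds the stack and closes both.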
Property List
The following is the full list of the properties of the class with short descriptions.
BuildDOM | When True, an internal object model of the XML document is created. |
InputData | The XML data to parse. |
InputFile | The file to process. |
NamespaceCount | The number of records in the Namespace arrays. |
NamespacePrefix | The Prefix for the Namespace. |
NamespaceURI | Namespace URI associated with the corresponding Prefix. |
OutputData | The output XML after processing. |
OutputFile | The path to a local file where the output will be written. |
Overwrite | Indicates whether or not the class should overwrite files. |
Validate | When True, the parser checks that the document consists of well-formed XML. |
AttrCount | The number of records in the Attr arrays. |
AttrName | The local name (without prefix) of the attribute. |
AttrNamespace | Attribute namespace. |
AttrPrefix | Attribute prefix (if any). |
AttrValue | Attribute value. |
XChildCount | The number of records in the XChild arrays. |
XChildName | The local name (without prefix) of the element. |
XChildNamespace | Namespace of the element. |
XChildPrefix | Prefix of the element (if any). |
XChildXText | The inner text of the element. |
XCommentCount | The number of records in the XComment arrays. |
XCommentText | This property holds the comment text. |
XElement | The name of the current element. |
XMLDeclarationEncoding | This property specifies the XML encoding to use. |
XMLDeclarationStandalone | This property indicates whether the standalone attribute is present in the declaration with a value of true. |
XMLDeclarationVersion | This property specifies the XML version. |
XNamespace | The namespace of the current element. |
XParent | The parent of the current element. |
XPath | Provides a way to point to a specific element in the document. |
XPrefix | The prefix of the current element. |
XSubTree | A snapshot of the current element in the document. |
XText | The text of the current element. |
Method List
The following is the full list of the methods of the class with short descriptions.
Config | Sets or retrieves a configuration setting. |
EndElement | Writes the closing tag of an open XML element. |
Flush | Flushes the parser and checks its end state. |
GetAttr | Returns the value of the specified attribute. |
HasXPath | Determines whether a specific element exists in the document. |
LoadDOM | Loads the DOM from a file. |
LoadSchema | Loads the XML schema. |
Parse | This method parses the specified XML data. |
PutAttr | Writes an XML attribute. |
PutCData | Writes an XML CDATA block. |
PutComment | Writes an XML comment block. |
PutElement | Writes a simple XML element with a value. |
PutRaw | Writes a raw XML fragment. |
PutString | Writes text inside an XML element. |
RemoveAttr | Removes an attribute. |
RemoveChildren | Removes the children of the element at the specified XPath. |
RemoveElement | Removes the element at the specified XPath. |
Reset | Resets the parser. |
Save | Closes the class writing stream. |
SaveDOM | Saves the DOM to a file. |
StartElement | Writes the opening tag of an XML element. |
TryXPath | Navigates to the specified XPath if it exists. |
Event List
The following is the full list of the events fired by the class with short descriptions.
Characters | Fired for plain text segments of the input stream. |
Comment | Fired when a comment section is encountered. |
EndElement | Fired when an end-element tag is encountered. |
EndPrefixMapping | Fired when leaving the scope of a namespace declaration. |
Error | Information about errors during data delivery. |
EvalEntity | Fired every time an entity needs to be evaluated. |
IgnorableWhitespace | Fired when a section of ignorable whitespace is encountered. |
Meta | Fired when a meta section is encountered. |
PI | Fired when a processing instruction section is encountered. |
SpecialSection | Fired when a special section is encountered. |
StartElement | Fired when a begin-element tag is encountered in the document. |
StartPrefixMapping | Fired when entering the scope of a namespace declaration. |
XML | Fires as XML is written. |
Configuration Settings
The following is a list of configuration settings for the class with short descriptions.
CacheContent | If true, the original XML is saved in a buffer. |
Charset | Specifies the charset used when encoding data. |
EOL | The characters to use for separating lines. |
ErrorOnEmptyAttr | If true, passing an invalid attribute to the GetAttr method will throw an exception. |
ExtraNameChars | Extra characters for the parser to consider as name characters. |
ExtraSpaceChars | Extra characters for the parser to consider as white space. |
FlushOnEOL | If set, the parser flushes its text buffer after every line of text. |
IgnoreBadAttributePrefixes | If true, bad (unknown) attribute prefixes are ignored. |
IgnoreBadElementPrefixes | If true, bad (unknown) element prefixes are ignored. |
IncludeElementPrefix | Whether to include the prefix in the element name. |
IncludeXMLDeclaration | Whether to include the XML declaration when writing XML. |
Indent | The characters to use for each indentation level. |
Offset | Current offset of the document being parsed. |
PreserveWhitespace | If true, leading and trailing whitespace in element text is preserved. |
QuoteChar | Quote character to use for attribute values. |
StringProcessingOptions | Defines options to use when processing string values. |
BuildInfo | Information about the product's build. |
CodePage | The system code page used for Unicode to Multibyte translations. |
LicenseInfo | Information about the current license. |
ProcessIdleEvents | Whether the class uses its internal event loop to process events when the main thread is idle. |
SelectWaitMillis | The length of time in milliseconds the class will wait when DoEvents is called if there are no events to process. |
UseInternalSecurityAPI | Tells the class whether to use the system security libraries or an internal implementation. |