XML Component
The XML component can be used to both parse and create XML documents.
Syntax
IPWorks.Xml
Remarks
The XML component can operate as either a parser or a writer of XML.
Parsing XML
The XML component parses XML documents and verifies that they are well-formed. The results are provided through a set of events complying with the SAX2 specification.
In addition, the document structure may be queried through an XPath mechanism that supports a subset of the XPath specification.
The parser is optimized for read applications, with a very fast engine that builds internal DOM structures with close to zero heap allocations. Additionally, BuildDOM can be set to False, which removes the overhead of creating the DOM and provides a fast, forward-only parser that delivers the parsed data through events.
When parsing a document, events fire to provide information about the parsed data. After Parse returns, the document may be navigated by setting XPath if BuildDOM is True (the default). If BuildDOM is False, parsed data is accessible only through the events.
Events are fired only when qualifying conditions are met (for example, the beginning of a new element). In the meantime, text is buffered internally. The following events fire during parsing:
- Characters
- Comment
- EndElement
- EndPrefixMapping
- EvalEntity
- IgnorableWhitespace
- Meta
- PI
- SpecialSection
- StartElement
- StartPrefixMapping
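The event model described above follows SAX2, so it can be illustrated with the SAX parser that ships with the JDK. The sketch below is not the component's API: the class `SaxEventsSketch` and the method `collectEvents` are hypothetical names, and only the JDK's `javax.xml.parsers` and `org.xml.sax` classes are used. It shows text being buffered until the next qualifying condition (a start or end tag) and then reported as a single event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventsSketch {
    /** Parses xml with the JDK SAX parser and returns the events in firing order. */
    public static List<String> collectEvents(String xml) throws Exception {
        List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            // SAX may deliver text in several chunks, so buffer it until the next tag.
            private final StringBuilder text = new StringBuilder();

            private void flushText() {
                String s = text.toString().trim();
                if (!s.isEmpty()) {
                    events.add("Characters:" + s);
                }
                text.setLength(0);
            }

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                flushText();
                events.add("StartElement:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                flushText();
                events.add("EndElement:" + qName);
            }
        };
        // A document that is not well-formed makes parse() throw a SAXException.
        SAXParserFactory.newInstance()
            .newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        for (String event : collectEvents("<root><item>one</item><item>two</item></root>")) {
            System.out.println(event);
        }
    }
}
```

Because no tree is built, this corresponds to the forward-only style of parsing: the data is available only while the events fire.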
If BuildDOM is True (the default), XPath may be set after Parse returns. XPath implements a subset of the XML XPath specification, allowing you to point to specific elements in the XML document.
The path is a series of one or more element accessors separated by '/'. The path can be absolute (starting with '/') or relative to the current XPath location.
The following are possible values for an element accessor:
'name' | A particular element name |
name[i] | The i-th subelement of the current element with the given name |
[i] | The i-th subelement of the current element |
[last()] | The last subelement of the current element |
[last()-i] | The subelement located at the last location minus i in the current element |
name[@attrname="attrvalue"] | The subelement containing a particular value for a given attribute (supports both single and double quotes) |
.. | The parent of the current element |
BuildDOM must be set to True prior to parsing the document for the XPath functionality to be available.
Example (Setting XPath):
Document root | XML.XPath = "/" |
Specific Element | XML.XPath = "/root/SubElement1/SubElement2/" |
i-th Child | XML.XPath = "/root/SubElement1[i]" |
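To show what such paths select, here is a minimal sketch that uses the JDK's own DOM and `javax.xml.xpath` classes rather than the component. The class `XPathSketch`, the method `textAt`, and the sample document are assumptions made for illustration. One difference to note: standard XPath needs a node test before a predicate, so the accessor `[last()]` from the table above is written `*[last()]` here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class XPathSketch {
    // Hypothetical sample document used only for this illustration.
    static final String SAMPLE =
        "<root><item id=\"a\">first</item><item id=\"b\">second</item><note>third</note></root>";

    /** Builds a DOM for xml and returns the text of the first node selected by path. */
    public static String textAt(String xml, String path) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(path, doc);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(textAt(SAMPLE, "/root/item[2]"));         // second
        System.out.println(textAt(SAMPLE, "/root/item[@id='a']"));   // first
        System.out.println(textAt(SAMPLE, "/root/*[last()]"));       // third
        System.out.println(textAt(SAMPLE, "/root/*[last()-1]"));     // second
        System.out.println(textAt(SAMPLE, "/root/note/../item[1]")); // first
    }
}
```

As with the component when BuildDOM is True, the whole document is parsed into a tree first and only then queried.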
Input Properties
The component will determine the source of the input based on which properties are set.
The order in which the input properties are checked is as follows:
When a valid source is found, the search stops.
If parsing multiple documents, call Reset between documents to reset the parser.
An additional "relaxed" mode allows for lexical parsing of non-XML documents (e.g. HTML). This is enabled by setting Validate to False. In this case, events will be fired as elements, entities, etc. are encountered, but the structure of the document will not be checked for "well-formedness", and the internal DOM structure will not be built.
Writing XML
To use the component, first decide whether to write to a file, a stream, or OutputData.
Output Properties
The component will determine the destination of the output based on which properties are set.
The order in which the output properties are checked is as follows:
- SetOutputStream
- OutputFile
- OutputData: The output data is written to this property if no other destination is specified.
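The precedence in the list above amounts to a first-match check. The following sketch is purely illustrative: `OutputTargetSketch` and `resolve` are hypothetical names and do not reflect how the component is implemented internally.

```java
import java.io.OutputStream;

public class OutputTargetSketch {
    /**
     * Returns which destination would receive the output, checking the
     * candidates in the documented order: stream, then file, then OutputData.
     */
    public static String resolve(OutputStream stream, String outputFile) {
        if (stream != null) {
            return "stream";
        }
        if (outputFile != null && !outputFile.isEmpty()) {
            return "file";
        }
        // Fallback when no other destination is specified.
        return "data";
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "out.xml")); // file
        System.out.println(resolve(null, null));      // data
    }
}
```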
Before writing the XML document, optionally set XMLDeclaration. If it is not set, the component writes a default XML declaration at the beginning of the document.
Next, begin adding elements to your document. Calling StartElement opens an element with the specified name. To create a nested structure, continue calling StartElement to open more child elements. To write a value within an element, call PutString. To close the element that was last opened, call EndElement. Each time EndElement is called, the element at the current level is closed. Alternatively, calling PutElement writes the specified element with the specified value and also closes the element.
To write an attribute of the current element, call PutAttr after calling StartElement. Call PutAttr multiple times to add multiple attributes.
Writing comments or CDATA can be done at any time with the PutComment and PutCData methods.
To close your XML document call Save. You can call Save from any location and it will close any remaining open elements automatically.
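The same open, write, close sequence can be sketched with the JDK's `XMLStreamWriter`, which is used here only as a stand-in for the component's writer. The class `WriterSketch`, the method `buildDocument`, and the element names are assumptions; the comments indicate which component method each JDK call roughly corresponds to. No declaration is written in this sketch because `writeStartDocument` is never called.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

public class WriterSketch {
    /** Writes a small nested document to a string and returns it. */
    public static String buildDocument() throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter writer = XMLOutputFactory.newInstance().createXMLStreamWriter(out);

        writer.writeStartElement("order");      // comparable to StartElement
        writer.writeAttribute("id", "42");      // comparable to PutAttr, right after opening
        writer.writeStartElement("customer");   // nested child element
        writer.writeCharacters("Ada");          // comparable to PutString
        writer.writeEndElement();               // comparable to EndElement: closes "customer"
        writer.writeComment(" items follow ");  // comparable to PutComment
        writer.writeStartElement("items");
        writer.writeStartElement("item");
        writer.writeCData("a < b");             // comparable to PutCData

        // Comparable to Save: writeEndDocument closes "item", "items" and "order",
        // which are all still open at this point.
        writer.writeEndDocument();
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(buildDocument());
    }
}
```

Leaving the last three elements open on purpose shows the behavior described above, where finishing the document closes any remaining open elements automatically.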
Property List
The following is the full list of the properties of the component with short descriptions.
BuildDOM | When True, an internal object model of the XML document is created. |
InputData | The XML data to parse. |
InputFile | The file to process. |
Namespaces | Collection of namespaces in the current namespace stack. |
OutputData | The output XML after processing. |
OutputFile | The path to a local file where the output will be written. |
Overwrite | Indicates whether or not the component should overwrite files. |
Validate | When True, the parser checks that the document consists of well-formed XML. |
XAttributes | A collection of attributes of the current element. |
XChildren | Collection of child elements of the currently selected XML element. |
XComments | A collection of comments of the current element. |
XElement | The name of the current element. |
XMLDeclaration | Specifies details of the XML declaration. |
XNamespace | The namespace of the current element. |
XParent | The parent of the current element. |
XPath | Provides a way to point to a specific element in the document. |
XPrefix | The prefix of the current element. |
XSubTree | A snapshot of the current element in the document. |
XText | The text of the current element. |
Method List
The following is the full list of the methods of the component with short descriptions.
config | Sets or retrieves a configuration setting. |
endElement | Writes the closing tag of an open XML element. |
flush | Flushes the parser and checks its end state. |
getAttr | Returns the value of the specified attribute. |
hasXPath | Determines whether a specific element exists in the document. |
loadDOM | Loads the DOM from a file. |
loadSchema | Loads the XML schema. |
parse | This method parses the specified XML data. |
putAttr | Writes an XML attribute. |
putCData | Writes an XML CDATA block. |
putComment | Writes an XML comment block. |
putElement | Writes a simple XML element with a value. |
putRaw | Writes a raw XML fragment. |
putString | Writes text inside an XML element. |
removeAttr | Removes an attribute. |
removeChildren | Removes the children of the element at the specified XPath. |
removeElement | Removes the element at the specified XPath. |
reset | Resets the parser. |
save | Closes the component writing stream. |
saveDOM | Saves the DOM to a file. |
setInputStream | Sets the stream from which the component will read data to parse. |
setOutputStream | The stream to which the component will write the XML. |
startElement | Writes the opening tag of an XML element. |
tryXPath | Navigates to the specified XPath if it exists. |
Event List
The following is the full list of the events fired by the component with short descriptions.
Characters | Fired for plain text segments of the input stream. |
Comment | Fired when a comment section is encountered. |
EndElement | Fired when an end-element tag is encountered. |
EndPrefixMapping | Fired when leaving the scope of a namespace declaration. |
Error | Information about errors during data delivery. |
EvalEntity | Fired every time an entity needs to be evaluated. |
IgnorableWhitespace | Fired when a section of ignorable whitespace is encountered. |
Meta | Fired when a meta section is encountered. |
PI | Fired when a processing instruction section is encountered. |
SpecialSection | Fired when a special section is encountered. |
StartElement | Fired when a begin-element tag is encountered in the document. |
StartPrefixMapping | Fired when entering the scope of a namespace declaration. |
XML | Fires as XML is written. |
Configuration Settings
The following is a list of configuration settings for the component with short descriptions.
CacheContent | If true, the original XML is saved in a buffer. |
Charset | Specifies the charset used when encoding data. |
CloseInputStreamAfterProcess | Determines whether or not the input stream is closed after processing. |
CloseOutputStreamAfterProcess | Determines whether or not the output stream is closed after processing. |
EOL | The characters to use for separating lines. |
ErrorOnEmptyAttr | If true, passing an invalid attribute to the Attr method will throw an exception. |
ExtraNameChars | Extra characters for the parser to consider as name characters. |
ExtraSpaceChars | Extra characters for the parser to consider as white space. |
FlushOnEOL | If set, the parser flushes its text buffer after every line of text. |
IgnoreBadAttributePrefixes | If true, bad (unknown) attribute prefixes are ignored. |
IgnoreBadElementPrefixes | If true, bad (unknown) element prefixes are ignored. |
IncludeElementPrefix | Whether to include the prefix in the element name. |
IncludeXMLDeclaration | Whether to include the XML declaration when writing XML. |
Indent | The characters to use for each indentation level. |
Offset | Current offset of the document being parsed. |
PreserveWhitespace | If true, leading and trailing whitespace in element text is preserved. |
QuoteChar | Quote character to use for attribute values. |
StringProcessingOptions | Defines options to use when processing string values. |
BuildInfo | Information about the product's build. |
GUIAvailable | Tells the component whether or not a message loop is available for processing events. |
LicenseInfo | Information about the current license. |
UseDaemonThreads | Whether threads created by the component are daemon threads. |
UseInternalSecurityAPI | Tells the component whether or not to use the system security libraries or an internal implementation. |