Class: XMLParser

This class contains top-level methods for invoking the parser and returning high-level information about a document.


Method Index

xmlinit -- Initialize XML parser
xmlinitenc -- Initialize XML parser (specifying DOM data encoding)
xmlterm -- Terminate XML parser
xmlclean -- Clean up memory used during parse
xmlparse -- Parse a document from a URI
xmlparseBuffer -- Parse a document from a buffer
xmlparseDTD -- Parse an external DTD
xmlparseFile -- Parse a document from a file
xmlparseStream -- Parse a document from a stream
xmlwhere -- Return error location information
setAccess -- Set I/O access callbacks
getContent -- Returns the content model for an element
getDocument -- Returns the root node of a parsed document
getDocumentElement -- Returns the root element (node) of a parsed document
getDocType -- Returns the document type string
isStandalone -- Returns the value of the standalone flag
isSingleChar -- Returns the value of the single/multibyte encoding flag
getEncoding -- Returns the name of the document's encoding method
validate -- Validate the document
context -- Get the context
createDocument -- Creates and returns a document node

Methods

xmlinit, xmlinitenc

Function:
Initialize XML parser

Prototype:
uword xmlinit(DOMString incoding,
              void (*msghdlr)(void *msgctx, DOMString msg, ub4 errcode),
              void *msgctx, lpxsaxcb *saxcb, void *saxcbctx, DOMString lang)

uword xmlinitenc(DOMString incoding, DOMString outcoding,
                 void (*msghdlr)(void *msgctx, DOMString msg, ub4 errcode),
                 void *msgctx, lpxsaxcb *saxcb, void *saxcbctx, DOMString lang)

Arguments:
incoding -- Default input file encoding (UTF8 if not specified)
outcoding -- Output (DOM) encoding for document data (same as first input, if not specified)
msghdlr -- Error message callback
msgctx -- User-defined context pointer passed to msghdlr
saxcb -- SAX callback structure (only if using SAX)
saxcbctx -- User-defined SAX context structure passed to SAX callback functions
lang -- Language for error message (not used)

Returns:
uword -- Numeric error code, 0 meaning success
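
As an illustration of the msghdlr and msgctx arguments, the following is a minimal sketch of an error message callback that records messages in a user-defined structure. The typedefs for DOMString and ub4 are stand-ins so the sketch compiles without the parser headers, and ErrorLog and collectMessage are hypothetical names, not part of the parser API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in typedefs (assumptions); the real definitions come from the
// parser headers and may differ.
typedef const char *DOMString;
typedef unsigned int ub4;

// Hypothetical user context: collects every message that is reported.
struct ErrorLog {
    std::vector<std::string> messages;
    ub4 lastCode = 0;
};

// Callback with the same shape as the msghdlr argument in the prototype.
// msgctx is the pointer that was handed to the initializer as msgctx.
void collectMessage(void *msgctx, DOMString msg, ub4 errcode) {
    assert(msgctx != nullptr);
    ErrorLog *log = static_cast<ErrorLog *>(msgctx);
    log->messages.push_back(msg ? msg : "");
    log->lastCode = errcode;
}
```

Under these assumptions, collectMessage would be passed as msghdlr and the address of an ErrorLog as msgctx, so the callback can reach the log without global state.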

xmlterm

Function:
Terminates the XML parser: tears down parser state and releases all memory back to the system.

Prototype:
void xmlterm()

Arguments:
None

Returns:
void


xmlclean

Function:
Frees any memory used during the previous parse. The memory is recycled within the XML parser but is not returned to the system; only xmlterm() finally releases all memory back to the system. If xmlclean() is not called between parses, the data used by previous documents remains allocated, and pointers to it remain valid. Thus, the data for multiple documents can be accessible simultaneously, although only the current document can be manipulated with DOM. To access only one document's data at a time (within a single context), call xmlclean() before each new parse.

Prototype:
void xmlclean()

Arguments:
None

Returns:
void


xmlparse

Function:
Parses a document from a URI

Prototype:
uword xmlparse(DOMString doc, DOMString encoding, ub4 flags)

Arguments:
doc -- document path
encoding -- document's encoding
flags -- Mask of flag bits

Flags:
XML_FLAG_VALIDATE -- Validate document against DTD
XML_FLAG_DISCARD_WHITESPACE -- Discard ignorable whitespace

Returns:
uword -- Error code, 0 on success
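
Because flags is a mask of flag bits, several options can be combined with bitwise OR. The sketch below shows the idea with placeholder constants: DEMO_FLAG_VALIDATE and DEMO_FLAG_DISCARD_WHITESPACE stand in for XML_FLAG_VALIDATE and XML_FLAG_DISCARD_WHITESPACE, and their numeric values are assumptions chosen for illustration. The helper names are hypothetical as well.

```cpp
#include <cassert>

// Stand-in typedef (assumption); the real ub4 comes from the parser headers.
typedef unsigned int ub4;

// Placeholder bit values, assumed for illustration only. Real code should
// use the XML_FLAG_xxx constants from the parser headers.
const ub4 DEMO_FLAG_VALIDATE = 0x01;
const ub4 DEMO_FLAG_DISCARD_WHITESPACE = 0x02;

// Build a flags mask from two independent options.
ub4 makeParseFlags(bool validate, bool discardWhitespace) {
    ub4 flags = 0;
    if (validate) flags |= DEMO_FLAG_VALIDATE;
    if (discardWhitespace) flags |= DEMO_FLAG_DISCARD_WHITESPACE;
    return flags;
}

// Test whether a single flag bit is set in a mask.
bool hasFlag(ub4 flags, ub4 bit) {
    assert(bit != 0);
    return (flags & bit) != 0;
}
```

The same mask construction applies to the other parse methods that take a flags argument.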


xmlparseBuffer

Function:
Parses a document from a buffer

Prototype:
uword xmlparseBuffer(DOMString buffer, size_t len, DOMString encoding, ub4 flags)

Arguments:
buffer -- buffer containing document to parse
len -- length of document
encoding -- document's encoding
flags -- Mask of flag bits

Flags:
XML_FLAG_VALIDATE -- Validate document against DTD
XML_FLAG_DISCARD_WHITESPACE -- Discard ignorable whitespace

Returns:
uword -- Error code, 0 on success


xmlparseDTD

Function:
Parses an external DTD

Prototype:
uword xmlparseDTD(DOMString uri, DOMString name, DOMString encoding, ub4 flags)

Arguments:
uri -- URI pointing to DTD
name -- DTD name
encoding -- DTD's encoding
flags -- Mask of flag bits

Returns:
uword -- Error code, 0 on success


xmlparseFile

Function:
Parses a document from a file

Prototype:
uword xmlparseFile(DOMString path, size_t len, DOMString encoding, ub4 flags)

Arguments:
path -- document path
len -- unused parameter
encoding -- document's encoding
flags -- Mask of flag bits

Flags:
XML_FLAG_VALIDATE -- Validate document against DTD
XML_FLAG_DISCARD_WHITESPACE -- Discard ignorable whitespace

Returns:
uword -- Error code, 0 on success


xmlparseStream

Function:
Parses a document from a stream

Prototype:
uword xmlparseStream(DOMString path, void *stream, DOMString encoding, ub4 flags)

Arguments:
path -- unused argument
stream -- input stream
encoding -- document's encoding
flags -- Mask of flag bits

Flags:
XML_FLAG_VALIDATE -- Validate document against DTD
XML_FLAG_DISCARD_WHITESPACE -- Discard ignorable whitespace

Returns:
uword -- Error code, 0 on success


xmlwhere

Function:
Returns error location information. Should only be called from within the user error callback function (while the error is current).

Prototype:
boolean xmlwhere(ub4 *line, DOMString *path, uword idx)

Arguments:
line -- returned line number where the error occurred
path -- returned path/URL where the error occurred
idx -- position in error stack (starting at 0)

Returns:
boolean -- idx is valid, location returned
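
Since idx starts at 0 and the return value reports whether idx is valid, the error stack can be walked with a simple loop that stops at the first invalid position. The following is a sketch under stated assumptions: stubWhere is a hypothetical stand-in with the same argument shape as xmlwhere, the typedefs are placeholders for the library types, and the stack contents and their ordering are invented for the demonstration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in typedefs (assumptions); the real ones come from the parser headers.
typedef unsigned int ub4;
typedef unsigned int uword;
typedef const char *DOMString;

// Hypothetical error stack used only by this sketch.
struct Location { ub4 line; const char *path; };
static const Location kStack[] = { {12, "included.xml"}, {3, "main.xml"} };

// Stub shaped like xmlwhere: fills line and path, and returns true while
// idx is a valid stack position.
bool stubWhere(ub4 *line, DOMString *path, uword idx) {
    assert(line != nullptr && path != nullptr);
    if (idx >= sizeof(kStack) / sizeof(kStack[0])) return false;
    *line = kStack[idx].line;
    *path = kStack[idx].path;
    return true;
}

// Walk the stack from position 0 until the call reports an invalid index.
std::vector<std::string> describeErrorStack() {
    std::vector<std::string> out;
    ub4 line = 0;
    DOMString path = nullptr;
    for (uword idx = 0; stubWhere(&line, &path, idx); ++idx) {
        out.push_back(std::string(path) + ":" + std::to_string(line));
    }
    return out;
}
```

In real use, a loop of this kind would sit inside the error message callback, with the stub replaced by the actual call.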

setAccess

Function:
Sets the I/O callback functions for the given access method.

Prototype:
uword xmlaccess(xmlctx *ctx, xmlacctype access, XML_OPENF((*openf)), XML_CLOSEF((*closef)), XML_READF((*readf)));

Arguments:
ctx -- The XML context
access -- access method enum, XMLACCESS_xxx
openf -- Open-input callback function
closef -- Close-input callback function
readf -- Read-input callback function

Returns:
uword -- Error code, 0 on success

Comments:
Sets the I/O callback functions for the given access method. Most methods have built-in callback functions, so none need be provided by the user. The notable exception is XMLACCESS_STREAM (user-defined streams), where the user must set the stream callback functions themselves.

The three callback functions are invoked to open, close, and read from the input source. The functions should be declared using the function prototype macros XML_OPENF, XML_CLOSEF and XML_READF.

XML_OPENF is the open function, called once to open the input source. It should set its persistent handle in the xmlihdl union, which has two choices, a generic pointer (void *), and an integer (as unix file or socket handle). This function must return XMLERR_OK on success. Args:

 ctx    (IN)  - XML context
 path   (IN)  - full path to the source to be opened
 parts  (IN)  - path broken down into components; opaque pointer
 length (OUT) - total length of input source, if known.
                if not known, should be set to 0.
 ih     (OUT) - the opened handle is placed here

XML_CLOSEF is the close function; it closes an open source and frees resources. Args:

 ctx    (IN) - XML context
 ih     (IN) - input handle union

XML_READF is the reader function; it reads data from an open source into a buffer and returns the number of bytes read:

  • If <= 0, an EOI condition is indicated.
  • If > 0, then the EOI flag determines whether this is the terminal data.

On EOI, the matching close function will be called automatically. When the input is exhausted, the reader may close the input at that time, or wait until the close function is called later. Args:

 ctx      (IN)  - XML context
 path     (IN)  - full path to the source to be opened; only
                  provided here for use in error messages
 ih       (IN)  - input handle union
 dest     (OUT) - destination buffer to read data into
 destsize (IN)  - size of dest
 nraw     (OUT) - number of bytes read
 eoi      (OUT) - hit End of Information?  should be set
                  to TRUE or FALSE after each read.
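
The read contract described above (fill dest, report the byte count in nraw, set eoi after every read) can be sketched with an in-memory source. This is not the XML_READF signature itself, which is produced by the macro; MemStream and memRead are hypothetical names, and the parameter types are simplified assumptions chosen so the sketch compiles on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stream state, of the kind an open callback might store in
// the generic-pointer member of the input handle union.
struct MemStream {
    const char *data;   // start of the in-memory document
    std::size_t size;   // total number of bytes available
    std::size_t pos;    // number of bytes already handed out
};

// Reader sketch: copies up to destsize bytes into dest, stores the count
// in nraw, sets eoi on every call, and returns the number of bytes read.
long memRead(MemStream *s, char *dest, std::size_t destsize,
             std::size_t *nraw, bool *eoi) {
    assert(s != nullptr && dest != nullptr && nraw != nullptr && eoi != nullptr);
    std::size_t remaining = s->size - s->pos;
    std::size_t n = remaining < destsize ? remaining : destsize;
    std::memcpy(dest, s->data + s->pos, n);
    s->pos += n;
    *nraw = n;
    *eoi = (s->pos == s->size);  // true once the last byte has been delivered
    return static_cast<long>(n);
}
```

With an 11-byte source and an 8-byte buffer, the first call delivers 8 bytes with eoi unset, and the second delivers the remaining 3 bytes with eoi set, which marks that chunk as the terminal data.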

getContent

Function:

Returns the content model for a node. Content model nodes are Nodes and can be traversed and examined with the same functions as the parsed document.

Prototype:

Node* getContent(Node *node)

Arguments:

node -- node whose content model to return

Returns:

Node* -- root node of content model tree


getDocument

Function:

After a document has been successfully parsed, returns a pointer to the root node of the document. Compare with getDocumentElement which returns the root element node.

Prototype:

Node* getDocument()

Arguments:

None

Returns:

Node* -- Pointer to root node of document


getDocumentElement

Function:

After a document has been successfully parsed, returns a pointer to the root element (node) of the document

Prototype:

Element* getDocumentElement()

Arguments:

None

Returns:

Element* -- Pointer to root element (node) of document


getDocType

Function:

Returns a pointer to a "DocType" structure which describes the DTD

Prototype:

DocumentType* getDocType()

Arguments:

None

Returns:

DocumentType* -- Pointer to DTD descriptor


isStandalone

Function:

Returns TRUE if the document is specified as standalone on the <?xml?> line, FALSE otherwise

Prototype:

boolean isStandalone()

Arguments:

None

Returns:

boolean -- Value of standalone flag


isSingleChar

Function:

Returns TRUE if the document is single-byte encoded (e.g. ASCII), or FALSE if it is multi-byte encoded (UTF-8, etc.).

Prototype:

boolean isSingleChar()

Arguments:

None

Returns:

boolean -- Value of single/multibyte encoding flag
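
One reason the flag matters is that byte counts and character counts coincide only for single-byte encodings. The sketch below is a hypothetical helper, not part of the parser API: singleByte mirrors the isSingleChar() result, and when it is false the data is assumed to be UTF-8 specifically.

```cpp
#include <cassert>
#include <cstddef>

// Count characters in a NUL-terminated string. When singleByte is false,
// the string is assumed to be UTF-8 (an assumption of this sketch).
std::size_t countChars(const char *s, bool singleByte) {
    assert(s != nullptr);
    std::size_t n = 0;
    for (; *s != '\0'; ++s) {
        // UTF-8 continuation bytes have the bit pattern 10xxxxxx and do
        // not start a new character, so they are skipped.
        if (singleByte || (static_cast<unsigned char>(*s) & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}
```

For other multi-byte encodings the counting rule differs, so the name returned by getEncoding would be needed to pick the right one.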


getEncoding

Function:

Returns the name of the document's encoding method, e.g. "ASCII", "UTF8", etc. Compare to the isSingleChar flag, which only tells whether it is a single-byte or multibyte encoding.

Prototype:

DOMString getEncoding()

Arguments:

None

Returns:

DOMString -- Name of encoding method


validate

Function:

Validate the document.

Prototype:

uword validate(Node *root)

Arguments:

root -- document node to validate

Returns:

uword -- Error code, 0 on success


context

Function:

Get the context.

Prototype:

xmlctx* context()

Arguments:

None

Returns:

xmlctx* -- context


createDocument

Function:

Creates and returns a DOCUMENT node. When dtd is not null, its Node.ownerDocument attribute is set to the document being created.

Prototype:

Document *createDocument(DOMString uri, DOMString qname, DocumentType *dtd)

Arguments:

uri -- namespace URI of the new document element
qname -- qualified name of the new document element
dtd -- document type (DTD)

Returns:

Document* -- pointer to created Document node