XML for <SCRIPT> Cross Platform XML Parsing in JavaScript

Documentation - FAQ

CONTENTS: How does XML for <SCRIPT> compare to XML Data Islands?; What about document.implementation.createDocument and MSXML?; Which DOM Parser should be used?; How fast are XML for <SCRIPT>'s parsers?; How do I get the XML to the browser?; What license is XML for <SCRIPT> released under?; Help! Netscape 6 doesn't send my XML to the server!; Does XML for <SCRIPT> work in Konqueror/Safari?; Does XML for <SCRIPT> work in Opera?; Is there any type of error handling?; Are there any known bugs?; Why does the Classic DOM parser always return a new XMLDoc object whenever it manipulates the tree?; The .js files are huge! Are there smaller ones available?

How does XML for <SCRIPT> compare to XML Data Islands?

There are a number of significant advantages to XML for <SCRIPT> over XML Data Islands and MSXML.

1) XML for <SCRIPT> uses ECMA and W3C standards; XML Data Islands do not.

This means that you will not be limiting your browser base to a single-vendor, single-browser solution. XML for <SCRIPT> has been tested to work with Netscape, Mozilla, Konqueror, Safari, Opera and Internet Explorer. XML Data Islands work only with Internet Explorer, and then only in certain versions.

2) XML for <SCRIPT> does not pose security risks; XML Data Islands can.

XML Data Islands are usually used with the MSXML Parser. MSXML is an ActiveX object, which requires ActiveX scripting to be enabled in the browser. Microsoft has a long history of security holes with this technology, as it basically allows administrator access through the browser to the client's machine on many Windows platforms. It is possible to lock down IE somewhat to make this difficult, but that process must be done on each machine that is to use your application. This can create an administration nightmare.

XML for <SCRIPT>'s technology works within the browser's security sandbox and has none of these issues.

3) XML for <SCRIPT> does not require new software to be installed on the client machine; XML Data Islands and MSXML often do.

The MSXML control that XML Data Islands commonly use is not guaranteed to be installed on client machines. This means that you must ensure that these controls are available for the browser to auto-install. This auto-install process is by no means foolproof and can break other applications on the system. Often, updates to these components require users to clear their component cache (a long and rather obtuse process) before downloading the components again to get the updates. Again, this can create an administration nightmare.

XML for <SCRIPT> does not affect any other part of the user's system and needs no installation. Since XML for <SCRIPT> is not distributed in binary form (as opposed to MSXML), fixes to your application need only be propagated to one place (the server) rather than out to each client.

4) XML for <SCRIPT> is open source and Free software; XML Data Islands and MSXML are not.

When you download XML for <SCRIPT> you get full access to the source code to see how things are implemented. You are also free to modify, fix and change the software as you see fit as described by the LGPL license. XML Data Islands do not give you this freedom.

To be fair, in a controlled IE-only environment, XML Data Islands used with MSXML may be a viable solution. However, tying yourself strictly to the Microsoft platform is a choice that must be made very carefully, especially considering the threat of a Business Software Alliance audit, arbitrary licensing changes, security/administration problems and cost. XML for <SCRIPT> frees you from these concerns and allows you to focus on the task at hand.

What about document.implementation.createDocument and MSXML?

Browser technology is advancing to the point where it is conceivable that in the near future, JavaScript handling of XML will be accomplished via the browser's own XML parser. Today, Konqueror, Mozilla and Internet Explorer for Windows allow you to create instances of their XML DOM parsers, load data and manipulate XML directly from JavaScript. The manner in which you create the XML parser differs between these browsers, but the interfaces to the parsers themselves comply with the W3C standards.

Konqueror (versions 3.2 and above) and Mozilla use the document.implementation.createDocument W3C standard to create a custom XML DOM object. To populate this object with data, both Konqueror and Mozilla have extended the W3C specification to add a load() method which allows an XML file to be loaded into the parser.

Internet Explorer for Windows uses a Windows-only ActiveX object to allow JavaScript access to an XML DOM tree. While the creation of this ActiveX object uses code that is proprietary to Internet Explorer for Windows, once the MSXML object is created, it exposes a W3C interface to JavaScript which allows for cross platform code to be written.
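The two creation routes described above can be sketched as a single helper. This is only an illustration, not code from XML for <SCRIPT>: the function name createXmlDocument and the env parameter (a window-like object, passed in so the sketch can also be run outside a browser) are assumptions, and "Microsoft.XMLDOM" is the commonly used MSXML ProgID.

```javascript
// Sketch: obtain an empty XML DOM document from whichever native parser exists.
// `env` is a window-like object (hypothetical parameter; pass `window` in a browser).
function createXmlDocument(env) {
  var impl = env.document && env.document.implementation;
  if (impl && typeof impl.createDocument === "function") {
    // W3C route (Mozilla, Konqueror 3.2+): no namespace, no root element, no doctype.
    return { kind: "w3c", doc: impl.createDocument("", "", null) };
  }
  if (typeof env.ActiveXObject !== "undefined") {
    // Internet Explorer for Windows route: MSXML exposed as an ActiveX object.
    return { kind: "msxml", doc: new env.ActiveXObject("Microsoft.XMLDOM") };
  }
  return null; // no native XML parser available in this environment
}
```

In a browser this would be called as createXmlDocument(window). Once the document object exists, the same W3C DOM calls can be used on it regardless of which branch produced it.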

How does XML for <SCRIPT> compare to these technologies? Very well! Most importantly, XML for <SCRIPT> works in a wider range of browsers than either of these solutions. XML for <SCRIPT> also allows JavaScript developers to enhance and extend the parser in any way they see fit. This option is difficult in Mozilla and impossible in Internet Explorer. XML for <SCRIPT> also provides server-side proxies which allow XML to be loaded from any domain on the web. Finally, XML for <SCRIPT> provides SAX XML processing capabilities, a feature not available otherwise.

Over time, as more and more browsers add support for JavaScript XML parsing, XML for <SCRIPT> may become unnecessary for complicated web application development. In the meantime, however, it serves an important role in the process of creating robust, cross platform web applications.

Which DOM Parser should be used?

XML for <SCRIPT> includes two DOM parsers:

XML for <SCRIPT>'s W3C DOM Parser attempts to be as compliant with the W3C XML DOM specification as possible. For developers familiar with the W3C XML DOM methods, the W3C DOM Parser is hands-down the best choice for application development.

Additionally, XML for <SCRIPT>'s W3C DOM Parser will support W3C standards such as XPath and XSL whilst the Classic DOM Parser will not be upgraded to do so. XML for <SCRIPT>'s W3C DOM Parser works with generation 5 browsers and above only.

XML for <SCRIPT>'s Classic DOM Parser is the DOM Parser that shipped with the original version of XML for <SCRIPT>. This DOM Parser has a simple, but proprietary, API that is much more limited than XML for <SCRIPT>'s W3C DOM Parser. XML for <SCRIPT>'s Classic DOM Parser is also compatible with older (generation 4) browsers.

XML for <SCRIPT>'s Classic DOM Parser does not support XPath or XSL. It does, however, support a simple, proprietary TagPath searching API that resembles XPath in its syntax.

XML for <SCRIPT>'s Classic DOM was deprecated with the 3.0 release. No further development of the Classic DOM Parser is planned, although any contributions from users will be happily integrated.

Going forward, XML for <SCRIPT>'s W3C DOM Parser will be the recommended DOM Parser for application development. Support and bug fixes will continue for the Classic DOM Parser into the foreseeable future, but enhancement requests will likely not be fulfilled by the XML for <SCRIPT> development team.

The choice of which parser to use is fully up to the developer. The W3C Parser is much more compliant to standards than the Classic DOM and will be under the most development in the future. The Classic DOM has a much simpler API, but it is not standards compliant. It will never be removed from the XML for <SCRIPT> distribution, but it will also not be developed beyond its existing state unless done so by an outside contributor. The W3C Parser's (compressed) download size is roughly 60k larger than the Classic DOM Parser.

How fast are XML for <SCRIPT>'s parsers?

XML for <SCRIPT> operates inside an interpreted JavaScript environment. For most XML data sets, this does not pose a performance problem. As data sets grow, however, this environment becomes more and more of an issue.

In practice, XML for <SCRIPT>'s acceptable performance (for both parsers) starts to top out for data sets of 1500 nodes with one attribute each. Ad-hoc testing of this situation yields load times of about 9 seconds on an Athlon 1.1 GHz using Linux and Mozilla 1.6. Other browsers and operating systems may load faster or slower depending on their environment.

If you are planning on using XML for <SCRIPT> with a large data set, please test for acceptable performance on your slowest workstation before you start development!
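One way to run such a test is to generate a synthetic data set of the size mentioned above (nodes with one attribute each) and time a single parse. The sketch below is an assumption-laden illustration: buildTestXml and timeParse are hypothetical helper names, and parseFn stands in for whatever call your application uses to hand an XML string to the parser.

```javascript
// Build a synthetic document: <ROOT> with `nodeCount` children, one attribute each.
function buildTestXml(nodeCount) {
  var parts = ['<?xml version="1.0"?>', "<ROOT>"];
  for (var i = 0; i < nodeCount; i++) {
    parts.push('<NODE id="' + i + '">value ' + i + "</NODE>");
  }
  parts.push("</ROOT>");
  return parts.join("");
}

// Time one call to `parseFn` (hypothetical: any function that accepts an XML string).
function timeParse(parseFn, xml) {
  var start = new Date().getTime();
  parseFn(xml);
  return new Date().getTime() - start; // elapsed milliseconds
}
```

For example, timeParse(myParse, buildTestXml(1500)) run on the slowest target workstation gives a rough load time for a 1500-node document, where myParse is your own wrapper around the parser.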

How do I get the XML to the browser?

There are many ways to get the XML from the server side to the client. However, depending on your targeted browser set, some ways may be better than others. In many cases, it is best to place your XML text in a <textarea> and use absolute positioning to move that <textarea> off the screen. Modern browsers make this option very easy by using cascading style sheets. Older browsers may require some tweaking.

Another option is to put your XML in a hidden <input> element.

Lastly, developers may load XML directly via JavaScript by using XML for <SCRIPT>'s server-side proxies. Please see the documentation on the XML Proxies for more information on this option.
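For the first two options, the script side is small. The following is a minimal sketch, assuming the XML sits in a form control whose id you choose (for example "xmlSource", a made-up id); readEmbeddedXml and moveOffScreen are hypothetical helper names, and the off-screen offset is an arbitrary value.

```javascript
// Read XML text from a form control (<textarea> or hidden <input>) by id.
// `doc` is the HTML document object; `elementId` is whatever id your page uses.
function readEmbeddedXml(doc, elementId) {
  var el = doc.getElementById(elementId);
  if (!el) {
    throw new Error("No element with id '" + elementId + "'");
  }
  // For form controls, the browser has already decoded HTML entities in .value.
  return el.value;
}

// Position an element off the screen without using "display: none".
function moveOffScreen(el) {
  el.style.position = "absolute";
  el.style.left = "-9999px"; // arbitrary large negative offset (assumption)
  el.style.top = "0";
}
```

In a page, readEmbeddedXml(document, "xmlSource") would return the XML string, ready to be handed to whichever parser you use.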

In any case, please be sure to escape your XML when you place it in your HTML page. While some browsers may let you do something like this:

<textarea>
<?xml version="1.0"?>
<ROOTNODE>
<NODE>
value
</NODE>
</ROOTNODE>
</textarea>

the above code is invalid HTML and can cause serious problems in some browsers, notably Mozilla.

In many cases, these browsers will change the case of the tags (or add new ones), which will break the parser. Oddly enough, not all tags get modified. An example of one that does is <TITLE>. With unescaped XML in a textarea element, Mozilla will change the case of the tag, leading the DOM parser to report an error of "expected /TITLE found closing /title".

The XML code sent to the browser really should be escaped to HTML, as demonstrated below:

<textarea>
&lt;?xml version="1.0"?&gt;
&lt;ROOTNODE&gt;
&lt;NODE&gt;
value
&lt;/NODE&gt;
&lt;/ROOTNODE&gt;
</textarea>

XML for <SCRIPT> also includes a tool that will help you automate the changing of your XML into valid HTML. Rather than convert the XML to escaped HTML (which can be hard to read and bloated) it converts the <, >, and & characters into high ASCII characters that are unlikely to be in your XML stream. This tool can be found under the "Sample Code (Tools)" link on the left.
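If you generate your pages with a script, plain entity escaping can be sketched in a few lines. This is not the bundled tool described above (which substitutes high ASCII characters); it is a minimal illustration of escaping to HTML entities, and the function name escapeXmlForHtml is made up for this example.

```javascript
// Escape an XML string so it can be placed inside a <textarea> or an
// attribute-free HTML context without the browser treating it as markup.
function escapeXmlForHtml(xml) {
  return xml
    .replace(/&/g, "&amp;") // must run first so the entities added below are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}
```

Because the ampersand is replaced first, entities already present in the XML (such as &amp;) are escaped once more, so the browser hands the original XML text back unchanged when the value is read.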

What license is XML for <SCRIPT> released under?

XML for <SCRIPT> is released under the GNU Library (Lesser) General Public License. For more information, please see the COPYING file included in the distribution.

Help! Netscape 6 doesn't send my XML to the server!

Netscape 6 (and all Mozilla derivatives up to Mozilla .97) has a bug that causes it not to send form data back to the server if the form element has its CSS property "display" set to "none". In this case, it is necessary to set the "display" property to something else (e.g. "block") and then submit the form.

Instead of using "display: none" consider absolute positioning. It is cross-platform with the modern browsers and avoids this problem.

The status of this bug can be found here.
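If switching to absolute positioning is not practical, the workaround described above (changing "display" before submitting) might look like the following sketch. The function name submitWithVisibleFields is hypothetical, and the check only sees "display" values set inline or from script, not values that come from a style sheet rule.

```javascript
// Make any form controls hidden with display:none visible, then submit the form.
// `form` is an HTML form element (for example, document.forms[0]).
function submitWithVisibleFields(form) {
  for (var i = 0; i < form.elements.length; i++) {
    var el = form.elements[i];
    // el.style reflects inline/script-set styles only (assumption noted above).
    if (el.style && el.style.display === "none") {
      el.style.display = "block";
    }
  }
  form.submit();
}
```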

Does XML for <SCRIPT> work in Konqueror/Safari?

Konqueror versions 2.2 and higher should work just fine with XML for <SCRIPT> Classic DOM versions 1.1 and higher. Konqueror version 3.0 or higher is required to run the W3C DOM and SAX parsers included in XML for <SCRIPT> version 2.0.

Konqueror versions prior to 2.2 seem to have problems with some of the recursive elements of XML for <SCRIPT> and some of the regular expressions used to escape illegal XML characters.

Konqueror versions prior to 3.0 had difficulties with the anonymous functions included in the SAX parser of XML for <SCRIPT> version 2.0 and the W3C DOM parser of XML for <SCRIPT> version 3.0.

Safari is based on the Konqueror (KHTML) 3.x rendering engine and has no known issues with XML for <SCRIPT>.

Does XML for <SCRIPT> work in Opera?

Yes! Opera 5, 6 and 7 fully pass the XML for <SCRIPT> Classic DOM test suites. Opera 6 and 7 pass the SAX test suites and Opera 7 passes the W3C DOM test suites. Other versions of Opera may work as well, but have not been fully tested. Please keep in mind that if you're targeting Opera versions prior to 7.0, some of the HTML DOM manipulation functions differ from those in IE, Konqueror or Mozilla. They are actually more like Netscape 4. For more information, please see the clearTestResults() and insertOptionElement() functions included with the test suites.

Is there any type of error handling?

Yes. XML for <SCRIPT> has a number of different ways to handle errors depending on which parser you use. Please see the sample applications and the test suites for the parser you are interested in for more information.

Are there any known bugs?

Yes. A listing of known bugs and the version of XML for <SCRIPT> they were fixed in (if appropriate) follows:

Classic DOM parser

If you place a > in the value of an attribute, XML for <SCRIPT> will appear to lock the browser. In fact, it is in a loop and will continue to call the error function until the browser is killed. NOTE: The Classic DOM parser does not accept unescaped > characters in attribute values. Please escape the > character as &gt;. This will resolve the issue.
If you forget to close your attribute value with a quote, the same behavior as the previous bug can be observed.
XML for <SCRIPT> also has a bug where the function insertNodeInto will not produce the expected results for CDATA type nodes if the data contained within the CDATA tag contains a ">" symbol.

SAX Parser

XML for <SCRIPT> 2.0 had a bug in its attribute handling as well. If an empty attribute (e.g. emptyAttribute="") followed an attribute that was not empty, the empty attribute would be assigned the value of the previous, non-empty attribute. This bug was fixed in XML for <SCRIPT> 2.1.

W3C DOM Parser

XML for <SCRIPT> 3.0 had a bug in the W3C Parser that would sometimes cause empty tags (e.g. <EMPTY_TAG />) to be ignored and their attributes unavailable. This bug was fixed in XML for <SCRIPT> 3.1.
XML for <SCRIPT> 3.0 had a bug that would cause DOMNode.importNode to sometimes fail when it should have succeeded. This bug was fixed in XML for <SCRIPT> 3.1.
XML for <SCRIPT> 3.0 had a bug where a node that had been removed from the tree did not have its previousSibling and nextSibling set to null. This bug was fixed in XML for <SCRIPT> 3.1.
XML for <SCRIPT> 3.0 had a bug where text nodes that included entity characters (like &lt;) in the XML source did not have the preserveWhiteSpace value respected properly when preserveWhiteSpace was false. A node that looked like this:

<NODE>text &lt; node</NODE>

in the XML source would be read into XML for <SCRIPT> like this (note the missing spaces between the words in the text node) when preserveWhiteSpace was set to false (the default):

<NODE>text&lt;node</NODE>

This bug was fixed in XML for <SCRIPT> 3.1.

Why does the Classic DOM parser always return a new XMLDoc object whenever it manipulates the tree?

Many of XML for <SCRIPT>'s classic DOM manipulation functions are rather inefficient. An attempt was made to make all of the DOM manipulation be performed at the node level. However, JavaScript itself got in the way and prevented the code from working properly. For example, a node would be manipulated in the parser correctly, but as soon as the function returned into the application code, the node would no longer be updated correctly in either the local node object or the local XMLDoc object. It's unknown at this time whether this was a bug in the parser or a limitation of JavaScript.

Not all DOM manipulation functions had this problem. However, a consistent API was deemed more important than the performance benefits that would have been achieved with node-level manipulation. In addition, the performance of the parser is rather good even with the design employed. XML for <SCRIPT>'s DOM manipulation functions have been used on large projects with large XML streams with adequate results.

NOTE: XML for <SCRIPT>'s W3C DOM Parser is much more efficient than the Classic DOM parser and resolves this issue.

The .js files are huge! Are there smaller ones available?

Yes! All three parsers have been "crunched" using the Creativyst® JavaScript Compressor at creativyst.com.

These "crunched" files are available in the jsXMLParser/compressed directory and have the word "tiny" in front of their name.