XML MATTERS #41: Beyond the DOM
Tips and tricks for a friendlier DOM

Dethe Elza
Senior Technical Architect, Blast Radius
March 2005

    The Document Object Model (DOM) is one of the most widely
    implemented tools for manipulating XML and HTML data, but it is
    rarely used to its full potential. By leveraging the DOM and
    extending it to be even easier to use, we gain a powerful tool for
    XML applications, including dynamic web applications.

    This installment introduces a guest columnist, my friend and
    colleague Dethe Elza. Dethe has extensive experience developing web
    applications that use XML, and I appreciate his help in covering XML
    programming with DOM and ECMAScript. Keep an eye on this column for
    future guest installments by Dethe (David Mertz).

INTRODUCTION
------------------------------------------------------------------------

The Document Object Model is one of the standard APIs for working with XML and HTML. It is often criticized for using too much memory, being too slow, and/or being too verbose. For many applications, however, it is the right way to go, and it is certainly much simpler than SAX, the other major API for XML. The DOM is increasingly exposed in tools: Web browsers, SVG browsers, OpenOffice, and others.

The DOM is valuable because it is standard, widely implemented, and built into other standards. As a standard, it works the same way regardless of the programming language you use (this can be good or bad, but at least it is consistent). The DOM is built into more than web browsers now, and is a part of many XML-based specifications. Since it is already a part of your tools, and you're using it right now, maybe it's time to get comfortable with the DOM.

After using the DOM for a while, some patterns emerge: there are things you want to be able to do over and over. There are shortcuts that help work around the verbosity of the DOM, yielding code that is self-explanatory and elegant.
Here is a collection of some of my most-used tips and tricks, with examples in JavaScript.

TIPS AND TRICKS
------------------------------------------------------------------------

For the first trick, there is no trick. The DOM has two methods to add child nodes to a container node (usually an 'Element', but it could be a 'Document' or 'DocumentFragment'): 'appendChild(node)' and 'insertBefore(node, referenceNode)'. But something seems to be missing. What if I want to insert after a reference node, or prepend a child node (make the new node first in the list)? For years I wrote utility functions like these:

    #-------------- Wrong way to insert and prepend -----------------#
    function insertAfter(parent, node, referenceNode) {
        if (referenceNode.nextSibling) {
            parent.insertBefore(node, referenceNode.nextSibling);
        } else {
            parent.appendChild(node);
        }
    }
    function prependChild(parent, node) {
        if (parent.firstChild) {
            parent.insertBefore(node, parent.firstChild);
        } else {
            parent.appendChild(node);
        }
    }

As it turns out, the 'insertBefore()' function is already defined to fall back to 'appendChild()' if the reference node is null, and both 'nextSibling' and 'firstChild' are null when there is no such node. So instead of using the above, you can either use these one-liners, or skip them altogether and just use the built-in functions.

    #------------- Right way to insert and prepend -----------------#
    function insertAfter(parent, node, referenceNode) {
        // nextSibling is null for the last child, so this appends
        parent.insertBefore(node, referenceNode.nextSibling);
    }
    function prependChild(parent, node) {
        // firstChild is null for an empty parent, so this appends
        parent.insertBefore(node, parent.firstChild);
    }

If you are new to DOM programming, it is worth pointing out that while you can have several pointers to a node in your programming language of choice, the node can only be in the DOM tree in one place. So if you are inserting it into the tree, you don't have to remove it from the tree first; that happens automatically. This is handy when you want to re-order nodes: you can just insert them into their new positions.
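The automatic move described above can be sketched as a small re-ordering helper. This is only an illustration, not code from the original column: the name 'sortChildren' and its 'compare' parameter are hypothetical, and the sketch assumes nothing beyond the standard 'childNodes' and 'appendChild()' members.

```javascript
// Hypothetical helper (not part of the DOM or of the original article):
// re-order the children of `parent` according to `compare`.
function sortChildren(parent, compare) {
    var children = [];
    var i;
    // Copy the child list into a plain array first, because the
    // childNodes list changes while nodes are being moved.
    for (i = 0; i < parent.childNodes.length; i++) {
        children.push(parent.childNodes[i]);
    }
    children.sort(compare);
    // appendChild() removes each node from its old position before
    // adding it at the end, so no removeChild() call is needed.
    for (i = 0; i < children.length; i++) {
        parent.appendChild(children[i]);
    }
}
```

In a browser you might call it with a comparison function that looks at each child's 'nodeName' or text content. The design choice is simply to let the DOM do the removal: every child is re-appended in sorted order, and each one ends up in exactly one place.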
Given the above, if you have two adjacent nodes (call them 'node1' and 'node2', with 'node2' immediately following 'node1') and want to transpose them, you could use either of the following:

    #----------------------- Transpose Nodes ------------------------#
    node1.parentNode.insertBefore(node2, node1);
    // or
    node1.parentNode.insertBefore(node1.nextSibling, node1);

WHAT ELSE CAN YOU DO WITH THE DOM?
------------------------------------------------------------------------

There are as many uses of the DOM as there are web pages. If you visit the bookmarklet sites (see Resources), you can find several short scripts that make innovative use of the DOM to reformat pages, extract links, and hide images or Flash advertisements, among other things.

But first things first. Since Internet Explorer does not define the Node interface constants, which let us easily identify the type of a node, one of the first things to do in a DOM script for the web is to make sure we define them ourselves if they are missing.

    #--------------------- Ensure Node is Defined -------------------#
    if (!window['Node']) {
        window.Node = new Object();
        Node.ELEMENT_NODE = 1;
        Node.ATTRIBUTE_NODE = 2;
        Node.TEXT_NODE = 3;
        Node.CDATA_SECTION_NODE = 4;
        Node.ENTITY_REFERENCE_NODE = 5;
        Node.ENTITY_NODE = 6;
        Node.PROCESSING_INSTRUCTION_NODE = 7;
        Node.COMMENT_NODE = 8;
        Node.DOCUMENT_NODE = 9;
        Node.DOCUMENT_TYPE_NODE = 10;
        Node.DOCUMENT_FRAGMENT_NODE = 11;
        Node.NOTATION_NODE = 12;
    }

Here is an example that extracts the text of all the text nodes contained in a node:

    #------------------------- Inner Text ---------------------------#
    function innerText(node) {
        // is this a text or CDATA node?
        // (3 is Node.TEXT_NODE, 4 is Node.CDATA_SECTION_NODE)
        if (node.nodeType == 3 || node.nodeType == 4) {
            return node.data;
        }
        // otherwise, collect the text of every child recursively
        var i;
        var returnValue = [];
        for (i = 0; i < node.childNodes.length; i++) {
            returnValue.push(innerText(node.childNodes[i]));
        }
        return returnValue.join('');
    }

SHORTCUTS
------------------------------------------------------------------------

A common complaint about the DOM is that it is too verbose and requires too much typing to do simple things. For instance, if you want to create a '