Bookmark and Share

Introduction to the Document Object Model

Review the W3C DOM2 standard recommendation.

The Document Root

The document object serves as the root of this node tree. It too implements the Node interface. It will have child nodes but no parent node or sibling nodes, as it is the starting node. In addition to being a Node, it also implements the Document interface.

This interface provides methods for accessing and creating other nodes in the document tree. Some methods are:

Note that unlike other nodes, there is only one document object in a page. All of the above methods (except getElementsByTagName()) can only be used against the document object, i.e., using the syntax document.methodName().

The document object can also have several other properties related to older DOM level support. For example, you'll find that many browsers still have document.images and document.links arrays or document.bgColor and document.fgColor properties relating to the BGCOLOR and TEXT attributes of the BODY tag.

These properties are intended to provide some backward compatibility so that pages designed for older browsers will still work properly with newer versions. They can still be used in scripts but they may not be supported in future versions.

Traversing the Document Tree

As mentioned, the document tree reflects the structure of a page's HTML code. Every tag or tag pair is represented by a element node with other nodes representing attributes or character data (i.e., text).

Technically speaking, the document object has only one child element, given by document.documentElement. For web pages, this represents the outer HTML tag and it acts as the root element of the document tree. It will have child elements for HEAD and BODY tags which in turn will have other child elements.

Using this, and the methods of the Node interface, you can traverse the document tree to access individual node within the tree. Consider the following:

<body><p>This is a sample paragraph.</p></body>

and this code:


which would display "P", the name of the tag represented by that node. The code breaks down as follows:

There are some obvious problems with accessing nodes like this. For one, a simple change to the page source, like adding text or formatting elements or images, will change the tree structure. The path used before may no longer point to the intended node.

Less obvious are some browser compatibility issues. Note that in the sample HTML above, there is no spacing between the BODY tag and the P tag. If some simple line breaks are added,


<p>This is a sample paragraph.</p>

Try's DOM Viewer utility to interactively navigate a document tree.

Netscape will add a node for this data, while IE will not. So in Netscape, the JavaScript code shown above would display "undefined" as it now points to the text node for this white space. Since it's not an element node, it has no tag name. IE, on the other hand, does not add nodes for white space like this, so it would still point to the P tag.

Accessing Elements Directly

This is where the document.getElementById() method comes in handy. By adding an ID attribute to the paragraph tag (or any tag for that matter), you can reference it directly.

<p id="myParagraph">This is a sample paragraph.</p>



This way, you can avoid compatibility issues and update the page contents at will without worrying about where the node for the paragraph tag is in the document tree. Just remember that each ID needs to be unique to the page.

A less direct method to access element nodes is provided by document.getElementsByTagName(). This returns an array of nodes representing all of the elements on a page with the specified HTML tag. For example, you could change color of every link on a page with the following.

var nodeList = document.getElementsByTagName("A");
for (var i = 0; i < nodeList.length; i++)
  nodeList[i].style.color = "#ff0000";

Which simply updates each link's inline style to set the color parameter to red. Give it a try.