Chapter 7 - Text Nodes

7.1 Text object overview

Text in an HTML document is represented by instances of the Text() constructor function, which produces text nodes. When an HTML document is parsed, the text mixed in among the elements of an HTML page is converted to text nodes.

live code: http://jsfiddle.net/domenlightenment/kuz5Z

<!DOCTYPE html>
<html lang="en">
<body>

<p>hi</p>

<script>

//select "hi" text node
var textHi = document.querySelector("p").firstChild

console.log(textHi.constructor); //logs Text()

//logs Text {textContent="hi", length=2, wholeText="hi", ...}
console.log(textHi);

</script>
</body>
</html>

The code above concludes that the Text() constructor function constructs the text node, but keep in mind that Text inherits from CharacterData, Node, and Object.

7.2 Text object & properties

To get accurate information pertaining to the available properties and methods on a Text node, it's best to ignore the specification and ask the browser what is available. Examine the arrays created in the code below detailing the properties and methods available from a text node.

live code: http://jsfiddle.net/domenlightenment/Wj3uS

<!DOCTYPE html>
<html lang="en">
<body>

<p>hi</p>

<script>
var text = document.querySelector("p").firstChild;

//text own properties
console.log(Object.keys(text).sort());

//text own properties & inherited properties
var textPropertiesIncludeInherited = [];
for(var p in text){
	textPropertiesIncludeInherited.push(p);
}
console.log(textPropertiesIncludeInherited.sort());

//text inherited properties only
var textPropertiesOnlyInherited = [];
for(var p in text){
	if(!text.hasOwnProperty(p)){
		textPropertiesOnlyInherited.push(p);
	}
}
console.log(textPropertiesOnlyInherited.sort());

</script>
</body>
</html>

The available properties are many, even if the inherited properties are not considered. Below I've hand-picked a list of noteworthy properties and methods for the context of this chapter.

  • textContent
  • splitText()
  • appendData()
  • deleteData()
  • insertData()
  • replaceData()
  • substringData()
  • normalize()
  • data
  • document.createTextNode() (not a property or inherited property of text nodes but discussed in this chapter)

7.3 White space creates Text nodes

When a DOM is constructed, either by the browser or by programmatic means, text nodes are created from white space as well as from text characters. After all, white space is a character. In the code below the second paragraph, containing an empty space, has a child Text node while the first paragraph does not.

live code: http://jsfiddle.net/domenlightenment/YbtnZ

<!DOCTYPE html>
<html lang="en">
<body>

<p id="p1"></p>
<p id="p2"> </p>

<script>

console.log(document.querySelector("#p1").firstChild) //logs null
console.log(document.querySelector("#p2").firstChild.nodeName) //logs #text

</script>
</body>
</html>

Don't forget that white space and text characters in the DOM are typically represented by a text node. This of course means that carriage returns are represented by text nodes. In the code below we log a carriage return, highlighting the fact that this type of character is in fact a text node.

live code: http://jsfiddle.net/domenlightenment/9FEzq

<!DOCTYPE html>
<html lang="en">
<body>

<p id="p1"></p>
<!-- yes, there is a carriage return text node before this comment, and even this comment is a node -->
<p id="p2"></p>

<script>

console.log(document.querySelector("#p1").nextSibling) //logs Text

</script>
</body>
</html>

The reality is that if you can input the character or white space into an HTML document using a keyboard, then it can potentially be interpreted as a text node. If you think about it, unless you minify/compress the HTML document, the average HTML page contains a great deal of white space and carriage return text nodes.
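
Because these white space text nodes show up in childNodes, code that walks the DOM often wants to skip them. The following is a minimal sketch of that idea, not a definitive implementation. The helper names isWhitespaceText() and meaningfulChildren() are hypothetical, and plain objects stand in for real DOM nodes so the logic can be run outside of a browser.

```javascript
// Node.TEXT_NODE is 3 in the DOM; hard-coded here so the sketch runs without a browser.
var TEXT_NODE = 3;

// Hypothetical helper: true when a node is a text node holding only white space.
function isWhitespaceText(node) {
  return node.nodeType === TEXT_NODE && !/\S/.test(node.data);
}

// Hypothetical helper: returns the child nodes that are not white-space-only text nodes.
function meaningfulChildren(parent) {
  return Array.prototype.filter.call(parent.childNodes, function (child) {
    return !isWhitespaceText(child);
  });
}

// Assumed stand-in for a <body> holding two <p> elements separated by line breaks.
var fakeBody = {
  childNodes: [
    { nodeType: 3, data: "\n" },
    { nodeType: 1, nodeName: "P" },
    { nodeType: 3, data: "\n" },
    { nodeType: 1, nodeName: "P" },
    { nodeType: 3, data: "\n" }
  ]
};

console.log(fakeBody.childNodes.length);          //logs 5
console.log(meaningfulChildren(fakeBody).length); //logs 2
```

In a browser the same helpers should work when handed a real element, since childNodes is array-like and text nodes expose nodeType and data.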

7.4 Creating & Injecting Text Nodes

Text nodes are created automatically for us when a browser interprets an HTML document and a corresponding DOM is built based on the contents of the document. After this fact, it's also possible to programmatically create Text nodes using createTextNode(). In the code below I create a text node and then inject that node into the live DOM tree.

live code: http://jsfiddle.net/domenlightenment/xC9q3

<!DOCTYPE html>
<html lang="en">
<body>

<div></div>

<script>

var textNode = document.createTextNode("Hi");
document.querySelector("div").appendChild(textNode);

console.log(document.querySelector("div").innerText); // logs Hi

</script>
</body>
</html>

Keep in mind that we can also inject text nodes into programmatically created DOM structures. In the code below I place a text node inside of a <p> element before I inject it into the live DOM.

live code: http://jsfiddle.net/domenlightenment/PdatJ

<!DOCTYPE html>
<html lang="en">

<body>

<div></div>

<script>

var elementNode = document.createElement("p");
var textNode = document.createTextNode("Hi");
elementNode.appendChild(textNode);
document.querySelector("div").appendChild(elementNode);

console.log(document.querySelector("div").innerHTML); //logs <p>Hi</p>

</script>
</body>
</html>

7.5 Getting a Text node value with .data or nodeValue

The text value/data represented by a Text node can be extracted from the node by using the .data or nodeValue property. Both of these return the text contained in a Text node. Below I demonstrate both of these to retrieve the value of the first Text node contained in the <p>.

live code: http://jsfiddle.net/domenlightenment/dPLkx

<!DOCTYPE html>
<html lang="en">

<body>

<p>Hi, <strong>cody</strong></p>

<script>

console.log(document.querySelector("p").firstChild.data); //logs "Hi, " (with a trailing space)
console.log(document.querySelector("p").firstChild.nodeValue); //logs "Hi, " (with a trailing space)

</script>
</body>
</html>

Notice that the <p> contains both a Text node and an Element node (i.e. <strong>), and that we are only getting the value of the first child node contained in the <p>.

Notes

Getting the length of the characters contained in a text node is as simple as accessing the length property of the node itself or of the actual text value/data of the node (i.e. document.querySelector("p").firstChild.length or document.querySelector("p").firstChild.data.length or document.querySelector("p").firstChild.nodeValue.length).

7.6 Manipulating Text nodes with appendData(), deleteData(), insertData(), replaceData(), substringData()

The CharacterData object that Text nodes inherit methods from provides the following methods for manipulating and extracting sub values from Text node values.

  • appendData()
  • deleteData()
  • insertData()
  • replaceData()
  • substringData()

Each of these is leveraged in the code example below.

live code: http://jsfiddle.net/domenlightenment/B6AC6

<!DOCTYPE html>
<html lang="en">

<body>

<p>Go big Blue Blue</p>

<script>

var pElementText = document.querySelector("p").firstChild;

//add !
pElementText.appendData("!");
console.log(pElementText.data); //logs "Go big Blue Blue!"

//remove first "Blue"
pElementText.deleteData(7,5);
console.log(pElementText.data); //logs "Go big Blue!"

//insert it back "Blue"
pElementText.insertData(7,"Blue ");
console.log(pElementText.data); //logs "Go big Blue Blue!"

//replace first "Blue" with "Bunny"
pElementText.replaceData(7,5,"Bunny ");
console.log(pElementText.data); //logs "Go big Bunny Blue!"

//extract substring "Bunny Blue"
console.log(pElementText.substringData(7,10)); //logs "Bunny Blue"

</script>
</body>
</html>

Notes

These same manipulation and sub extraction methods can be leveraged by Comment nodes.
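
The offset and count arguments taken by these methods can be easier to reason about when modeled on a plain string. The sketch below is an illustration only. The functions are hypothetical stand-ins that return new strings, whereas the real CharacterData methods change the node's data in place (substringData() being the one that returns a string). Handling of out-of-range offsets, which the real methods throw on, is deliberately left out.

```javascript
// Hypothetical string-based stand-ins for the CharacterData methods.
// Each one returns a new string instead of mutating a node.
function appendData(data, text) {
  return data + text;
}

function deleteData(data, offset, count) {
  return data.slice(0, offset) + data.slice(offset + count);
}

function insertData(data, offset, text) {
  return data.slice(0, offset) + text + data.slice(offset);
}

function replaceData(data, offset, count, text) {
  return data.slice(0, offset) + text + data.slice(offset + count);
}

function substringData(data, offset, count) {
  return data.slice(offset, offset + count);
}

var data = "Go big Blue Blue";
data = appendData(data, "!");             // "Go big Blue Blue!"
data = deleteData(data, 7, 5);            // "Go big Blue!"
data = insertData(data, 7, "Blue ");      // "Go big Blue Blue!"
data = replaceData(data, 7, 5, "Bunny "); // "Go big Bunny Blue!"

console.log(data);                        //logs "Go big Bunny Blue!"
console.log(substringData(data, 7, 10));  //logs "Bunny Blue"
```

Note that offsets are zero-based and count is a number of characters, so deleteData(7,5) removes the five characters "Blue " starting at index 7.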

7.7 When multiple sibling Text nodes occur

Typically, immediate sibling Text nodes do not occur because DOM trees created by browsers intelligently combine text nodes. However, two cases exist that make sibling Text nodes possible. The first case is rather obvious. If the text inside an element is interrupted by an Element node (e.g. <p>Hi, <strong>cody</strong> welcome!</p>), then the text is split into the proper node groupings. It's best to look at a code example, as this might sound more complicated than it really is. In the code below the content of the <p> element is not a single Text node; it is in fact three nodes: a Text node, an Element node, and another Text node.

live code: http://jsfiddle.net/domenlightenment/2ZCn3

<!DOCTYPE html>
<html lang="en">
<body>

<p>Hi, <strong>cody</strong> welcome!</p>

<script>

var pElement = document.querySelector("p");

console.log(pElement.childNodes.length); //logs 3

console.log(pElement.firstChild.data); //text node, logs "Hi, "
console.log(pElement.firstChild.nextSibling); //Element node, logs <strong>
console.log(pElement.lastChild.data); //text node, logs " welcome!"

</script>
</body>
</html>

The next case occurs when we programmatically add Text nodes to an element we created in our code. In the code below I create a <p> element and then append two Text nodes to this element, which results in sibling Text nodes.

live code: http://jsfiddle.net/domenlightenment/jk3Jn

<!DOCTYPE html>
<html lang="en">
<body>

<div></div>

<script>

var pElementNode = document.createElement("p");
var textNodeHi = document.createTextNode("Hi ");
var textNodeCody = document.createTextNode("Cody");

pElementNode.appendChild(textNodeHi);
pElementNode.appendChild(textNodeCody);

document.querySelector("div").appendChild(pElementNode);

console.log(document.querySelector("div p").childNodes.length); //logs 2

</script>
</body>
</html>

7.8 Remove markup and return all child Text nodes using textContent

The textContent property can be used to get the text of all child text nodes, as well as to set the contents of a node to a specific Text node. When it's used on a node to get the textual content of the node, it will return a concatenated string of all text nodes contained within the node you access the property on. This functionality makes it very easy to extract all text from an HTML document. Below I extract all of the text contained within the <body> element. Notice that textContent gathers not just immediate child text nodes but all descendant text nodes, no matter the depth of encapsulation inside of the node the property is accessed on.

live code: N/A

<!DOCTYPE html>
<html lang="en">
<body>
<h1> Dude</h1>
<p>you <strong>rock!</strong></p>
<script>

console.log(document.body.textContent); //logs "Dude you rock!" with some added white space

</script>
</body>
</html>
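
To make the "no matter the depth" behavior concrete, here is a rough sketch of how such a getter could gather text by walking a tree recursively. This models the idea only and is not the browser's actual implementation. collectText() is a hypothetical helper, and the plain objects are assumed stand-ins for DOM nodes so the sketch runs outside of a browser.

```javascript
// nodeType values used by the DOM: 1 is an Element node, 3 is a Text node.
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// Hypothetical helper: concatenates the data of every descendant text node, in document order.
function collectText(node) {
  if (node.nodeType === TEXT_NODE) {
    return node.data;
  }
  var text = "";
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    // Only elements and text nodes contribute; other node types (e.g. comments) are skipped.
    if (children[i].nodeType === ELEMENT_NODE || children[i].nodeType === TEXT_NODE) {
      text += collectText(children[i]);
    }
  }
  return text;
}

// Assumed stand-in for <p>you <strong>rock!</strong></p>
var fakeP = {
  nodeType: 1,
  childNodes: [
    { nodeType: 3, data: "you " },
    { nodeType: 1, childNodes: [{ nodeType: 3, data: "rock!" }] }
  ]
};

console.log(collectText(fakeP)); //logs "you rock!"
```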

When textContent is used to set the text contained within a node, it will remove all child nodes first, replacing them with a single Text node. In the code below I replace all the nodes inside of the <div> element with a single Text node.

live code: http://jsfiddle.net/domenlightenment/m766T

<!DOCTYPE html>
<html lang="en">
<body>
<div>
<h1> Dude</h1>
<p>you <strong>rock!</strong></p>
</div>
<script>

document.querySelector("div").textContent = "You don't rock!";
console.log(document.querySelector("div").textContent); //logs "You don't rock!"

</script>
</body>
</html>

Notes

textContent returns null if used on a document or doctype node.

textContent returns the contents from <script> and <style> elements.

7.9 The difference between textContent & innerText

Most modern browsers support a seemingly similar property to textContent named innerText (Firefox lacked it until version 45). However, these properties are not the same. You should be aware of the following differences between textContent & innerText.

  • innerText is aware of CSS. So if you have hidden text, innerText ignores this text, whereas textContent will not
  • Because innerText cares about CSS it will trigger a reflow, whereas textContent will not
  • innerText ignores the Text nodes contained in <script> and <style> elements
  • innerText, unlike textContent, will normalize the text that is returned. Just think of textContent as returning exactly what is in the document with the markup removed. This will include white space, line breaks, and carriage returns
  • innerText began as a non-standard, browser-specific property (it has since been specified in the HTML standard), while textContent is implemented from the DOM specifications

If you intend to use innerText and need to support older versions of Firefox, you'll have to create a workaround.

7.10 Combine sibling Text nodes into one text node using normalize()

Sibling Text nodes are typically only encountered when text is programmatically added to the DOM. To eliminate adjacent sibling Text nodes (those with no Element node between them) we can use normalize(). This will concatenate sibling text nodes in the DOM into a single Text node. In the code below I create sibling text nodes, append them to the DOM, then normalize them.

live code: http://jsfiddle.net/domenlightenment/LG9WR

<!DOCTYPE html>
<html lang="en">
<body>
<div></div>
<script>

var pElementNode = document.createElement("p");
var textNodeHi = document.createTextNode("Hi");
var textNodeCody = document.createTextNode("Cody");

pElementNode.appendChild(textNodeHi);
pElementNode.appendChild(textNodeCody);

document.querySelector("div").appendChild(pElementNode);

console.log(document.querySelector("p").childNodes.length); //logs 2

document.querySelector("div").normalize(); //combine our sibling text nodes

console.log(document.querySelector("p").childNodes.length); //logs 1

</script>
</body>
</html>
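
The merging step can be sketched on a plain list of nodes. This is a simplified model under stated assumptions, not how a browser implements normalize(). mergeAdjacentText() is a hypothetical helper, it handles only one level of children (the real method works through the whole subtree), and plain objects stand in for DOM nodes.

```javascript
// nodeType 3 identifies a Text node in the DOM.
var TEXT_NODE = 3;

// Hypothetical helper: returns a new child list in which runs of adjacent
// text nodes are merged into one and empty text nodes are dropped.
function mergeAdjacentText(childNodes) {
  var result = [];
  for (var i = 0; i < childNodes.length; i++) {
    var node = childNodes[i];
    var last = result[result.length - 1];
    if (node.nodeType !== TEXT_NODE) {
      result.push(node); // non-text nodes are kept as they are
    } else if (node.data === "") {
      continue; // empty text nodes are removed
    } else if (last && last.nodeType === TEXT_NODE) {
      last.data += node.data; // fold into the preceding text node
    } else {
      result.push({ nodeType: TEXT_NODE, data: node.data }); // copy, so the input is not mutated
    }
  }
  return result;
}

var merged = mergeAdjacentText([
  { nodeType: 3, data: "Hi" },
  { nodeType: 3, data: "Cody" }
]);

console.log(merged.length);  //logs 1
console.log(merged[0].data); //logs "HiCody"
```

Text nodes separated by an element are left alone, because only immediately adjacent text nodes are merged.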

7.11 Splitting a text node using splitText()

When splitText() is called on a Text node it alters the text node it's being called on (leaving the text up to the offset) and returns a new Text node that contains the text split off from the original text based on the offset. If the original node has a parent, the new node is also inserted into the DOM as its next sibling. In the code below the text node Hey Yo! is split at offset 4, so "Hey " (including the trailing space) is left in the original node while Yo! is turned into a new text node and returned by the splitText() method.

live code: http://jsfiddle.net/domenlightenment/Tz5ce

<!DOCTYPE html>
<html lang="en">
<body>

<p>Hey Yo!</p>

<script>

//returns the new text node, which holds the text split off from the original
console.log(document.querySelector("p").firstChild.splitText(4).data); //logs Yo!

//What remains in the original text node...
console.log(document.querySelector("p").firstChild.textContent); //logs "Hey " (with a trailing space)

</script>
</body>
</html>
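
The effect of the offset can be modeled on a plain string. This is only a sketch of the split itself. splitAt() is a hypothetical helper that returns both halves, whereas the real splitText() shortens the original node and returns the new node.

```javascript
// Hypothetical string-based stand-in for splitText(offset).
function splitAt(data, offset) {
  return {
    remaining: data.slice(0, offset), // what stays in the original text node
    split: data.slice(offset)         // what ends up in the new text node
  };
}

var parts = splitAt("Hey Yo!", 4);

// JSON.stringify is used so the trailing space is visible in the log.
console.log(JSON.stringify(parts.remaining)); //logs "Hey "
console.log(JSON.stringify(parts.split));     //logs "Yo!"
```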