Understanding createTreeWalker method in the context of replacing strings (or parts of them)
I want to make sure I understand the following code (credit to user m3g4p0p on SitePoint):
const walker = document.createTreeWalker(
document.body,
NodeFilter.SHOW_TEXT
)
let node;
while ((node = walker.nextNode())) {
node.textContent = node.textContent.replace('a', 'b')
}
I have two questions:
- How do you understand each stage of the code?
- How does storing replaced strings in the node variable make a change in the text appearing to the end user?
1 answer
How does storing replaced strings in the node variable make a change in the text appearing to the end user?
In your case, you're changing the textContent property. When accessed, it returns the text content of a node, concatenated with the text content of its descendants, and changing its value will change the element's contents.
For example, let's suppose I have this HTML:
<p id="content">
A man <span>walks <b>into a <a href="#">bar</a></b> and gets a beer</span>
</p>
This is displayed as:
A man walks into a bar and gets a beer
Note that inside the paragraph there are other tags (span, b and a). But if I get the paragraph's textContent, only the text is returned:
const p = document.querySelector('#content');
console.log(p.textContent); // A man walks into a bar and gets a beer
That's because textContent is a string that contains only the text content of the paragraph and its descendants. It doesn't return any child tags that the element might have, only their text contents.
Since it's a string, you can manipulate it as you would any other string. So calling replace returns another string with the result:
console.log(p.textContent.replace('a', 'b'));
This prints A mbn walks into a bar and gets a beer (because replace('a', 'b') only replaces the first occurrence of "a" with "b").
But note that it doesn't change the paragraph, because replace returns another string, leaving the original untouched. Only when you assign this new string to textContent is the DOM updated:
// changes DOM, page is updated
p.textContent = p.textContent.replace('a', 'b');
When you set textContent, the paragraph's contents are changed to the modified text. One important detail is that all the paragraph's descendant nodes (span, b and a) are removed, and now the paragraph contains only the text returned by replace:
A mbn walks into a bar and gets a beer
In short, when you set an element's textContent to some text, the element will be changed to contain only that text, and all the child tags that the element had will be removed.
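To see this behaviour without a browser, here's a minimal sketch that models it with plain objects. It is not the real DOM: the helpers text, el, getTextContent and setTextContent are names I made up for illustration, and the whitespace between tags is left out for brevity.

```javascript
// Simplified stand-in for DOM nodes (NOT the real DOM API):
// a text node is { text }, an element is { tag, children }.
const text = (value) => ({ text: value });
const el = (tag, ...children) => ({ tag, children });

// Models reading textContent: concatenate the text of all descendants.
function getTextContent(node) {
  if ('text' in node) return node.text;
  return node.children.map(getTextContent).join('');
}

// Models assigning textContent on an element: every child is discarded
// and replaced by a single text node.
function setTextContent(element, value) {
  element.children = [text(value)];
}

// Same structure as the paragraph above (attributes omitted).
const p = el('p',
  text('A man '),
  el('span',
    text('walks '),
    el('b', text('into a '), el('a', text('bar'))),
    text(' and gets a beer')));

console.log(getTextContent(p)); // A man walks into a bar and gets a beer

setTextContent(p, getTextContent(p).replace('a', 'b'));
console.log(getTextContent(p)); // A mbn walks into a bar and gets a beer
console.log(p.children.length); // 1 (the span, b and a are gone)
```

The last line is the point: after the assignment, the modelled paragraph has a single text child and nothing else.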
Regarding createTreeWalker, we can check in the docs that it creates a TreeWalker, which is an object that represents a document subtree (the document is a tree that contains all the page's nodes; think of TreeWalker as a subset of it: a subtree that contains only some of those nodes).
The first argument is the starting point, the node from which the search begins: in your case you used document.body, so it'll return all nodes inside document.body that satisfy the criteria.
The criteria are determined by the second argument. In your case, you used NodeFilter.SHOW_TEXT, which tells the function to return only text nodes. Therefore, the result is a TreeWalker that contains only the text nodes inside document.body (which is basically "all text nodes of the document").
To understand what text nodes are, let's consider the same HTML:
<p id="content">
A man <span>walks <b>into a <a href="#">bar</a></b> and gets a beer</span>
</p>
If I get all the text nodes from this paragraph:
const p = document.querySelector('#content');
const walker = document.createTreeWalker(
p, // ** Getting only for p, instead of whole document.body **
NodeFilter.SHOW_TEXT
);
let node;
while ((node = walker.nextNode())) {
console.log(`node=${node.textContent}`);
}
It'll print:
node=
A man
node=walks
node=into a
node=bar
node= and gets a beer
node=
Note that each "independent" chunk of text is a separate text node (including line breaks between tags). "A man" is the text between the opening tag <p> and the opening tag <span>, "walks" is the text between <span> and <b>, and so on.
One important detail is that text nodes don't have child nodes, so changing their textContent doesn't cause the problem we saw above (it won't remove any child nodes, because there aren't any).
Hence, if you run your code:
while ((node = walker.nextNode())) {
node.textContent = node.textContent.replace('a', 'b');
}
It will perform the replace in all text nodes and preserve the paragraph's child nodes. The result will be:
A mbn wblks into b bbr bnd gets a beer
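If you'd like to check this outside a browser, the walk can be approximated with plain objects. This is only a sketch of the idea, not how the browser implements TreeWalker: text, el, textNodes and serialize are hypothetical helpers, and attributes and the whitespace between tags are left out.

```javascript
// Simplified stand-in for DOM nodes (NOT the real DOM API).
const text = (value) => ({ text: value });
const el = (tag, ...children) => ({ tag, children });

// Yields text nodes depth-first, in document order: roughly the sequence
// that walker.nextNode() produces when the filter is NodeFilter.SHOW_TEXT.
function* textNodes(node) {
  if ('text' in node) {
    yield node;
  } else {
    for (const child of node.children) yield* textNodes(child);
  }
}

// Rebuilds the markup so the surviving structure is visible.
function serialize(node) {
  if ('text' in node) return node.text;
  return `<${node.tag}>${node.children.map(serialize).join('')}</${node.tag}>`;
}

const paragraph = el('p',
  text('A man '),
  el('span',
    text('walks '),
    el('b', text('into a '), el('a', text('bar'))),
    text(' and gets a beer')));

for (const node of textNodes(paragraph)) {
  // Only the first "a" of each text node is replaced.
  node.text = node.text.replace('a', 'b');
}

console.log(serialize(paragraph));
// <p>A mbn <span>wblks <b>into b <a>bbr</a></b> bnd gets a beer</span></p>
```

Because only the text of each leaf changes, the span, b and a elements are still there in the serialized output.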
So your code is basically getting all text nodes in the document, and for each of those nodes, it's replacing the first "a" with "b".
As a final note, replace('a', 'b') changes only the first occurrence of "a". If you want to change all occurrences, you could use replaceAll('a', 'b') or replace(/a/g, 'b').
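A quick comparison of the three calls on a plain string (the variable name sentence is just for this example; replaceAll needs an ES2021-capable runtime):

```javascript
const sentence = 'A man walks into a bar and gets a beer';

// String pattern: only the first match is replaced.
const first = sentence.replace('a', 'b');

// replaceAll, or a regular expression with the g flag: every match is replaced.
const all = sentence.replaceAll('a', 'b');
const allRegex = sentence.replace(/a/g, 'b');

console.log(first);    // A mbn walks into a bar and gets a beer
console.log(all);      // A mbn wblks into b bbr bnd gets b beer
console.log(allRegex); // A mbn wblks into b bbr bnd gets b beer

// The original string is never modified.
console.log(sentence); // A man walks into a bar and gets a beer
```

Note that the uppercase "A" stays as it is in all three results, since the match is case-sensitive.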