Free Perl Tutorials

HTML::TreeBuilder::Scanning Tutorial - look_down, as_text, as_HTML, parse_file, delete

This is an HTML::TreeBuilder::Scanning Tutorial, mainly for myself, because I am still learning too! The good thing is I'll use normal everyday language to describe what I am doing, being a NOOB myself!

First I am going to describe some of the methods we'll be using from the HTML::TreeBuilder module. If you don't know what objects are, read a brief tutorial on objects, classes and packages in Perl first.

To get back the information we are looking for, we call methods on an object (that's "object method calling"). Here is a description of a few of the methods we'll use:

parse_file
This method reads the HTML source from the file we point it at (a filename or an open filehandle) and builds a tree from it.

delete
This method deletes a tree, breaking the links between its elements so the memory is ready for GC (that's techie talk for "Garbage Collection", which is an automated process in Perl).
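To see parse_file and delete working together, here is a minimal sketch of the whole life of a tree. The temporary file and its contents are made up for the example (they are not part of the original tutorial), and the `tag` call is only there to show that the tree really got built.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use HTML::TreeBuilder;

# Write a tiny HTML page to a temporary file so the example is self-contained.
# (The file name and contents are invented for this illustration.)
my ($fh, $filename) = tempfile(SUFFIX => '.html', UNLINK => 1);
print {$fh} '<html><head><title>Test page</title></head>'
          . '<body><h1>Hello</h1><p>Some text.</p></body></html>';
close $fh or die "Cannot close $filename: $!";

# Build an empty tree, then fill it from the file.
my $tree = HTML::TreeBuilder->new;
$tree->parse_file($filename);

# The tree object is also the root element of the document.
my $root_tag = $tree->tag;
print "Root tag: $root_tag\n";

# Finished with the tree, so delete it to let Perl reclaim the memory.
$tree->delete;
```

Once delete has been called, the tree should not be used again, so copy anything you still need (like `$root_tag` here) into ordinary variables first.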

as_text
This method returns a string that contains all the text bits found inside a given section of the file we wish to look through (such as everything in between the body tags in HTML), with the tags themselves left out.

as_HTML
This method returns the HTML for a given element: its own tags plus all the HTML tags and text within them.
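The difference between as_text and as_HTML is easiest to see side by side. This is a small sketch, assuming we feed the parser a made-up HTML string with `parse` and `eof` instead of a file, and grab the body element with look_down (described further down).

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# A made-up page, just for illustration.
my $html = '<html><head><title>Demo</title></head>'
         . '<body><p>Hello <b>world</b></p></body></html>';

# Parse from a string: parse() feeds the text in, eof() says we are done.
my $tree = HTML::TreeBuilder->new;
$tree->parse($html);
$tree->eof;

# Pick the <body> element to work on.
my $body = $tree->look_down('_tag', 'body');

# as_text: only the text inside <body>, no tags.
my $text = $body->as_text;

# as_HTML: the tags and the text inside <body>, as HTML source.
my $markup = $body->as_HTML;

print "as_text: $text\n";
print "as_HTML: $markup\n";

$tree->delete;
```

The exact spacing and the optional closing tags in the as_HTML output can differ from what you typed in, because the tree is written back out from scratch rather than copied from the source.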

look_down
This method is probably the most important method we use when trying to find specific information in a file with HTML::TreeBuilder. It starts at a specific object in our tree, checks that object, and then looks down through everything beneath it for the criteria you provide. The criteria are set in the look_down argument list. The simplest kind of criterion is a set of two scalar values (scalars are simply single values, such as strings or numbers). This set of two scalar values consists of a key and a value. For instance, to look for a tag in HTML, your key would be "_tag" and your value would be "h1". In list context look_down returns every element that matches; in scalar context it returns only the first match.

Here is the simple usage:

$tree->look_down('_tag', 'h1');
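Here is a slightly fuller sketch of look_down, showing list context, scalar context, and two key/value pairs at once. The HTML string, the class names and the variable names are all made up for the example.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# A made-up page with two h1 headings, just for illustration.
my $html = '<html><body>'
         . '<h1 class="title">First heading</h1>'
         . '<p>Intro text</p>'
         . '<h1 class="other">Second heading</h1>'
         . '</body></html>';

my $tree = HTML::TreeBuilder->new;
$tree->parse($html);
$tree->eof;

# List context: every <h1> element at or below $tree.
my @headings      = $tree->look_down('_tag', 'h1');
my @heading_texts = map { $_->as_text } @headings;

# Scalar context: only the first match (undef if nothing matches).
my $first      = $tree->look_down('_tag', 'h1');
my $first_text = $first->as_text;

# Two key/value pairs: one element has to match both of them.
my $other      = $tree->look_down('_tag', 'h1', 'class', 'other');
my $other_text = $other->as_text;

print "Found ", scalar(@heading_texts), " h1 elements\n";
print "First: $first_text\n";
print "Class 'other': $other_text\n";

$tree->delete;
```

Keys that start with an underscore, like "_tag", are the element's internal attributes, while plain keys like "class" are matched against the attributes written in the HTML tag itself.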