Simple html dom find all

PHP Simple HTML DOM Parser Manual

// Find all article blocks
foreach ($html->find('div.article') as $article) {
    $item['title']   = $article->find('div.title', 0)->plaintext;
    $item['intro']   = $article->find('div.intro', 0)->plaintext;
    $item['details'] = $article->find('div.details', 0)->plaintext;
    $articles[] = $item;
}

How to create HTML DOM object?

// Create a DOM object from a string
$html = str_get_html('<html><body>Hello!</body></html>');

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Create a DOM object from an HTML file
$html = file_get_html('test.htm');

// Create an empty DOM object
$html = new simple_html_dom();

// Load HTML from a string
$html->load('<html><body>Hello!</body></html>');

// Load HTML from a URL
$html->load_file('http://www.google.com/');

// Load HTML from an HTML file
$html->load_file('test.htm');

How to find HTML elements?

// Find all anchors, returns an array of element objects
$ret = $html->find('a');

// Find the (N)th anchor (zero based), returns an element object or null if not found
$ret = $html->find('a', 0);

// Find the last anchor, returns an element object or null if not found
$ret = $html->find('a', -1);

// Find all <div> elements that have an id attribute
$ret = $html->find('div[id]');

// Find all <div> elements whose id attribute equals foo
$ret = $html->find('div[id=foo]');

// Find all elements with id=foo
$ret = $html->find('#foo');

// Find all elements with class=foo
$ret = $html->find('.foo');

// Find all elements that have an id attribute
$ret = $html->find('*[id]');

// Find all anchors and images
$ret = $html->find('a, img');

// Find all anchors and images that have a "title" attribute
$ret = $html->find('a[title], img[title]');
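The index argument of find() shown above can be pictured with a small stand-alone sketch. The helper name pick_match() is hypothetical and is not part of the library; it only mirrors the documented behaviour (no index returns every match, a negative index counts from the end, a missing position gives null), using plain strings as stand-ins for element objects.

```php
<?php
// Hypothetical helper, not a library function: selects from a list of
// matches the way the manual describes the index argument of find().
function pick_match(array $matches, ?int $index = null)
{
    if ($index === null) {
        return $matches;             // no index: return every match
    }
    if ($index < 0) {
        $index += count($matches);   // -1 is the last match, -2 the one before it
    }
    return $matches[$index] ?? null; // position does not exist: null
}

$anchors = ['first', 'second', 'third']; // stand-ins for element objects
var_dump(pick_match($anchors, 0));
var_dump(pick_match($anchors, -1));
var_dump(pick_match($anchors, 7));
```

Because a missing position yields null, check the result before reading a property such as ->plaintext from it.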

Supported operators in attribute selectors:

Filter                Description
[attribute]           Matches elements that have the specified attribute.
[!attribute]          Matches elements that don't have the specified attribute.
[attribute=value]     Matches elements that have the specified attribute with a certain value.
[attribute!=value]    Matches elements that don't have the specified attribute with a certain value.
[attribute^=value]    Matches elements whose specified attribute starts with a certain value.
[attribute$=value]    Matches elements whose specified attribute ends with a certain value.
[attribute*=value]    Matches elements whose specified attribute contains a certain value.
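To make the operator table concrete, here is a minimal stand-alone sketch of the comparisons involved. The function attribute_filter_matches() is hypothetical and is not the library's implementation. It makes two assumptions the manual does not state: comparisons are case-sensitive, and an element without the attribute matches only the [!attribute] filter.

```php
<?php
// Hypothetical helper, not library code: tests one attribute filter against
// an element's attributes, given as a name => value array.
function attribute_filter_matches(array $attrs, string $name, string $op = '', string $value = ''): bool
{
    $has = array_key_exists($name, $attrs);
    if ($op === '') {
        return $has;                 // [attribute]
    }
    if ($op === '!') {
        return !$has;                // [!attribute]
    }
    if (!$has) {
        return false;                // assumption: value operators need the attribute
    }
    $actual = (string) $attrs[$name];
    switch ($op) {
        case '=':                    // [attribute=value]
            return $actual === $value;
        case '!=':                   // [attribute!=value]
            return $actual !== $value;
        case '^=':                   // [attribute^=value]: starts with
            return strncmp($actual, $value, strlen($value)) === 0;
        case '$=':                   // [attribute$=value]: ends with
            return $value === '' || substr($actual, -strlen($value)) === $value;
        case '*=':                   // [attribute*=value]: contains
            return $value === '' || strpos($actual, $value) !== false;
    }
    throw new InvalidArgumentException('Unknown operator: ' . $op);
}

$link = ['href' => 'https://example.com/doc.pdf', 'title' => 'Spec'];
var_dump(attribute_filter_matches($link, 'href', '$=', '.pdf'));
var_dump(attribute_filter_matches($link, 'id', '!'));
```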


// Find all li elements inside ul elements
$es = $html->find('ul li');

// Find nested tags
$es = $html->find('div div div');

// Find all td tags with attribute align=center inside table tags
$es = $html->find('table td[align=center]');

// Find all text blocks
$es = $html->find('text');

// Find all comment (<!--...-->) blocks
$es = $html->find('comment');

// Find all li elements inside each ul
foreach ($html->find('ul') as $ul) {
    foreach ($ul->find('li') as $li) {
        // do something...
    }
}

How to access the HTML element’s attributes?

// Get an attribute (for a non-value attribute such as checked or selected, this returns true or false)
$value = $e->href;

// Set an attribute (for a non-value attribute such as checked or selected, set its value to true or false)
$e->href = 'my link';

// Remove an attribute by setting its value to null
$e->href = null;

// Determine whether an attribute exists
if (isset($e->href)) {
    echo 'href exists!';
}

// Example
$html = str_get_html("<div>foo <b>bar</b></div>");
$e = $html->find("div", 0);

echo $e->tag;       // Returns: "div"
echo $e->outertext; // Returns: "<div>foo <b>bar</b></div>"
echo $e->innertext; // Returns: "foo <b>bar</b>"
echo $e->plaintext; // Returns: "foo bar"

Attribute Name    Usage
$e->tag           Read or write the tag name of element.
$e->outertext     Read or write the outer HTML text of element.
$e->innertext     Read or write the inner HTML text of element.
$e->plaintext     Read or write the plain text of element.

// Extract contents from HTML
echo $html->plaintext;

// Wrap an element
$e->outertext = '<div class="wrap">' . $e->outertext . '</div>';

// Remove an element by setting its outertext to an empty string
$e->outertext = '';

// Append an element
$e->outertext = $e->outertext . '<div>foo</div>';

// Insert an element
$e->outertext = '<div>foo</div>' . $e->outertext;

How to traverse the DOM tree?

// If you are not familiar with the HTML DOM, read an introduction to it first.

// Example
echo $html->find("#div1", 0)->children(1)->children(1)->children(2)->id;
// or
echo $html->getElementById("div1")->childNodes(1)->childNodes(1)->childNodes(2)->getAttribute('id');


Parsing documents

The parser accepts documents in the form of URLs, files and strings. The document must be readable and must not exceed MAX_FILE_SIZE.

Name                                          Description
str_get_html( string $content ) : object      Creates a DOM object from a string.
file_get_html( string $filename ) : object    Creates a DOM object from a file or URL.
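The size restriction above can be checked before any parsing is attempted. The following is a minimal sketch, not library code: fits_size_limit() is a hypothetical helper, the 1024-byte limit is a placeholder rather than the library's actual MAX_FILE_SIZE value, and treating empty content as unusable is an extra assumption.

```php
<?php
// Hypothetical pre-check, not a library function: accept only content that
// is non-empty and no larger than a byte limit.
function fits_size_limit(string $content, int $maxBytes): bool
{
    $length = strlen($content); // strlen() counts bytes, not characters
    return $length > 0 && $length <= $maxBytes;
}

$limit = 1024; // placeholder; real code would pass the library's MAX_FILE_SIZE
$page  = '<html><body>Hello!</body></html>';

echo fits_size_limit($page, $limit) ? "ok to parse\n" : "skip\n";
```

A check like this lets a caller report an oversized document explicitly instead of discovering the problem after the parser has refused it.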

DOM methods & properties

Name                                              Description
__construct( [string $filename] ) : void          Constructor. If the filename parameter is set, its contents (text, file or URL) are loaded automatically.
plaintext : string                                Returns the contents extracted from HTML.
clear() : void                                    Cleans up memory.
load( string $content ) : void                    Loads contents from a string.
save( [string $filename] ) : string               Dumps the internal DOM tree back into a string. If $filename is set, the resulting string is also saved to that file.
load_file( string $filename ) : void              Loads contents from a file or a URL.
set_callback( string $function_name ) : void      Sets a callback function.
find( string $selector [, int $index] ) : mixed   Finds elements by CSS selector. Returns the Nth element object if index is set, otherwise an array of objects.

Element methods & properties

Name                                              Description
[attribute] : string                              Read or write the element's attribute value.
tag : string                                      Read or write the tag name of element.
outertext : string                                Read or write the outer HTML text of element.
innertext : string                                Read or write the inner HTML text of element.
plaintext : string                                Read or write the plain text of element.
find( string $selector [, int $index] ) : mixed   Finds children by CSS selector. Returns the Nth element object if index is set, otherwise an array of objects.

DOM traversing

Name                                    Description
$e->children( [int $index] ) : mixed    Returns the Nth child object if index is set, otherwise an array of children.
$e->parent() : element                  Returns the parent of element.
$e->first_child() : element             Returns the first child of element, or null if not found.
$e->last_child() : element              Returns the last child of element, or null if not found.
$e->next_sibling() : element            Returns the next sibling of element, or null if not found.
$e->prev_sibling() : element            Returns the previous sibling of element, or null if not found.

Camel naming conventions

Method                                          Mapping
$e->getAllAttributes()                          $e->attr
$e->getAttribute( $name )                       $e->attribute
$e->setAttribute( $name, $value )               $e->attribute = $value
$e->hasAttribute( $name )                       isset($e->attribute)
$e->removeAttribute( $name )                    $e->attribute = null
$e->getElementById( $id )                       $e->find( "#$id", 0 )
$e->getElementsById( $id [, $index] )           $e->find( "#$id" [, int $index] )
$e->getElementByTagName( $name )                $e->find( $name, 0 )
$e->getElementsByTagName( $name [, $index] )    $e->find( $name [, int $index] )
$e->parentNode()                                $e->parent()
$e->childNodes( [$index] )                      $e->children( [int $index] )
$e->firstChild()                                $e->first_child()
$e->lastChild()                                 $e->last_child()
$e->nextSibling()                               $e->next_sibling()
$e->previousSibling()                           $e->prev_sibling()
