Simple html dom find all

Contents

PHP Simple HTML DOM Parser Manual
How to create HTML DOM object?
How to find HTML elements?
How to access the HTML element’s attributes?
How to traverse the DOM tree?
Parsing documents
DOM methods & properties
Element methods & properties
DOM traversing
Camel naming conventions

PHP Simple HTML DOM Parser Manual

// Find all article blocks
foreach ($html->find('div.article') as $article) {
    $item['title']   = $article->find('div.title', 0)->plaintext;
    $item['intro']   = $article->find('div.intro', 0)->plaintext;
    $item['details'] = $article->find('div.details', 0)->plaintext;
    $articles[] = $item;
}

How to create HTML DOM object?

// Create a DOM object from a string
$html = str_get_html('<html><body>Hello!</body></html>');

// Create a DOM object from a URL
$html = file_get_html('http://www.google.com/');

// Create a DOM object from an HTML file
$html = file_get_html('test.htm');

// Create a DOM object
$html = new simple_html_dom();

// Load HTML from a string
$html->load('<html><body>Hello!</body></html>');

// Load HTML from a URL
$html->load_file('http://www.google.com/');

// Load HTML from an HTML file
$html->load_file('test.htm');

How to find HTML elements?

// Find all anchors, returns an array of element objects
$ret = $html->find('a');

// Find the (N)th anchor, returns an element object or null if not found (zero-based)
$ret = $html->find('a', 0);

// Find the last anchor, returns an element object or null if not found
$ret = $html->find('a', -1);

// Find all <div> with the id attribute
$ret = $html->find('div[id]');

// Find all <div> whose attribute id=foo
$ret = $html->find('div[id=foo]');

// Find all elements whose id=foo
$ret = $html->find('#foo');

// Find all elements whose class=foo
$ret = $html->find('.foo');

// Find all elements that have an id attribute
$ret = $html->find('*[id]');

// Find all anchors and images
$ret = $html->find('a, img');

// Find all anchors and images with the "title" attribute
$ret = $html->find('a[title], img[title]');

The following operators are supported in attribute selectors:

Filter Description

[attribute] Matches elements that have the specified attribute.

[!attribute] Matches elements that don't have the specified attribute.

[attribute=value] Matches elements whose specified attribute has exactly the given value.

[attribute!=value] Matches elements that don't have the specified attribute with the given value.

[attribute^=value] Matches elements whose specified attribute starts with the given value.

[attribute$=value] Matches elements whose specified attribute ends with the given value.

[attribute*=value] Matches elements whose specified attribute contains the given value.
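
The filter semantics in the table above can be sketched in plain PHP without the library. The function name `attribute_filter_matches` and its operator codes are hypothetical, chosen only for illustration. Two assumptions are made here: comparisons are case-sensitive, and `!=` requires the attribute to be present. The table does not settle either point, so the library's actual behavior may differ.

```php
<?php
/**
 * Hypothetical helper (not part of Simple HTML DOM): decides whether one
 * attribute value satisfies one filter from the table.
 *
 * $actual   attribute value, or null when the element lacks the attribute
 * $operator '' for [attribute], '!' for [!attribute], otherwise =, !=, ^=, $=, *=
 * $expected value written in the selector (ignored for '' and '!')
 */
function attribute_filter_matches(?string $actual, string $operator, string $expected = ''): bool
{
    if ($operator === '')  { return $actual !== null; }   // [attribute]
    if ($operator === '!') { return $actual === null; }   // [!attribute]
    if ($actual === null)  { return false; }               // assumption: value filters need the attribute

    switch ($operator) {
        case '=':
            return $actual === $expected;
        case '!=':
            return $actual !== $expected;
        case '^=':
            return strncmp($actual, $expected, strlen($expected)) === 0;
        case '$=':
            return $expected === '' || substr($actual, -strlen($expected)) === $expected;
        case '*=':
            return $expected === '' || strpos($actual, $expected) !== false;
        default:
            throw new InvalidArgumentException("Unknown operator: $operator");
    }
}

// Usage: which of these image sources would img[src$=.png] select?
foreach (['logo.png', 'photo.jpg'] as $src) {
    echo $src, ' => ', var_export(attribute_filter_matches($src, '$=', '.png'), true), PHP_EOL;
}
```
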

// Find all <li> in <ul>
$es = $html->find('ul li');

// Find nested <div> tags
$es = $html->find('div div div');

// Find all <td> in <table> with the attribute align=center
$es = $html->find('table td[align=center]');

// Find all text blocks
$es = $html->find('text');

// Find all comment (<!--...-->) blocks
$es = $html->find('comment');

// Find all <li> in each <ul>
foreach ($html->find('ul') as $ul) {
    foreach ($ul->find('li') as $li) {
        // do something...
    }
}
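
As a small worked version of the nested search, the sketch below collects every link found inside list items. It assumes `simple_html_dom.php` sits next to the script (adjust the path to your setup), and the sample markup is made up for illustration. Because `find()` with an index returns null when nothing matches, the result is checked before its properties are read.

```php
<?php
// Assumption: the library file sits next to this script; adjust the path as needed.
require_once __DIR__ . '/simple_html_dom.php';

// Made-up sample markup: two lists, one item without a link.
$html = str_get_html(
    '<ul><li><a href="/a">A</a></li><li><a href="/b">B</a></li></ul>'
    . '<ul><li>no link here</li></ul>'
);

$links = [];
foreach ($html->find('ul') as $ul) {
    foreach ($ul->find('li') as $li) {
        $a = $li->find('a', 0);      // null when the item has no anchor
        if ($a !== null) {
            $links[$a->href] = $a->plaintext;
        }
    }
}

print_r($links);
```
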

How to access the HTML element’s attributes?

// Get an attribute (for a non-value attribute such as checked or selected, this returns true or false)
$value = $e->href;

// Set an attribute (for a non-value attribute such as checked or selected, set its value to true or false)
$e->href = 'my link';

// Remove an attribute by setting its value to null
$e->href = null;

// Determine whether an attribute exists
if (isset($e->href))
    echo 'href exists!';
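
The attribute operations above can be tried together in one script. This is a sketch under the assumption that `simple_html_dom.php` sits next to the script; the markup is invented for the example.

```php
<?php
// Assumption: the library file sits next to this script; adjust the path as needed.
require_once __DIR__ . '/simple_html_dom.php';

$html = str_get_html('<a href="/old" title="tip">link</a><input type="checkbox" checked>');

$a = $html->find('a', 0);
$hadTitle = isset($a->title);   // the attribute is present in the markup
$a->href  = '/new';             // overwrite an existing attribute
$a->title = null;               // mark the attribute for removal

// A non-value attribute such as "checked" reads back as a boolean.
$isChecked = $html->find('input', 0)->checked;

// Serialize the anchor to see the effect of the changes.
$anchorMarkup = $a->outertext;
echo $anchorMarkup, PHP_EOL;
```
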

// Example
$html = str_get_html("<div>foo <b>bar</b></div>");
$e = $html->find("div", 0);

echo $e->tag;       // Returns: "div"
echo $e->outertext; // Returns: "<div>foo <b>bar</b></div>"
echo $e->innertext; // Returns: "foo <b>bar</b>"
echo $e->plaintext; // Returns: "foo bar"

Attribute Name Usage

$e->tag Read or write the tag name of element.

$e->outertext Read or write the outer HTML text of element.

$e->innertext Read or write the inner HTML text of element.

$e->plaintext Read or write the plain text of element.

// Extract contents from HTML
echo $html->plaintext;

// Wrap an element
$e->outertext = '<div class="wrap">' . $e->outertext . '</div>';

// Remove an element by setting its outertext to an empty string
$e->outertext = '';

// Append an element
$e->outertext = $e->outertext . '<div>foo</div>';

// Insert an element
$e->outertext = '<div>foo</div>' . $e->outertext;
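
To see the effect of such `outertext` edits, the document can be serialized with `save()`. The sketch below assumes `simple_html_dom.php` sits next to the script and uses invented markup. To my understanding, assigning to `outertext` changes only the serialized output and does not re-parse the tree, so later `find()` calls still see the original structure. Treat that as an assumption and reload the saved string if you need to query the new markup.

```php
<?php
// Assumption: the library file sits next to this script; adjust the path as needed.
require_once __DIR__ . '/simple_html_dom.php';

$html = str_get_html('<div id="box"><p>one</p><p class="ad">two</p></div>');

// Remove the second paragraph.
$html->find('p.ad', 0)->outertext = '';

// Wrap the first paragraph in a <section>.
$p = $html->find('p', 0);
$p->outertext = '<section>' . $p->outertext . '</section>';

// Serialize the document with the edits applied.
$result = $html->save();
echo $result, PHP_EOL;
```
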

How to traverse the DOM tree?

// If you are not familiar with the HTML DOM, review an introduction to it first.

// Example
echo $html->find("#div1", 0)->children(1)->children(1)->children(2)->id;
// or
echo $html->getElementById("div1")->childNodes(1)->childNodes(1)->childNodes(2)->getAttribute('id');


Parsing documents

The parser accepts documents in the form of URLs, files and strings. The document must be readable and cannot exceed MAX_FILE_SIZE.

Name Description

str_get_html( string $content ) : object Creates a DOM object from string.

file_get_html( string $filename ) : object Creates a DOM object from file or URL.
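
Because of the size limit, it can help to check the result of parsing before using it. The sketch below assumes that `simple_html_dom.php` sits next to the script and that `str_get_html()` returns false for empty or oversized input, which matches the 1.x releases as far as I know but should be verified against your version. `parse_or_fail` is a hypothetical helper, not part of the library.

```php
<?php
// Assumption: the library file sits next to this script; adjust the path as needed.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical helper (not part of the library): turn a failed parse into an exception.
function parse_or_fail($markup)
{
    $dom = str_get_html($markup);
    if (!$dom) {
        // Assumed failure mode: empty input or input longer than MAX_FILE_SIZE.
        throw new RuntimeException('Could not parse markup (empty or too large?)');
    }
    return $dom;
}

$html = parse_or_fail('<html><body><p>Hello!</p></body></html>');
$greeting = $html->find('p', 0)->plaintext;
echo $greeting, PHP_EOL;

$html->clear(); // release the parser's internal references when done
```
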

DOM methods & properties

Name Description

__construct( [string $filename] ) : void Constructor. If the filename parameter is set, the contents are loaded automatically, whether it is text or a file/URL.

plaintext : string Returns the contents extracted from HTML.

clear() : void Clean up memory.

load( string $content ) : void Load contents from string.

save( [string $filename] ) : string Dumps the internal DOM tree back into a string. If $filename is set, the resulting string is also saved to that file.

load_file( string $filename ) : void Load contents from a file or a URL.

set_callback( string $function_name ) : void Set a callback function.

find( string $selector [, int $index] ) : mixed Finds elements by the CSS selector. Returns the Nth element object if index is set, otherwise returns an array of objects.
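
A minimal sketch of how the DOM methods in this table fit together: construct, load, query, serialize, clean up. It assumes `simple_html_dom.php` sits next to the script; the markup is invented for the example.

```php
<?php
// Assumption: the library file sits next to this script; adjust the path as needed.
require_once __DIR__ . '/simple_html_dom.php';

$dom = new simple_html_dom();
$dom->load('<ul><li>a</li><li>b</li></ul>');   // load contents from a string

$count  = count($dom->find('li'));             // find() without an index returns an array
$markup = $dom->save();                        // dump the tree back into a string

$dom->clear();                                 // clean up memory
unset($dom);

echo $count, ' items: ', $markup, PHP_EOL;
```

Calling `clear()` matters most in long-running scripts that parse many documents in a loop.
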

Element methods & properties

Name Description

[attribute] : string Read or write element’s attribute value.

tag : string Read or write the tag name of element.

outertext : string Read or write the outer HTML text of element.

innertext : string Read or write the inner HTML text of element.

plaintext : string Read or write the plain text of element.

find( string $selector [, int $index] ) : mixed Finds children by the CSS selector. Returns the Nth element object if index is set, otherwise returns an array of objects.

DOM traversing

Name Description

$e->children( [int $index] ) : mixed Returns the Nth child object if index is set, otherwise returns an array of children.

$e->parent() : element Returns the parent of element.

$e->first_child() : element Returns the first child of element, or null if not found.

$e->last_child() : element Returns the last child of element, or null if not found.

$e->next_sibling() : element Returns the next sibling of element, or null if not found.

$e->prev_sibling() : element Returns the previous sibling of element, or null if not found.
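
The traversal methods above can be combined to walk an element's children without a selector. This sketch assumes `simple_html_dom.php` sits next to the script and uses invented markup. It relies on `first_child()` and `next_sibling()` returning null when there is nothing further, as the table states.

```php
<?php
// Assumption: the library file sits next to this script; adjust the path as needed.
require_once __DIR__ . '/simple_html_dom.php';

$html = str_get_html('<div id="div1"><p id="a">A</p><p id="b">B</p><p id="c">C</p></div>');
$div  = $html->find('#div1', 0);

// Walk the children from first to last via sibling links.
$ids = [];
for ($node = $div->first_child(); $node !== null; $node = $node->next_sibling()) {
    $ids[] = $node->id;
}

$second     = $div->children(1)->id;                   // zero-based index
$beforeLast = $div->last_child()->prev_sibling()->id;  // step back from the end
$parentTag  = $div->first_child()->parent()->tag;      // back up to the container

echo implode(',', $ids), PHP_EOL;
```
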

Camel naming conventions

Method Mapping

$e->getAllAttributes() $e->attr

$e->getAttribute( $name ) $e->attribute

$e->setAttribute( $name, $value ) $e->attribute = $value

$e->hasAttribute( $name ) isset($e->attribute)

$e->removeAttribute( $name ) $e->attribute = null

$e->getElementById( $id ) $e->find( "#$id", 0 )

$e->getElementsById( $id [, $index] ) $e->find( "#$id" [, int $index] )

$e->getElementByTagName( $name ) $e->find( $name, 0 )

$e->getElementsByTagName( $name [, $index] ) $e->find( $name [, int $index] )

$e->parentNode() $e->parent()

$e->childNodes( [$index] ) $e->children( [int $index] )

$e->firstChild() $e->first_child()

$e->lastChild() $e->last_child()

$e->nextSibling() $e->next_sibling()

$e->previousSibling() $e->prev_sibling()
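
As an illustration of the mapping, the sketch below reads the same attribute once with the native style and once with the camel-case aliases. It assumes `simple_html_dom.php` sits next to the script; the markup is invented for the example.

```php
<?php
// Assumption: the library file sits next to this script; adjust the path as needed.
require_once __DIR__ . '/simple_html_dom.php';

$html = str_get_html('<div id="main"><a href="/x">x</a></div>');

// Native style: find() plus property access.
$viaFind = $html->find('#main', 0)->first_child()->href;

// Camel-case aliases for the same lookups.
$viaCamel = $html->getElementById('main')->firstChild()->getAttribute('href');

echo $viaFind, ' ', $viaCamel, PHP_EOL;
```
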
