README

Wrappers around the PHP DOM classes that handle the common DOM extension pitfalls.

Contents

Features
Requirements
Container methods
Usage examples

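Features

HTML documents
- encoding sniffing
- optional tidy support (automatically fixes broken HTML)
HTML fragments
XML documents
XML fragments
XPath queries
creating documents from scratch
optional error suppression
helper methods for common tasks, such as:
- querying multiple nodes or a single node
- checking for containment
- removing a node
- removing all nodes in a list
- prepending a child node
- inserting a node after another node
- fetching the <head> and <body> elements (HTML)
- fetching the root element (XML)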

Requirements

PHP 7.1+

Container methods

These methods are shared by both the HTML and XML containers.

Loading documents

<?php

use Kuria\Dom\HtmlDocument; // or XmlDocument, HtmlFragment, etc.

// using loadString()
$dom = new HtmlDocument();
$dom->setLibxmlFlags($customLibxmlFlags); // optional
$dom->setIgnoreErrors($ignoreErrors); // optional
$dom->loadString($html);

// using static loadString() shortcut
$dom = HtmlDocument::fromString($html);

// using existing document instance
$dom = new HtmlDocument();
$dom->loadDocument($document);

// using static loadDocument() shortcut
$dom = HtmlDocument::fromDocument($document);

// creating an empty document
$dom = new HtmlDocument();
$dom->loadEmpty();

Getting or changing the document encoding

<?php

// get encoding
$encoding = $dom->getEncoding();

// set encoding
$dom->setEncoding($newEncoding);

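Note

The DOM extension uses UTF-8 encoding.

This means that text nodes, attributes, etc.:

- will be encoded in UTF-8 when read (e.g. $elem->textContent)
- should be encoded in UTF-8 when written (e.g. $elem->setAttribute())

The encoding configured by setEncoding() is used when saving the document; see Saving documents.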
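
To make the note above concrete, here is a minimal sketch that uses only the plain PHP DOM extension (DOMDocument), not the kuria/dom containers. The ISO-8859-1 sample string and the variable names are illustrative. It also assumes that setEncoding() has an effect comparable to assigning DOMDocument's encoding property, which this README does not confirm.

```php
<?php

// Plain DOM extension only; no kuria/dom classes are used here.
// "\xE9" is the ISO-8859-1 byte for "é" (illustrative sample data).
$latin1 = '<?xml version="1.0" encoding="ISO-8859-1"?>' . "<name>Caf\xE9</name>";

$document = new DOMDocument();
$document->loadXML($latin1);

// Reading: values come back as UTF-8, whatever the source encoding was.
$text = $document->documentElement->textContent;
var_dump($text === "Caf\xC3\xA9"); // compares against the UTF-8 bytes for "Café"

// Writing: pass UTF-8 strings in.
$document->documentElement->setAttribute('city', "Orl\xC3\xA9ans");

// Saving: the document's encoding is applied to the serialized output.
$document->encoding = 'ISO-8859-1';
$output = $document->saveXML();
var_dump(strpos($output, "Caf\xE9") !== false); // checks for the ISO-8859-1 byte
```

The point of the sketch is that the encoding only matters at the boundaries (loading and saving); everything read from or written to the nodes in between is UTF-8.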

Saving documents

<?php

// entire document
$content = $dom->save();

// single element
$content = $dom->save($elem);

// children of a single element
$content = $dom->save($elem, true);

Getting DOM instances

After a document has been loaded, the DOM instances are available via getters:

<?php

$document = $dom->getDocument();
$xpath = $dom->getXpath();

Running XPath queries

<?php

// get a DOMNodeList
$divs = $dom->query('//div');

// get a single DOMNode (or null)
$div = $dom->queryOne('//div');

// check if a query matches
$divExists = $dom->exists('//div');
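
For comparison, the sketch below shows roughly what these three kinds of lookups look like with the plain DOMXPath class. It is an illustration only: the sample markup is made up, and it assumes the container helpers behave like thin wrappers, which may not match the library's actual implementation.

```php
<?php

// Plain DOM extension only; sample markup is illustrative.
$document = new DOMDocument();
$document->loadXML('<root><div>first</div><div>second</div></root>');
$xpath = new DOMXPath($document);

// all matches as a DOMNodeList
$divs = $xpath->query('//div');

// first match, or null when nothing matches
$div = $xpath->query('//div')->item(0);

// whether anything matches at all
$divExists = $xpath->query('//div')->length > 0;
$spanExists = $xpath->query('//span')->length > 0;

var_dump($divs->length, $div->textContent, $divExists, $spanExists);
```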

Escaping strings

<?php

$escapedString = $dom->escape($string);

DOM manipulation and traversal helpers

Helpers for commonly needed tasks that are not easily achieved with the existing DOM methods:

<?php

// check if the document contains a node
$hasNode = $dom->contains($node);

// check if a node contains another node
$hasNode = $dom->contains($node, $parentNode);

// remove a node
$dom->remove($node);

// remove a list of nodes
$dom->removeAll($nodes);

// prepend a child node
$dom->prependChild($newNode, $existingNode);

// insert a node after another node
$dom->insertAfter($newNode, $existingNode);
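
To show why such helpers are convenient, here is a minimal sketch of how prepending and inserting after a node can be written with the plain DOM extension, which only offers insertBefore() and appendChild() on DOMNode. The element names and ids are made up for illustration, and this is not necessarily how the library implements its helpers.

```php
<?php

// Plain DOM extension only; the <list>/<item> markup is illustrative.
$document = new DOMDocument();
$document->loadXML('<list><item id="b"/></list>');

$list = $document->documentElement;
$b = $list->firstChild;

// "Prepend": insert before the current first child.
// If firstChild is null, insertBefore() simply appends.
$a = $document->createElement('item');
$a->setAttribute('id', 'a');
$list->insertBefore($a, $list->firstChild);

// "Insert after": insert before the reference node's next sibling.
// If there is no next sibling, the node ends up last.
$c = $document->createElement('item');
$c->setAttribute('id', 'c');
$b->parentNode->insertBefore($c, $b->nextSibling);

// Collect the ids in document order.
$ids = [];
foreach ($list->childNodes as $child) {
    $ids[] = $child->getAttribute('id');
}

echo implode(',', $ids), "\n"; // prints "a,b,c"
```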

Usage examples

HTML documents

Loading an existing document

<?php

use Kuria\Dom\HtmlDocument;

$html = <<<HTML
<!doctype html>
<html>
    <head>
        <meta charset="UTF-8">
        <title>Example document</title>
    </head>
    <body>
        <h1>Hello world!</h1>
    </body>
</html>
HTML;

$dom = HtmlDocument::fromString($html);

var_dump($dom->queryOne('//title')->textContent);
var_dump($dom->queryOne('//h1')->textContent);

Output:

string(16) "Example document"
string(12) "Hello world!"

Optionally, the markup can be repaired by Tidy before it is loaded.

<?php

$dom = new HtmlDocument();
$dom->setTidyEnabled(true);
$dom->loadString($html);

Note

HTML documents ignore errors by default, so there is no need to call $dom->setIgnoreErrors(true).
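
For context, this is roughly what suppressing parse errors looks like when using the plain DOM extension directly. It is a sketch under the assumption that "ignoring errors" amounts to something like libxml's internal error handling; the README does not say how the library does it, and the sample markup is made up.

```php
<?php

// Deliberately sloppy markup (illustrative): <b> is never closed.
$html = '<!doctype html><html><body><p>Hello <b>world</p></body></html>';

// Collect libxml problems internally instead of emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$document = new DOMDocument();
$document->loadHTML($html);

// Inspect (or simply discard) whatever the parser complained about.
$errors = libxml_get_errors();
libxml_clear_errors();

// Restore the previous error handling mode.
libxml_use_internal_errors($previous);

printf("%d parse problem(s) collected\n", count($errors));

$paragraph = $document->getElementsByTagName('p')->item(0);
echo $paragraph->textContent, "\n";
```

How many problems are reported depends on the libxml version, so the count should not be relied upon.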

Creating a new document

<?php

use Kuria\Dom\HtmlDocument;

// initialize empty document
$dom = new HtmlDocument();
$dom->loadEmpty(['formatOutput' => true]);

// add <title>
$title = $dom->getDocument()->createElement('title');
$title->textContent = 'Lorem ipsum';

$dom->getHead()->appendChild($title);

// save
echo $dom->save();

Output:

<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Lorem ipsum</title>
</head>
<body>
    </body>
</html>

HTML fragments

Loading an existing fragment

<?php

use Kuria\Dom\HtmlFragment;

$dom = HtmlFragment::fromString('<div id="test"><span>Hello</span></div>');

$element = $dom->queryOne('/div[@id="test"]/span');

if ($element) {
    var_dump($element->textContent);
}

Output:

string(5) "Hello"

Note

HTML fragments ignore errors by default, so there is no need to call $dom->setIgnoreErrors(true).

Creating a new fragment

<?php

use Kuria\Dom\HtmlFragment;

// initialize empty fragment
$dom = new HtmlFragment();
$dom->loadEmpty(['formatOutput' => true]);

// add <a>
$link = $dom->getDocument()->createElement('a');
$link->setAttribute('href', 'http://example.com/');
$link->textContent = 'example';

$dom->getBody()->appendChild($link);

// save
echo $dom->save();

Output:

<a href="http://example.com/">example</a>

XML documents

Loading an existing document

<?php

use Kuria\Dom\XmlDocument;

$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<library>
    <book name="Don Quixote" author="Miguel de Cervantes" />
    <book name="Hamlet" author="William Shakespeare" />
    <book name="Alice's Adventures in Wonderland" author="Lewis Carroll" />
</library>
XML;

$dom = XmlDocument::fromString($xml);

foreach ($dom->query('/library/book') as $book) {
    /** @var \DOMElement $book */
    var_dump("{$book->getAttribute('name')} by {$book->getAttribute('author')}");
}

Output:

string(34) "Don Quixote by Miguel de Cervantes"
string(29) "Hamlet by William Shakespeare"
string(49) "Alice's Adventures in Wonderland by Lewis Carroll"

Creating a new document

<?php

use Kuria\Dom\XmlDocument;

// initialize empty document
$dom = new XmlDocument();
$dom->loadEmpty(['formatOutput' => true]);

// add <users>
$document = $dom->getDocument();
$document->appendChild($document->createElement('users'));

// add some users
$bob = $document->createElement('user');
$bob->setAttribute('username', 'bob');
$bob->setAttribute('access-token', '123456');

$john = $document->createElement('user');
$john->setAttribute('username', 'john');
$john->setAttribute('access-token', 'foobar');

$dom->getRoot()->appendChild($bob);
$dom->getRoot()->appendChild($john);

// save
echo $dom->save();

Output:

<?xml version="1.0" encoding="UTF-8"?>
<users>
  <user username="bob" access-token="123456"/>
  <user username="john" access-token="foobar"/>
</users>

Handling XML namespaces in XPath queries

<?php

use Kuria\Dom\XmlDocument;

$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<lib:root xmlns:lib="http://example.com/">
    <lib:book name="Don Quixote" author="Miguel de Cervantes" />
    <lib:book name="Hamlet" author="William Shakespeare" />
    <lib:book name="Alice's Adventures in Wonderland" author="Lewis Carroll" />
</lib:root>
XML;

$dom = XmlDocument::fromString($xml);

// register namespace in XPath
$dom->getXpath()->registerNamespace('lib', 'http://example.com/');

// query using the prefix
foreach ($dom->query('//lib:book') as $book) {
    /** @var \DOMElement $book */
    var_dump($book->getAttribute('name'));
}

Output:

string(11) "Don Quixote"
string(6) "Hamlet"
string(32) "Alice's Adventures in Wonderland"

XML fragments

Loading an existing fragment

<?php

use Kuria\Dom\XmlFragment;

$dom = XmlFragment::fromString('<fruits><fruit name="Apple" /><fruit name="Banana" /></fruits>');

foreach ($dom->query('/fruits/fruit') as $fruit) {
    /** @var \DOMElement $fruit */
    var_dump($fruit->getAttribute('name'));
}

Output:

string(5) "Apple"
string(6) "Banana"

Creating a new fragment

<?php

use Kuria\Dom\XmlFragment;

// initialize empty fragment
$dom = new XmlFragment();
$dom->loadEmpty(['formatOutput' => true]);

// add a new element
$person = $dom->getDocument()->createElement('person');
$person->setAttribute('name', 'John Smith');

$dom->getRoot()->appendChild($person);

// save
echo $dom->save();

Output:

<person name="John Smith"/>
