README

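A PHP library for reading Microdata, RDFa Lite, and JSON-LD structured data in HTML pages.

This library is the foundation for reading schema.org structured data in brick/schema, but it may also be used with other vocabularies.

Installation

This library can be installed via Composer: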

composer require brick/structured-data

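Requirements

This library requires PHP 7.2 or later. It relies on PHP extensions that are enabled by default and should be available in most PHP installations.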

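Project status and release process

This library is under development. It may change rapidly during the early 0.x releases. However, the library follows a strict BC break convention:

The current releases are numbered 0.x.y. When a non-breaking change is introduced (adding new methods, fixing bugs, optimizing existing code, etc.), y is incremented.

When a breaking change is introduced, a new 0.x version cycle is always started.

It is therefore safe to lock your project to a given release cycle, such as 0.1.*.

If you need to upgrade to a newer release cycle, check the release history for a list of the changes introduced by each subsequent 0.x.0 version.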

Introduction

The library unifies reading the 3 supported formats (Microdata, RDFa Lite, and JSON-LD) under a common interface:

interface Brick\StructuredData\Reader
{
    /**
     * Reads the items contained in the given document.
     *
     * @param DOMDocument $document The DOM document to read.
     * @param string      $url      The URL the document was retrieved from. This will be used only to resolve relative
     *                              URLs in property values. No attempt will be performed to connect to this URL.
     *
     * @return Item[] The top-level items.
     */
    public function read(DOMDocument $document, string $url) : array;
}

This interface has 3 implementations, one for each format:

MicrodataReader
RdfaLiteReader
JsonLdReader
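
As a rough illustration of calling a Reader directly, the sketch below parses a small inline Microdata snippet into a DOMDocument and passes it to MicrodataReader. It assumes the library is installed through Composer with the autoloader at vendor/autoload.php next to the script; the sample markup and the example.com URL are made up for illustration.

```php
<?php

// Assumption: Composer's autoloader sits in vendor/ next to this script.
require __DIR__ . '/vendor/autoload.php';

use Brick\StructuredData\Reader\MicrodataReader;

// Made-up sample markup, for illustration only.
$html = <<<'HTML'
<!DOCTYPE html>
<html>
<body>
  <div itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Jane Doe</span>
  </div>
</body>
</html>
HTML;

// Collect libxml parse warnings internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$document = new DOMDocument();
$document->loadHTML($html);

// The URL is only used to resolve relative URLs in property values.
$reader = new MicrodataReader();
$items = $reader->read($document, 'http://www.example.com/');

foreach ($items as $item) {
    echo implode(', ', $item->getTypes()), PHP_EOL;
}
```

Swapping MicrodataReader for RdfaLiteReader or JsonLdReader should leave the rest of the script unchanged, since all three share the read() signature shown above.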

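The read() method returns the top-level items found in the document. Each Item consists of:

An optional id (itemid in Microdata, resource in RDFa Lite, @id in JSON-LD)
An array of types, each of which is a URL, for example http://schema.org/Product
An associative array of zero or more properties, each keyed by a URL, for example http://schema.org/price, and mapped to an array of values; a value is either a plain string or a nested Item object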
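
To make the Item structure more concrete, here is a minimal sketch that recursively flattens items into plain PHP arrays. The itemToArray() helper is hypothetical and not part of the library, and the getId() accessor is assumed to exist alongside getTypes() and getProperties(). The sample markup is made up, and the Composer autoloader is assumed to be at vendor/autoload.php.

```php
<?php

// Assumption: Composer's autoloader sits in vendor/ next to this script.
require __DIR__ . '/vendor/autoload.php';

use Brick\StructuredData\HTMLReader;
use Brick\StructuredData\Item;
use Brick\StructuredData\Reader\MicrodataReader;

/**
 * Hypothetical helper (not part of the library): converts an Item and all of
 * its nested Items into a plain array.
 */
function itemToArray(Item $item): array
{
    $properties = [];

    foreach ($item->getProperties() as $name => $values) {
        foreach ($values as $value) {
            // A value is either a plain string or a nested Item.
            $properties[$name][] = $value instanceof Item
                ? itemToArray($value)
                : $value;
        }
    }

    return [
        'id'         => $item->getId(), // assumed accessor; null when the item has no id
        'types'      => $item->getTypes(),
        'properties' => $properties,
    ];
}

// Made-up sample markup with one nested item.
$html = <<<'HTML'
<div itemscope itemtype="http://schema.org/Product">
  <span itemprop="name">Widget</span>
  <div itemprop="offers" itemscope itemtype="http://schema.org/Offer">
    <span itemprop="price">9.99</span>
  </div>
</div>
HTML;

$htmlReader = new HTMLReader(new MicrodataReader());
$items = $htmlReader->read($html, 'http://www.example.com/');

$data = array_map('itemToArray', $items);

echo json_encode($data, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

A plain-array form like this can be convenient for logging or JSON export, at the cost of losing the distinction between an Item and an ordinary array.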

Quickstart

Here is an example that reads Microdata from a web page. Just change the URL and give it a try:

use Brick\StructuredData\Reader\MicrodataReader;
use Brick\StructuredData\HTMLReader;
use Brick\StructuredData\Item;

// Let's read Microdata here;
// You could also use RdfaLiteReader, JsonLdReader,
// or even use all of them by chaining them in a ReaderChain
$microdataReader = new MicrodataReader();

// Wrap into HTMLReader to be able to read HTML strings or files directly,
// i.e. without manually converting them to DOMDocument instances first
$htmlReader = new HTMLReader($microdataReader);

// Replace this URL with that of a website you know is using Microdata
$url = 'http://www.example.com/';
$html = file_get_contents($url);

// Read the document and return the top-level items found
// Note: the URL is only required to resolve relative URLs; no attempt will be made to connect to it
$items = $htmlReader->read($html, $url);

// Loop through the top-level items
foreach ($items as $item) {
    echo implode(',', $item->getTypes()), PHP_EOL;

    foreach ($item->getProperties() as $name => $values) {
        foreach ($values as $value) {
            if ($value instanceof Item) {
                // We're only displaying the class name in this example; you would typically
                // recurse through nested Items to get the information you need
                $value = '(' . implode(', ', $value->getTypes()) . ')';
            }

            // If $value is not an Item, then it's a plain string

            echo "  - $name: $value", PHP_EOL;
        }
    }
}
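
The comments in the example above mention chaining several readers in a ReaderChain. The following is a minimal sketch of that idea, assuming ReaderChain lives in the Brick\StructuredData\Reader namespace and accepts the readers as constructor arguments; check the library source before relying on either detail. The sample markup is made up, and the Composer autoloader is assumed to be at vendor/autoload.php.

```php
<?php

// Assumption: Composer's autoloader sits in vendor/ next to this script.
require __DIR__ . '/vendor/autoload.php';

use Brick\StructuredData\HTMLReader;
use Brick\StructuredData\Reader\JsonLdReader;
use Brick\StructuredData\Reader\MicrodataReader;
use Brick\StructuredData\Reader\RdfaLiteReader;
use Brick\StructuredData\Reader\ReaderChain;

// Assumed constructor shape: one argument per reader.
$reader = new ReaderChain(
    new MicrodataReader(),
    new RdfaLiteReader(),
    new JsonLdReader()
);

$htmlReader = new HTMLReader($reader);

// Made-up page mixing JSON-LD and Microdata.
$html = <<<'HTML'
<!DOCTYPE html>
<html>
<head>
  <title>Sample</title>
  <script type="application/ld+json">
  {"@context": "http://schema.org", "@type": "Product", "name": "Widget"}
  </script>
</head>
<body>
  <div itemscope itemtype="http://schema.org/Person">
    <span itemprop="name">Jane Doe</span>
  </div>
</body>
</html>
HTML;

$items = $htmlReader->read($html, 'http://www.example.com/');

// List the types of every top-level item, whichever format it came from.
foreach ($items as $item) {
    echo implode(', ', $item->getTypes()), PHP_EOL;
}
```

Chaining is mostly useful when you do not know in advance which format a page uses.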

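Current limitations

MicrodataReader does not support the itemref attribute
RdfaLiteReader does not support the prefix attribute; only predefined prefixes are supported for now
JsonLdReader does not fully support @context; for now, only strings are accepted in @context, and they are treated as vocabulary identifiers; this works well for simple markup such as that used in the examples on schema.org, but may fail with more complex documents.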
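
To check in advance whether a JSON-LD document stays within the @context limitation, something like the following sketch could be used. The usesOnlyStringContexts() helper is hypothetical and not part of the library; it takes a conservative reading of "only strings are accepted", treating any @context that is not itself a plain string as unsupported.

```php
<?php

/**
 * Hypothetical helper (not part of the library): returns true when every
 * @context found in a decoded JSON-LD value is a plain string.
 *
 * @param mixed $node A value produced by json_decode($json, true).
 */
function usesOnlyStringContexts($node): bool
{
    if (!is_array($node)) {
        // Scalars and null cannot carry an @context.
        return true;
    }

    if (array_key_exists('@context', $node) && !is_string($node['@context'])) {
        // Context objects and arrays fall outside the "strings only" rule.
        return false;
    }

    // Recurse into nested objects and lists, skipping the context itself.
    foreach ($node as $key => $child) {
        if ($key !== '@context' && !usesOnlyStringContexts($child)) {
            return false;
        }
    }

    return true;
}

// Simple markup in the style of the schema.org examples.
$simple = json_decode(
    '{"@context": "http://schema.org", "@type": "Product", "name": "Widget"}',
    true
);

// A context object that maps terms individually.
$complex = json_decode(
    '{"@context": {"name": "http://schema.org/name"}, "name": "Widget"}',
    true
);

var_dump(usesOnlyStringContexts($simple));
var_dump(usesOnlyStringContexts($complex));
```

This only inspects the shape of the document; it does not predict exactly how JsonLdReader will treat it.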

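A note on JSON-LD's `@context`

Although JsonLdReader should be able to handle proper context objects in the future, its goal will never be to become a fully compliant JSON-LD parser; in particular, it will never attempt to fetch a JSON-LD context referenced by a URL.

This is consistent with how indexing robots typically crawl the web: they do not fetch remote contexts, which saves them from retrieving additional documents when extracting structured data from a web page.

The goal of JsonLdReader, like that of the other Reader implementations, is to parse documents with the same capabilities as the Google Structured Data Testing Tool or the Yandex structured data validator, no more, no less. These tools do not load external context files.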
