sbwerewolf/xml-navigator

Easy XML to PHP array conversion and fast XML streaming conversion

v7.2.11 2024-06-15 09:46 UTC

README

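Xml Navigator is a PHP library based on XMLReader.

You can supply XML as a string or as a URI (or a file system path).

The navigator can present an XML document as an array or as an object.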
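
To illustrate the two input forms, here is a minimal sketch that uses only PHP's built-in XMLReader, the class this library is based on. It is not the library's own API: the helper `firstElementName()` and the temporary file are assumptions made for this example.

```php
<?php
// Sketch: the same document read from a string and from a file path,
// using only PHP's built-in XMLReader (not xml-navigator itself).

/** Advance the reader to the first element node and return its name. */
function firstElementName(XMLReader $reader): ?string
{
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            return $reader->name;
        }
    }
    return null;
}

$xml = '<outer any_attrib="attribute value"><inner>element value</inner></outer>';

// 1. XML supplied as a string
$fromString = XMLReader::XML($xml);
$nameFromString = firstElementName($fromString);
$fromString->close();

// 2. XML supplied as a file system path (hypothetical temporary file)
$path = tempnam(sys_get_temp_dir(), 'xml');
file_put_contents($path, $xml);
$fromFile = XMLReader::open($path);
$nameFromFile = firstElementName($fromFile);
$fromFile->close();
unlink($path);

echo $nameFromString . PHP_EOL; // outer
echo $nameFromFile . PHP_EOL;   // outer
```

Both readers expose the same cursor interface, so code that consumes an XMLReader does not need to know where the XML came from.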

How to use

$xml =<<<XML
<outer any_attrib="attribute value">
    <inner>element value</inner>
    <nested nested-attrib="nested attribute value">nested element value</nested>
</outer>
XML;
$result =
    \SbWereWolf\XmlNavigator\Convertation\FastXmlToArray
    ::prettyPrint($xml);
echo json_encode($result, JSON_PRETTY_PRINT);

Output

{
  "outer": {
    "@attributes": {
      "any_attrib": "attribute value"
    },
    "inner": "element value",
    "nested": {
      "@value": "nested element value",
      "@attributes": {
        "nested-attrib": "nested attribute value"
      }
    }
  }
}

How to install

composer require sbwerewolf/xml-navigator

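Use cases

Parse XML files without worrying about file size

The time to access the first element does not depend on the file size.

Let us explain with an example.

First, generate the XML files with a script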

function generateFile(string $filename, int $limit, string $xml): void
{
    // 'w' truncates the file, so re-running the script does not append a second root element
    $file = fopen($filename, 'w');
    fwrite($file, '<Collection>');

    for ($i = 0; $i < $limit; $i++) {
        $content = "$xml$xml$xml$xml$xml$xml$xml$xml$xml$xml";
        fwrite($file, $content);
    }

    fwrite($file, '</Collection>');
    fclose($file);

    $size = round(filesize($filename) / 1024, 2);
    echo "$filename size is $size Kb" . PHP_EOL;
}

$xml = '<SomeElement key="123">value</SomeElement>' . PHP_EOL;
$generation['temp-465b.xml'] = 1;
$generation['temp-429Kb.xml'] = 1_000;
$generation['temp-429Mb.xml'] = 1_000_000;

foreach ($generation as $filename => $size) {
    generateFile($filename, $size, $xml);
}

Output

temp-465b.xml size is 0.45 Kb
temp-429Kb.xml size is 429.71 Kb
temp-429Mb.xml size is 429687.52 Kb

Now run the benchmark with a script

/**
 * @param string $filename
 * @return void
 */
function parseFirstElement(string $filename): void
{
    $start = hrtime(true);

    /** @var XMLReader $reader */
    $reader = XMLReader::open($filename);
    $mayRead = true;
    while ($mayRead && $reader->name !== 'SomeElement') {
        $mayRead = $reader->read();
    }
    
    $result =
        \SbWereWolf\XmlNavigator\Extraction\PrettyPrintComposer
        ::compose($reader);

    $finish = hrtime(true);
    $duration = $finish - $start;
    $duration = number_format($duration);
    echo "First element parsing duration of $filename is $duration ns" .
        PHP_EOL;

    $reader->close();
}

$files = [
    'temp-465b.xml',
    'temp-429Kb.xml',
    'temp-429Mb.xml',
];

echo 'Warm up OPcache' . PHP_EOL;
parseFirstElement(current($files));

echo 'Benchmark is starting' . PHP_EOL;
foreach ($files as $filename) {
    parseFirstElement($filename);
}
echo 'Benchmark was finished' . PHP_EOL;

Output

Warm up OPcache
First element parsing duration of temp-465b.xml is 1,250,700 ns
Benchmark is starting
First element parsing duration of temp-465b.xml is 114,400 ns
First element parsing duration of temp-429Kb.xml is 132,400 ns
First element parsing duration of temp-429Mb.xml is 119,900 ns
Benchmark was finished

Fast XML parsing with a callback that detects the desired elements

$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<CARPLACES>
    <CARPLACE
            ID="11356925"
            OBJECTID="20318444"
            OBJECTGUID="6e237b93-09d6-4adf-9567-e9678608543b"
            CHANGEID="31810106"
            NUMBER="1"
            OPERTYPEID="10"
            PREVID="0"
            NEXTID="0"
            UPDATEDATE="2019-07-09"
            STARTDATE="2019-07-09"
            ENDDATE="2079-06-06"
            ISACTUAL="1"
            ISACTIVE="1"
    />
    <CARPLACE
            ID="11361653"
            OBJECTID="20326793"
            OBJECTGUID="11d9f79b-be6f-43dc-bdcc-70bbfc9f86b0"
            CHANGEID="31822630"
            NUMBER="1"
            OPERTYPEID="10"
            PREVID="0"
            NEXTID="0"
            UPDATEDATE="2019-07-30"
            STARTDATE="2019-07-30"
            ENDDATE="2079-06-06"
            ISACTUAL="1"
            ISACTIVE="1"
    />
    <CARPLACE
            ID="94824"
            OBJECTID="101032823"
            OBJECTGUID="4f37e0eb-141f-4c19-b416-0ec85e2e9e76"
            CHANGEID="192339336"
            NUMBER="0"
            OPERTYPEID="10"
            PREVID="0"
            NEXTID="0"
            UPDATEDATE="2021-04-22"
            STARTDATE="2021-04-22"
            ENDDATE="2079-06-06"
            ISACTUAL="1"
            ISACTIVE="1"
    />
</CARPLACES>
XML;

$reader = XMLReader::XML($xml);

$extractor = \SbWereWolf\XmlNavigator\Parsing\FastXmlParser
    ::extractPrettyPrint(
        $reader,
        /* callback that detects the desired elements */
        function (XMLReader $cursor) {
            return $cursor->name === 'CARPLACE';
        }
    );

$results = [];
foreach ($extractor as $result) {
    $results[] = $result;
}
$reader->close();

echo json_encode($results, JSON_PRETTY_PRINT) . PHP_EOL;

The console output will be

[
  {
    "CARPLACE": {
      "@attributes": {
        "ID": "11356925",
        "OBJECTID": "20318444",
        "OBJECTGUID": "6e237b93-09d6-4adf-9567-e9678608543b",
        "CHANGEID": "31810106",
        "NUMBER": "1",
        "OPERTYPEID": "10",
        "PREVID": "0",
        "NEXTID": "0",
        "UPDATEDATE": "2019-07-09",
        "STARTDATE": "2019-07-09",
        "ENDDATE": "2079-06-06",
        "ISACTUAL": "1",
        "ISACTIVE": "1"
      }
    }
  },
  {
    "CARPLACE": {
      "@attributes": {
        "ID": "11361653",
        "OBJECTID": "20326793",
        "OBJECTGUID": "11d9f79b-be6f-43dc-bdcc-70bbfc9f86b0",
        "CHANGEID": "31822630",
        "NUMBER": "1",
        "OPERTYPEID": "10",
        "PREVID": "0",
        "NEXTID": "0",
        "UPDATEDATE": "2019-07-30",
        "STARTDATE": "2019-07-30",
        "ENDDATE": "2079-06-06",
        "ISACTUAL": "1",
        "ISACTIVE": "1"
      }
    }
  },
  {
    "CARPLACE": {
      "@attributes": {
        "ID": "94824",
        "OBJECTID": "101032823",
        "OBJECTGUID": "4f37e0eb-141f-4c19-b416-0ec85e2e9e76",
        "CHANGEID": "192339336",
        "NUMBER": "0",
        "OPERTYPEID": "10",
        "PREVID": "0",
        "NEXTID": "0",
        "UPDATEDATE": "2021-04-22",
        "STARTDATE": "2021-04-22",
        "ENDDATE": "2079-06-06",
        "ISACTUAL": "1",
        "ISACTIVE": "1"
      }
    }
  }
]
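
The idea of callback-driven extraction can be sketched with plain XMLReader and a generator. This is only an illustration under stated assumptions, not how FastXmlParser is implemented: `extractMatching()` is a hypothetical helper, the `OTHER` element is added to show filtering, and only attributes are collected.

```php
<?php
// Sketch of callback-driven streaming extraction with plain XMLReader.
// extractMatching() is a hypothetical helper, not part of xml-navigator.

function extractMatching(XMLReader $reader, callable $detector): Generator
{
    while ($reader->read()) {
        // Only opening element nodes accepted by the callback are extracted.
        if ($reader->nodeType !== XMLReader::ELEMENT || !$detector($reader)) {
            continue;
        }
        $name = $reader->name;
        $attributes = [];
        // Walk the attributes of the current element, then return to it.
        while ($reader->moveToNextAttribute()) {
            $attributes[$reader->name] = $reader->value;
        }
        $reader->moveToElement();

        yield [$name => ['@attributes' => $attributes]];
    }
}

$xml = <<<XML
<CARPLACES>
    <CARPLACE ID="11356925" NUMBER="1"/>
    <OTHER ID="1"/>
    <CARPLACE ID="94824" NUMBER="0"/>
</CARPLACES>
XML;

$reader = XMLReader::XML($xml);
$detector = fn(XMLReader $cursor): bool => $cursor->name === 'CARPLACE';

$ids = [];
foreach (extractMatching($reader, $detector) as $item) {
    $ids[] = $item['CARPLACE']['@attributes']['ID'];
}
$reader->close();

echo implode(',', $ids) . PHP_EOL; // 11356925,94824
```

Because the generator yields one element at a time, the caller can stop iterating early and never pays for the rest of the document.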

XML document as an array

XmlConverter implements the array approach.

XmlConverter can be used to convert an XML document to an array, for example:

$xml = <<<XML
<complex>
    <a empty=""/>
    <b val="x"/>
    <b val="y"/>
    <b val="z"/>
    <c>0</c>
    <c v="o"/>
    <c/>
    <different/>
</complex>
XML;

$converter = new \SbWereWolf\XmlNavigator\Convertation\XmlConverter(
    \SbWereWolf\XmlNavigator\General\Notation::VAL,
    \SbWereWolf\XmlNavigator\General\Notation::ATTR,
);
$arrayRepresentationOfXml =
    $converter->toPrettyPrint($xml);
echo var_export($arrayRepresentationOfXml,true);

Output

array (
    'complex' =>
        array (
            'a' =>
                array (
                    '@attributes' =>
                        array (
                            'empty' => '',
                        ),
                ),
            'b' =>
                array (
                    0 =>
                        array (
                            '@attributes' =>
                                array (
                                    'val' => 'x',
                                ),
                        ),
                    1 =>
                        array (
                            '@attributes' =>
                                array (
                                    'val' => 'y',
                                ),
                        ),
                    2 =>
                        array (
                            '@attributes' =>
                                array (
                                    'val' => 'z',
                                ),
                        ),
                ),
            'c' =>
                array (
                    0 =>
                        array (
                            '@value' => '0',
                        ),
                    1 =>
                        array (
                            '@attributes' =>
                                array (
                                    'v' => 'o',
                                ),
                        ),
                    2 =>
                        array (
                        ),
                ),
            'different' =>
                array (
                ),
        ),
);
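
A short sketch of how an array of this shape might be read. The array is written out by hand, copied from the output above, so the snippet runs without the library; the fallback text `(no value)` is an assumption made for this example.

```php
<?php
// Sketch: reading values from an array shaped like the output above.
// The array is written by hand so the snippet runs without the library.
$array = [
    'complex' => [
        'a' => ['@attributes' => ['empty' => '']],
        'b' => [
            ['@attributes' => ['val' => 'x']],
            ['@attributes' => ['val' => 'y']],
            ['@attributes' => ['val' => 'z']],
        ],
        'c' => [
            ['@value' => '0'],
            ['@attributes' => ['v' => 'o']],
            [],
        ],
        'different' => [],
    ],
];

// Repeated elements become a list, so collect every `val` of <b>.
$values = array_map(
    fn(array $b): string => $b['@attributes']['val'],
    $array['complex']['b']
);
echo implode(',', $values) . PHP_EOL; // x,y,z

// An element may lack a value or attributes, so guard each lookup with ??.
$report = [];
foreach ($array['complex']['c'] as $index => $c) {
    $report[] = "c[$index]: " . ($c['@value'] ?? '(no value)');
}
echo implode(PHP_EOL, $report) . PHP_EOL;
```

Note that an empty element such as `different` is an empty array, so checking for the `@value` and `@attributes` keys before use avoids undefined-index warnings.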

XML document as an object

XmlElement implements the object-oriented approach.

Navigator API

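  • name(): string // returns the name of the XML element
  • hasValue(): bool // returns true if the XML element has a value
  • value(): string // returns the value of the XML element
  • hasAttribute(string $name = ''): bool // returns true if the XML element has an attribute named $name; if $name is omitted, returns true if the XML element has any attribute
  • attributes(): XmlAttribute[] // returns all attributes of the XML element
  • get(string $name = null): string // returns the value of the attribute named $name; if $name is omitted, returns the value of a random attribute
  • hasElement(string $name = ''): bool // returns true if the XML element has a nested element named $name; if $name is omitted, returns true if the XML element has any nested element
  • elements(): IXmlElement[] // returns all nested elements
  • pull(string $name = ''): Generator // yields an IXmlElement for each nested element; if $name is given, yields only the elements named $name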

Interacting with XML as an object

$xml = <<<XML
<doc attrib="a" option="o" >
    <base/>
    <valuable>element value</valuable>
    <complex>
        <a empty=""/>
        <b val="x"/>
        <b val="y"/>
        <b val="z"/>
        <c>0</c>
        <c v="o"/>
        <c/>
        <different/>
    </complex>
</doc>
XML;

$content = \SbWereWolf\XmlNavigator\Convertation\FastXmlToArray
::convert($xml);
$navigator = 
new \SbWereWolf\XmlNavigator\Navigation\XmlElement($content);

/* get name of element */
echo $navigator->name() . PHP_EOL;
/* doc */

/* get value of element */
echo "`{$navigator->value()}`" . PHP_EOL;
/* `` */

/* get list of attributes */
$attributes = $navigator->attributes();
foreach ($attributes as $attribute) {
    /** @var \SbWereWolf\XmlNavigator\Navigation\IXmlAttribute $attribute */
    echo "`{$attribute->name()}` `{$attribute->value()}`" . PHP_EOL;
}
/*
`attrib` `a`
`option` `o`
*/

/* get value of attribute */
echo $navigator->get('attrib') . PHP_EOL;
/* a */

/* get list of nested elements */
$elements = $navigator->elements();
foreach ($elements as $element) {
    echo "{$element->name()}" . PHP_EOL;
}
/*
base
valuable
complex
 */

/* get desired nested element */
/** @var \SbWereWolf\XmlNavigator\Navigation\IXmlElement $elem */
$elem = $navigator->pull('valuable')->current();
echo $elem->name() . PHP_EOL;
/* valuable */

/* get all nested elements */
foreach ($navigator->pull() as $pulled) {
    /** @var \SbWereWolf\XmlNavigator\Navigation\IXmlElement $pulled */
    echo $pulled->name() . PHP_EOL;
    /*
    base
    valuable
    complex
    */
}

/* get nested element with given name */
/** @var \SbWereWolf\XmlNavigator\Navigation\IXmlElement $nested */
$nested = $navigator->pull('complex')->current();
/* get names of all elements of nested element */
$elements = $nested->elements();
foreach ($elements as $element) {
    echo "{$element->name()}" . PHP_EOL;
}
/*
a
b
b
b
c
c
c
different
*/

/* pull all elements with name `b` */
foreach ($nested->pull('b') as $b) {
    /** @var \SbWereWolf\XmlNavigator\Navigation\IXmlElement $b */
    echo ' element with name' .
        ' `' . $b->name() .
        '` have attribute `val` with value' .
        ' `' . $b->get('val') . '`' .
        PHP_EOL;
}
/*
 element with name `b` have attribute `val` with value `x`
 element with name `b` have attribute `val` with value `y`
 element with name `b` have attribute `val` with value `z`
*/

Advanced usage

The unit tests contain more usage examples; take a look at them.

Run tests

composer test

Contacts

Volkhin Nikolay
e-mail ulfnew@gmail.com
phone +7-902-272-65-35
Telegram @sbwerewolf