denis-korolev / opencorpora
A library for deserializing Opencorpora export file data from XML into objects
v1.0.0
2022-01-18 20:29 UTC
Requires
- php: ^7.4
- ext-dom: *
- ext-simplexml: *
- jms/serializer: ^3.17
- symfony/cache: ^5.4
Requires (Dev)
- ext-xmlreader: *
- overtrue/phplint: ^2.0
- phpunit/phpunit: ^9.2
- roave/security-advisories: dev-master
- squizlabs/php_codesniffer: ^3.5
- vimeo/psalm: ^3.8
This package is auto-updated.
Last update: 2024-09-19 02:27:24 UTC
README
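A library for deserializing Opencorpora export file data from XML into objects
This library helps you read Opencorpora export files. It provides 5 processors:
- GrammemeProcessor (reads only Grammeme nodes)
- LemmaProcessor (reads only Lemma nodes)
- LinksProcessor (reads only Links nodes)
- LinkTypeProcessor (reads only LinkType nodes)
- RestrictionProcessor (reads only Restr nodes)
Use these processors to extract the XML data into simple DTO objects. The XML file is read one node at a time using PHP's XMLReader and SimpleXMLElement, so the whole file is never loaded into memory at once and memory use stays low.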
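The node-by-node reading described above can be sketched roughly as follows. This is a minimal illustration of the general XMLReader plus SimpleXMLElement streaming pattern, not the library's actual implementation; the function name `streamNodes` and its parameters are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): yields every element
 * named $nodeName as a standalone SimpleXMLElement, one at a time.
 */
function streamNodes(string $fileName, string $nodeName): \Generator
{
    $reader = new \XMLReader();
    if (!$reader->open($fileName)) {
        throw new \RuntimeException("Cannot open XML file: {$fileName}");
    }

    try {
        // XMLReader walks the document as a forward-only cursor,
        // so only the current node is held in memory.
        while ($reader->read()) {
            if ($reader->nodeType === \XMLReader::ELEMENT && $reader->name === $nodeName) {
                // Materialize just this one element (with its children)
                // so it can be inspected or handed to a deserializer.
                yield new \SimpleXMLElement($reader->readOuterXml());
            }
        }
    } finally {
        $reader->close();
    }
}
```

Iterating with `foreach (streamNodes($fileName, 'grammeme') as $node)` would then process one element per iteration; the node name `grammeme` is an assumption about the export file's element naming.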
Installation / Usage
Install the latest version with composer:
composer require denis-korolev/opencorpora
Below is a usage example for GrammemeProcessor. The other processors are used the same way.
use JMS\Serializer\Naming\IdenticalPropertyNamingStrategy;
use JMS\Serializer\Naming\SerializedNameAnnotationStrategy;
use JMS\Serializer\SerializerBuilder;
use Opencorpora\Dictionary\Grammeme;
use Opencorpora\GrammemeProcessor;

$serializer = SerializerBuilder::create()
    ->setPropertyNamingStrategy(
        new SerializedNameAnnotationStrategy(
            new IdenticalPropertyNamingStrategy()
        )
    )
    ->build();

// path to file
$fileName = $this->projectDir . DIRECTORY_SEPARATOR . 'var' . DIRECTORY_SEPARATOR . 'dict.opcorpora.xml';

$processor = new GrammemeProcessor($serializer);

foreach ($processor->getData($fileName) as $grammeme) {
    /**
     * @var $grammeme Grammeme
     */
    echo $grammeme->name;
    echo $grammeme->parent;
    echo $grammeme->description;
    echo $grammeme->alias;
}