knyga / webextractor
Extract data from web pages using different extractors (CSS, XPath, regex, etc.)
dev-master / 1.1.2.x-dev
2014-11-28 13:47 UTC
Requires
- php: >= 5.4
- diggin/diggin-bridge-guzzle-autocharsetencodingplugin: dev-master
- diggin/diggin-http-charset: dev-master
- fabpot/goutte: 1.0.*
- knyga/dotconfig: 1.0.*@dev
- symfony/event-dispatcher: 2.4.*
- tedivm/stash: 0.11.*
Requires (Dev)
- phpunit/phpunit: 3.7.*
This package is not auto-updated.
Last update: 2020-01-10 14:59:31 UTC
README
Extract data from web pages using different extractors (CSS, XPath, regex, etc.)
Example
Code
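<?php
// Assumption: Composer's autoloader is at its default location (see the installation steps).
require 'vendor/autoload.php';

use WebExtractor\DataExtractor\DataExtractorFactory;
use WebExtractor\DataExtractor\DataExtractorTypes;
use WebExtractor\Client\Client;

// Create a CSS-selector-based extractor.
$factory = DataExtractorFactory::getFactory();
$extractor = $factory->createDataExtractor(DataExtractorTypes::CSS);

// Fetch the page to extract from.
$client = new Client;
$content = $client->get('https://en.wikipedia.org/wiki/2014_Winter_Olympics');

// Select the <h1> element and extract it.
$extractor->setContent($content);
$h1 = $extractor->setSelector('h1')->extract();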
For more examples, see the tests.
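To make the difference between the extractor kinds concrete, here is a minimal sketch of what XPath, CSS-style, and regex extraction each do. It uses only PHP's built-in DOMDocument, DOMXPath, and preg_match, not WebExtractor's own classes, so it says nothing about how the library is implemented. The HTML fragment and the variable names are made up for illustration.

```php
<?php
// Illustrative only: built-in PHP APIs, not WebExtractor's API.
// The HTML fragment below is a hypothetical stand-in for a fetched page.
$html = '<html><body>'
      . '<h1 id="firstHeading">2014 Winter Olympics</h1>'
      . '<p class="dates">7-23 February 2014</p>'
      . '</body></html>';

// Parse the markup; collect libxml warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);

// XPath extraction: text of the first <h1> element.
$h1 = $xpath->query('//h1')->item(0)->textContent;

// CSS-style extraction: the selector "p.dates" written by hand as XPath.
$datesQuery = '//p[contains(concat(" ", normalize-space(@class), " "), " dates ")]';
$dates = $xpath->query($datesQuery)->item(0)->textContent;

// Regex extraction: the first four-digit number in the extracted text.
$year = preg_match('/\b(\d{4})\b/', $dates, $m) === 1 ? $m[1] : null;

echo $h1, PHP_EOL;
echo $dates, PHP_EOL;
echo $year, PHP_EOL;
```

A CSS selector is usually the shortest way to address an element, XPath can express conditions on structure and attributes, and a regex is useful for pulling a value out of text that has already been extracted.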
Installation via Composer
- Install Composer in your project's root directory:
  curl -sS https://getcomposer.org/installer | php
- Add a composer.json file to your project:
  {
      "require": {
          "knyga/webextractor": "1.1.2.*@dev"
      }
  }
- Run the Composer installer:
  php composer.phar install
License
WebExtractor is released under the MIT License.
Oleksandr Knyga oleksandrknyga@gmail.com
Sobit Akhmedov sobit.akhmedov@gmail.com