uatthaphon / g-crawler
A simple PHP web crawler that wraps Guzzle and DomCrawler
dev-master
2019-06-29 05:32 UTC
Requires
- php: ^7.2
- guzzlehttp/guzzle: ^6.3
- symfony/css-selector: ^4.2
- symfony/dom-crawler: ^4.2
Requires (Dev)
- phpunit/phpunit: ^8.1
This package is auto-updated.
Last update: 2024-09-29 05:42:17 UTC
README
A simple PHP web crawler that wraps Guzzle and DomCrawler
Installation
Add the package as a dependency of your project:
composer require uatthaphon/g-crawler
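Usage
In your PHP project
Once GCrawler is included in your project, you can add it to any class with a simple initialization.

    use GCrawler\GCrawler;

    class Example
    {
        protected $_gCrawler;

        public function __construct()
        {
            // Empty config: no custom options
            $config = [];
            $this->_gCrawler = new GCrawler($config);
        }

        public function run()
        {
            // Fetch the page and get a crawler for its DOM
            $crawler = $this->_gCrawler->crawler('https://www.example.com/');

            // Collect the text of every matching node into an array
            $text = $crawler->filter('div.here')
                ->each(function ($node) {
                    return $node->text();
                });

            return $text;
        }
    }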
Or initialize it with a configuration:

    use GCrawler\GCrawler;

    class Example
    {
        protected $_gCrawler;

        public function __construct()
        {
            // Request headers sent with each crawl
            $config = [
                'headers' => [
                    'User-Agent' => 'testing/1.0',
                    'Accept'     => 'application/json',
                    'X-Foo'      => ['Bar', 'Baz'],
                ]
            ];
            $this->_gCrawler = new GCrawler($config);
        }

        public function run()
        {
            // Fetch the page and get a crawler for its DOM
            $crawler = $this->_gCrawler->crawler('https://www.example.com/');

            // Collect the text of every matching node into an array
            $text = $crawler->filter('div.here')
                ->each(function ($node) {
                    return $node->text();
                });

            return $text;
        }
    }
License
g-crawler is released under the MIT License.