maxlen / webcrawler
Search engine crawler
dev-master
2017-03-29 13:46 UTC
Requires
- php: >=5.4.0
- electrolinux/phpquery: dev-master
- guzzlehttp/guzzle: ~6.0
This package is not auto-updated.
Last update: 2024-09-14 19:20:21 UTC
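The requirements above could be pulled into a project with a composer.json along the following lines. This is only a sketch: the package name and the dev-master constraint come from this page, while the "minimum-stability" and "prefer-stable" settings are an assumption, added because the package itself depends on a dev-master version of electrolinux/phpquery, which Composer normally refuses to install at the default stable setting.

```json
{
    "minimum-stability": "dev",
    "prefer-stable": true,
    "require": {
        "php": ">=5.4.0",
        "maxlen/webcrawler": "dev-master"
    }
}
```

With "prefer-stable" enabled, Composer should still pick stable releases for guzzlehttp/guzzle and other dependencies where they exist.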
README
Search engine crawler
Google example
$proxy = []; // optional proxy, e.g. ['host' => '*.*.*.*', 'port' => '', 'login' => '', 'password' => '']
$page = 1; // results page to fetch; not defined in the original snippet, 1 is an assumed starting value
$params = ['query' => 'test search', 'page' => $page, 'proxy' => $proxy];
$crawler = new WebCrawler(['strategy' => new GoogleSearch()]);
print_r($crawler->crawl($params));
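The README does not say how the 'proxy' array is consumed. Since the package requires guzzlehttp/guzzle, one plausible reading is that it ends up as a proxy URI of the kind Guzzle's 'proxy' request option accepts. The helper below is a hypothetical sketch of that conversion; proxyToUri is not part of maxlen/webcrawler, and the http:// scheme is an assumption.

```php
<?php
// Hypothetical helper (not part of maxlen/webcrawler): turns the README's
// proxy array into a URI string such as "http://login:password@host:port".
function proxyToUri(array $proxy)
{
    // An empty array, as in the README examples, means "no proxy".
    if (empty($proxy['host'])) {
        return null;
    }

    // Credentials are optional; encode them so special characters stay valid.
    $auth = '';
    if (!empty($proxy['login'])) {
        $auth = rawurlencode($proxy['login']);
        if (isset($proxy['password']) && $proxy['password'] !== '') {
            $auth .= ':' . rawurlencode($proxy['password']);
        }
        $auth .= '@';
    }

    // The port is optional as well.
    $port = !empty($proxy['port']) ? ':' . $proxy['port'] : '';

    // Assumption: an HTTP proxy; the README does not name a scheme.
    return 'http://' . $auth . $proxy['host'] . $port;
}
```

A string built this way could be passed as the 'proxy' request option when creating a Guzzle client, while a null result would mean leaving that option out.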
Site parsing example
$params = ['url' => 'http://your-site.com', 'proxy' => []];
$crawler = new WebCrawler(['strategy' => new SiteSearch()]);
print_r($crawler->crawl($params));