maxlen/webcrawler

Search engine crawler

Installs: 10

Dependents: 0

Suggesters: 0

Security: 0

Stars: 0

Watchers: 1

Forks: 0

Open Issues: 0

Type: extension

dev-master 2017-03-29 13:46 UTC

This package is not auto-updated.

Last update: 2024-09-14 19:20:21 UTC


README

Search engine crawler

Google example

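  // Optional proxy settings; leave the array empty to skip the proxy.
  // Shape: ['host' => '*.*.*.*', 'port' => '', 'login' => '', 'password' => '']
  $proxy = [];
  $page = 1; // results page to request (assumption: numbering starts at 1)
  $params = ['query' => 'test search', 'page' => $page, 'proxy' => $proxy];
  // Use the Google search strategy and print the crawl result.
  $crawler = new WebCrawler(['strategy' => new GoogleSearch()]);
  print_r($crawler->crawl($params));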

Site parsing example

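  // URL of the site to parse; no proxy settings are supplied here.
  $params = ['url' => 'http://your-site.com', 'proxy' => []];
  // Use the site parsing strategy and print the crawl result.
  $crawler = new WebCrawler(['strategy' => new SiteSearch()]);
  print_r($crawler->crawl($params));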