imarc / crawler
Crawl a website and retrieve all of its links
0.2.0
2017-11-01 19:57 UTC
Requires
- league/csv: ^9.0
- pimple/pimple: ^3.2
- respect/validation: ^1.1
- spatie/crawler: ^2.5
- symfony/console: ^3.3
Requires (Dev)
- codeception/codeception: ^2.3
This package is not auto-updated.
Last update: 2024-09-20 08:00:47 UTC
README
Crawls a website and performs an operation on each URL it finds. By default it writes one URL per line to a txt file. This can easily be extended to perform many other operations: you only need to create a new Observer (src/Observer).
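The sketch below shows what such an observer could look like, assuming the package's observers follow the CrawlObserver interface from its spatie/crawler ^2.5 dependency; the actual base class under src/Observer, as well as the class name and output format used here, are assumptions for illustration only.

<?php

use Spatie\Crawler\CrawlObserver;
use Spatie\Crawler\Url;

// Hypothetical observer that records each crawled URL with its HTTP status code.
class StatusCodeObserver implements CrawlObserver
{
    /** @var array Map of URL string => status code (null on failure). */
    private $results = [];

    public function willCrawl(Url $url)
    {
        // Nothing to do before the request is made.
    }

    public function hasBeenCrawled(Url $url, $response, Url $foundOnUrl = null)
    {
        // $response is a PSR-7 response, or null if the request failed.
        $this->results[(string) $url] = $response ? $response->getStatusCode() : null;
    }

    public function finishedCrawling()
    {
        // Emit one "URL,status" line per crawled page once the crawl ends.
        foreach ($this->results as $url => $status) {
            echo $url . ',' . ($status !== null ? $status : 'failed') . PHP_EOL;
        }
    }
}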
Installation
composer require imarc/crawler
Usage
From your project directory: ./vendor/bin/crawler csv URL DESTINATION
From a clone of the repository: ./crawler.php csv URL DESTINATION
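For example, to crawl a site and write its URLs to a CSV file (the URL and file name here are placeholders):
./vendor/bin/crawler csv https://example.com urls.csv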
Options
crawler --help
Usage:
csv [options] [--] <url> <destination>
Arguments:
url URL to crawl.
destination Write CSV to FILE
Options:
-s, --show-progress Show the crawl's progress
-e, --crawl-external Crawl external URLs
-q, --quiet Do not output any message
--exclude=EXCLUDE Exclude certain extensions [default: ["css","gif","ico","jpg","jpg","js","pdf","pdf","png","rss","txt"]] (multiple values allowed)
-h, --help Display this help message
-V, --version Display this application version
--ansi Force ANSI output
--no-ansi Disable ANSI output
-n, --no-interaction Do not ask any interactive question
-v|vv|vvv, --verbose Increase the verbosity of messages: 1 for normal output, 2 for more verbose output and 3 for debug
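For instance, to display progress while crawling and to also follow external URLs (again with a placeholder URL and output path):
./vendor/bin/crawler csv --show-progress --crawl-external https://example.com urls.csv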
Testing
codecept run