README

PHP Scraper

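An opinionated and deliberately limited way to scrape the web using PHP. The main goal is to get things done instead of getting distracted by XPath selectors, preparing data structures, and similar chores. You simply "go to a website" and get back an array with all the details relevant to your scraping project.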

Under the hood, it uses Goutte and a few other packages. See composer.json.

Sponsors

This project is sponsored by the following organizations:

Want to sponsor this project? Contact me.

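Examples

Here are a few examples of how the library works. More examples are available on the project website.

Scrape the Title of a Website

All scraping functionality can be accessed either as a function call or as a property call. For example, scraping the title looks like this: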

$web = new \spekulatius\phpscraper();

$web->go('https://google.com');

// Returns "Google"
echo $web->title;

// Also returns "Google"
echo $web->title();
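
The dual access style shown above can be approximated in plain PHP with the `__get` magic method. The following is a minimal sketch under stated assumptions, not necessarily how the library implements it: the class `TitleHolder`, its `EXPOSED` allowlist, and the regex-based title lookup are all hypothetical and exist only for illustration.

```php
<?php

// Hypothetical stand-in class; NOT the library's actual implementation.
class TitleHolder
{
    // Names that may be read either as a property or as a method (assumption).
    private const EXPOSED = ['title'];

    private string $html;

    public function __construct(string $html)
    {
        $this->html = $html;
    }

    // Method-style access: $page->title()
    public function title(): ?string
    {
        // Simplified lookup for the sketch; a real scraper would use a DOM parser.
        if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $this->html, $matches)) {
            return trim(html_entity_decode($matches[1]));
        }

        return null;
    }

    // Property-style access: reading $page->title is forwarded to title().
    public function __get(string $name)
    {
        if (in_array($name, self::EXPOSED, true)) {
            return $this->$name();
        }

        throw new \LogicException("Unknown property: {$name}");
    }
}

$page = new TitleHolder('<html><head><title>Example</title></head></html>');

echo $page->title, PHP_EOL;   // property call
echo $page->title(), PHP_EOL; // function call
```

Both calls go through the same method, so the two access styles cannot drift apart. The allowlist keeps `__get` from forwarding to methods that were never meant to be read as properties.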

Scrape the Images from a Website

Scraping the images, including the attributes of the img tags:

$web = new \spekulatius\phpscraper();

/**
 * Navigate to the test page.
 *
 * This page contains the image "cat.jpg" twice:
 * once with a relative path and once with an absolute path.
 */
$web->go('https://test-pages.phpscraper.de/meta/lorem-ipsum.html');

var_dump($web->imagesWithDetails);
/**
 * Contains:
 *
 * [
 *     'url' => 'https://test-pages.phpscraper.de/assets/cat.jpg',
 *     'alt' => 'absolute path',
 *     'width' => null,
 *     'height' => null,
 * ],
 * [
 *     'url' => 'https://test-pages.phpscraper.de/assets/cat.jpg',
 *     'alt' => 'relative path',
 *     'width' => null,
 *     'height' => null,
 * ]
 */
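
To illustrate the kind of work this call saves, here is a rough sketch of collecting the same fields with PHP's built-in DOM extension. It is an approximation under stated assumptions, not the library's code: the helpers `resolveUrl` and `extractImages` are hypothetical, the URL resolution is simplified (it ignores ports, query strings, and protocol-relative URLs), and the relative path `../assets/cat.jpg` in the sample markup is a guess at what the test page contains.

```php
<?php

// Hypothetical helper: resolve $src against $base (simplified, see lead-in).
function resolveUrl(string $base, string $src): string
{
    // A URL that already has a scheme is absolute; return it unchanged.
    if (is_string(parse_url($src, PHP_URL_SCHEME))) {
        return $src;
    }

    $parts = parse_url($base);
    $origin = $parts['scheme'] . '://' . $parts['host'];

    if (strpos($src, '/') === 0) {
        // Root-relative path: replaces the base path entirely.
        $path = $src;
    } else {
        // Otherwise resolve relative to the directory of the base path.
        $dir = preg_replace('#/[^/]*$#', '/', $parts['path'] ?? '/');
        $path = $dir . $src;
    }

    // Collapse "." and ".." segments.
    $resolved = [];
    foreach (explode('/', $path) as $segment) {
        if ($segment === '..') {
            array_pop($resolved);
        } elseif ($segment !== '.' && $segment !== '') {
            $resolved[] = $segment;
        }
    }

    return $origin . '/' . implode('/', $resolved);
}

// Hypothetical helper: list every <img> with an absolute URL and basic attributes.
function extractImages(string $html, string $baseUrl): array
{
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $images = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $images[] = [
            'url' => resolveUrl($baseUrl, $img->getAttribute('src')),
            'alt' => $img->hasAttribute('alt') ? $img->getAttribute('alt') : null,
            'width' => $img->hasAttribute('width') ? $img->getAttribute('width') : null,
            'height' => $img->hasAttribute('height') ? $img->getAttribute('height') : null,
        ];
    }

    return $images;
}

// Assumed markup: one absolute and one relative reference to the same image.
$html = '<img src="https://test-pages.phpscraper.de/assets/cat.jpg" alt="absolute path">'
      . '<img src="../assets/cat.jpg" alt="relative path">';

$result = extractImages($html, 'https://test-pages.phpscraper.de/meta/lorem-ipsum.html');
var_dump($result);
```

Under these assumptions both entries end up with the same absolute URL, which is the normalization the `imagesWithDetails` output above reflects.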

See the full documentation for more information and examples.
