mosab/search-engine-crawler

PHP search engine crawler that collects 5 pages of results and classifies them

dev-main 2022-02-12 11:58 UTC

This package is auto-updated.

Last update: 2024-09-16 11:41:36 UTC


README

Installation

You can install this package via Composer:

composer require mossab/search-engine-crawler

Alternatively, add the package to your composer.json and then run composer install:

    "require": {
        //...
        "mossab/search-engine-crawler": "dev-main"
    }

Usage

First, add your API key and CX to your .env file:

<?php
require __DIR__.'/vendor/autoload.php';

// Load the API key and CX from your .env file (adjust the path as needed)
(new \src\configs\Config(__DIR__.'/.env'))->load();

$myGenerator = new \src\SearchEngine();
$myGenerator->setEngine('google.com');

// Search for several keywords in one call
$results = $myGenerator->search(['flower', 'horizon']);

You will get an ArrayIterator instance whose entries contain these fields:

- keyword being searched
- ranking (the position at which the result was found on the search engine; the topmost result is 0 and the last is 50, i.e. results per page x 5)
- url of the page (as it appears in Google search)
- title of the page (as it appears in Google search)
- description (as it appears in Google search)
- promoted (a boolean value indicating whether the result is an ad or an organic result)
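
Because each entry carries a promoted flag, ads can be separated from organic results without touching the package itself. The following is a minimal sketch, not part of the package: it assumes each entry is an associative array with the fields listed above, and the sample data is made up for illustration. It uses PHP's built-in CallbackFilterIterator (PHP 7.4+ for the arrow function).

```php
<?php
// Illustrative sample data only; real entries would come from search().
$results = new ArrayIterator([
    ['keyword' => 'flower', 'ranking' => 0, 'url' => 'https://example.com/ad', 'promoted' => true],
    ['keyword' => 'flower', 'ranking' => 1, 'url' => 'https://example.com/a',  'promoted' => false],
    ['keyword' => 'flower', 'ranking' => 2, 'url' => 'https://example.com/b',  'promoted' => false],
]);

// Lazily keep only organic (non-promoted) results.
$organic = new CallbackFilterIterator(
    $results,
    fn (array $result): bool => !$result['promoted']
);

// Collect the organic URLs into a plain, re-indexed list.
$organicUrls = array_column(iterator_to_array($organic, false), 'url');

foreach ($organicUrls as $url) {
    echo $url . PHP_EOL;
}
```

Filtering through an iterator keeps the original ArrayIterator untouched, so the same result set can still be used for other views, such as listing only the ads.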

You can loop over the results:

foreach ($results as $result){
    echo "title is : ".$result['title']."<br>".
        "ranking is = ".$result['ranking']."<br>".
        "url is : ".$result['url']."<br>".
        "description is :".$result['description']."<br>".
        "keyword is :".$result['keyword']."<br>";
}
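
Since search() accepts several keywords at once, it can help to group the entries by keyword before printing them. The sketch below is an illustration under assumptions, not the package's own behavior: it assumes the keyword and ranking fields described above, and the sample data is invented.

```php
<?php
// Illustrative sample data only; real entries would come from search().
$results = new ArrayIterator([
    ['keyword' => 'horizon', 'ranking' => 1, 'title' => 'Horizon B'],
    ['keyword' => 'flower',  'ranking' => 3, 'title' => 'Flower B'],
    ['keyword' => 'flower',  'ranking' => 0, 'title' => 'Flower A'],
    ['keyword' => 'horizon', 'ranking' => 0, 'title' => 'Horizon A'],
]);

// Bucket every entry under the keyword it was found for.
$byKeyword = [];
foreach ($results as $result) {
    $byKeyword[$result['keyword']][] = $result;
}

// Sort each bucket so the topmost result (lowest ranking) comes first.
foreach (array_keys($byKeyword) as $keyword) {
    usort(
        $byKeyword[$keyword],
        fn (array $a, array $b): int => $a['ranking'] <=> $b['ranking']
    );
}

foreach ($byKeyword as $keyword => $group) {
    echo $keyword . ': ' . implode(', ', array_column($group, 'title')) . PHP_EOL;
}
```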

Search engines

The supported search engines are google.com and google.ae.
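
The README does not say what happens when an unsupported engine is passed to setEngine(), so one option is to check the value beforehand. The helper below is hypothetical and not part of the package; the constant simply mirrors the two engines listed above.

```php
<?php
// Engines this README lists as supported.
const SUPPORTED_ENGINES = ['google.com', 'google.ae'];

// Hypothetical helper (not provided by the package): normalizes the engine
// name and rejects anything outside the supported list.
function assertSupportedEngine(string $engine): string
{
    $normalized = strtolower(trim($engine));

    if (!in_array($normalized, SUPPORTED_ENGINES, true)) {
        throw new InvalidArgumentException("Unsupported search engine: {$engine}");
    }

    return $normalized;
}

echo assertSupportedEngine(' Google.AE ') . PHP_EOL;
```

The returned value could then be handed to setEngine(), for example `$myGenerator->setEngine(assertSupportedEngine($engine));`.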