README

X-Robots-Tag HTTP header parser

A PHP class for parsing X-Robots-Tag HTTP headers according to Google's X-Robots-Tag HTTP header specification.

Requirements

PHP >=5.6
PHP mbstring extension

Note: HHVM support is planned once facebook/hhvm#4277 is fixed.

Installation

The library is available via Composer. Add it to your composer.json file:

{
    "require": {
        "vipnytt/robotstagparser": "~0.2"
    }
}

Then run composer update.

Getting started

Basic example

Get all rules that affect you. This includes:

All generic rules
Rules specific to your User-Agent (if any)

use vipnytt\XRobotsTagParser;

$headers = [
    'X-Robots-Tag: noindex, noodp',
    'X-Robots-Tag: googlebot: noindex, noarchive',
    'X-Robots-Tag: bingbot: noindex, noarchive, noimageindex'
];

$parser = new XRobotsTagParser('myUserAgent', $headers);
$rules = $parser->getRules(); // <-- returns an array of rules
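
To make the selection described above concrete, here is a minimal standalone sketch of how generic and User-Agent-specific rules could be combined. It is not the library's implementation: selectRules() is a hypothetical helper, and the sketch assumes only simple, valueless directives (a value such as unavailable_after with a date is not handled).

```php
<?php
// Illustrative sketch only; selectRules() is a hypothetical helper
// and is NOT part of vipnytt/robotstagparser.

/**
 * Collect the directives that apply to $userAgent from raw header lines.
 * Assumption: any text before the first ':' in the header value is a
 * user-agent prefix; directives carrying their own value are not supported.
 */
function selectRules(array $headers, $userAgent)
{
    $rules = [];
    foreach ($headers as $header) {
        // Skip anything that is not an X-Robots-Tag header (case-insensitive).
        if (stripos($header, 'X-Robots-Tag:') !== 0) {
            continue;
        }
        $value = trim(substr($header, strlen('X-Robots-Tag:')));

        // Detect an optional "<user-agent>:" prefix.
        $parts = explode(':', $value, 2);
        if (count($parts) === 2) {
            if (strcasecmp(trim($parts[0]), $userAgent) !== 0) {
                continue; // the rule targets a different user agent
            }
            $value = $parts[1];
        }

        // Split the comma-separated directives and de-duplicate them.
        foreach (explode(',', $value) as $directive) {
            $directive = strtolower(trim($directive));
            if ($directive !== '') {
                $rules[$directive] = true;
            }
        }
    }
    return array_keys($rules);
}
```

With the $headers array from the basic example, calling selectRules($headers, 'googlebot') would combine the generic rules with the googlebot-specific ones and skip the bingbot line.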

Alternative approaches

Fetch the HTTP headers by requesting a URL

use vipnytt\XRobotsTagParser;

$parser = new XRobotsTagParser\Adapters\Url('http://example.com/', 'myUserAgent');
$rules = $parser->getRules();

Use your existing GuzzleHttp request

use vipnytt\XRobotsTagParser;
use GuzzleHttp\Client;

$client = new Client(); // resolved through the `use GuzzleHttp\Client;` import above
$response = $client->request('GET', 'http://example.com/');

$parser = new XRobotsTagParser\Adapters\GuzzleHttp($response, 'myUserAgent');
$array = $parser->getRules();

Provide the HTTP headers as a string

use vipnytt\XRobotsTagParser;

$string = <<<STRING
HTTP/1.1 200 OK
Date: Tue, 25 May 2010 21:42:43 GMT
X-Robots-Tag: noindex
X-Robots-Tag: nofollow
STRING;

$parser = new XRobotsTagParser\Adapters\TextString($string, 'myUserAgent');
$array = $parser->getRules();
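
As a rough idea of what handling a raw header string involves, the sketch below pulls the X-Robots-Tag lines out of a response header block. It is a standalone illustration, not the TextString adapter's code: extractRobotsHeaders() is a hypothetical helper, and it assumes one header per line with no folded (continued) header lines.

```php
<?php
// Illustrative sketch only; extractRobotsHeaders() is a hypothetical helper
// and is NOT part of vipnytt/robotstagparser.

/**
 * Return every X-Robots-Tag line found in a raw HTTP header block.
 * Assumption: each header sits on its own line (no header folding).
 */
function extractRobotsHeaders($raw)
{
    $found = [];
    // Header lines may be separated by CRLF, CR, or LF.
    foreach (preg_split('/\r\n|\r|\n/', $raw) as $line) {
        // Field names are case-insensitive, so compare without regard to case.
        if (stripos($line, 'X-Robots-Tag:') === 0) {
            $found[] = trim($line);
        }
    }
    return $found;
}
```

The resulting array has the same shape as the $headers array in the basic example, so it could be passed on to whatever consumes a list of header lines.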

Export all rules

Returns an array containing all rules for all User-Agents.

use vipnytt\XRobotsTagParser;

$parser = new XRobotsTagParser('myUserAgent', $headers);
$array = $parser->export();

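Directives

all - There are no restrictions for indexing or serving.
none - Equivalent to noindex and nofollow.
noindex - Do not show this page in search results, and do not show a "Cached" link in search results.
nofollow - Do not follow the links on this page.
noarchive - Do not show a "Cached" link in search results.
nosnippet - Do not show a snippet of this page in search results.
noodp - Do not use metadata from the Open Directory Project for the title or snippet shown for this page.
notranslate - Do not offer a translation of this page in search results.
noimageindex - Do not index images on this page.
unavailable_after - Do not show this page in search results after the specified date/time.

Source: https://developers.google.com/webmasters/control-crawl-index/docs/robots_meta_tag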
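
The relationships between the directives listed above (none standing for noindex plus nofollow, noindex also hiding the "Cached" link, all adding no restriction) can be sketched as a small normalization step. This is only an illustration under those stated readings: expandDirectives() is a hypothetical helper, and whether the library performs such an expansion is not confirmed here.

```php
<?php
// Illustrative sketch only; expandDirectives() is a hypothetical helper
// and is NOT part of vipnytt/robotstagparser.

/**
 * Expand shorthand directives into the individual restrictions they imply.
 */
function expandDirectives(array $directives)
{
    $expanded = [];
    foreach ($directives as $directive) {
        $directive = strtolower(trim($directive));
        if ($directive === 'none') {
            // "none" is equivalent to noindex and nofollow.
            $expanded['noindex'] = true;
            $expanded['nofollow'] = true;
        } elseif ($directive !== '' && $directive !== 'all') {
            // "all" imposes no restrictions, so it contributes nothing.
            $expanded[$directive] = true;
        }
    }
    // Per the list above, noindex also suppresses the "Cached" link.
    if (isset($expanded['noindex'])) {
        $expanded['noarchive'] = true;
    }
    return array_keys($expanded);
}
```

For example, expandDirectives(['none']) yields noindex, nofollow, and noarchive, while expandDirectives(['all']) yields an empty array.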
