nickmoline/robots-checker

Class to check whether a URL is excluded from indexing, using all possible robots exclusion methods

v1.0.5 2018-04-03 21:27 UTC

This package is auto-updated.

Last update: 2024-09-07 00:53:32 UTC


README

These classes allow you to check every method by which a URL can be excluded from indexing.

You can instantiate the following classes:

  • NickMoline\Robots\RobotsTxt : Checks the robots.txt file for the given URL
  • NickMoline\Robots\Status : Checks the HTTP status code of the URL for indexability
  • NickMoline\Robots\Header : Checks the HTTP X-Robots-Tag header
  • NickMoline\Robots\Meta : Checks the <meta name="robots"> tag (as well as bot-specific tags)
  • NickMoline\Robots\All : Wrapper class that runs all of the above checks

Example Usage

<?php

require "vendor/autoload.php";           // Composer autoloader

use NickMoline\Robots\RobotsTxt;
use NickMoline\Robots\Header as RobotsHeader;
use NickMoline\Robots\All as RobotsAll;

$checker = new RobotsTxt("http://www.example.com/test.html");
$allowed = $checker->verify();                              // By default it checks Googlebot
$allowed = $checker->setUserAgent("bingbot")->verify();     // Checks to see if blocked for bingbot by robots.txt file

echo $checker->getReason();              // Get the reason the url is allowed or denied

$checker2 = new RobotsHeader("http://www.example.com/test.html");
$allowed = $checker2->verify();          // Same as above but will test the X-Robots-Tag HTTP headers

$checkerAll = new RobotsAll("http://www.example.com/test.html");
$allowed = $checkerAll->verify();        // This one runs all of the available tests
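The Status and Meta checkers listed above are not shown in the example. A minimal sketch, assuming they share the same constructor and the `verify()` / `setUserAgent()` / `getReason()` API demonstrated for the other checkers:

```php
<?php

require "vendor/autoload.php";           // Composer autoloader

use NickMoline\Robots\Status;
use NickMoline\Robots\Meta;

// Check whether the HTTP status code allows indexing
$statusChecker = new Status("http://www.example.com/test.html");
if (!$statusChecker->verify()) {
    echo $statusChecker->getReason();    // e.g. a non-indexable status code
}

// Check the <meta name="robots"> tag for several bots in turn
$metaChecker = new Meta("http://www.example.com/test.html");
foreach (["googlebot", "bingbot"] as $bot) {
    $allowed = $metaChecker->setUserAgent($bot)->verify();
    echo $bot . ": " . ($allowed ? "allowed" : "blocked") . "\n";
}
```

Note that these checkers fetch the live URL, so they require network access and the package installed via Composer.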