ilmlv/proxy-scraper

A library for scraping free proxy lists and validating proxy functionality

v0.1-beta.3 2022-12-18 16:47 UTC

This package is auto-updated.

Last update: 2024-09-18 20:19:49 UTC


README

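This library scrapes free proxy sources and, as a separate feature, validates the functionality of individual proxies. It supports http/https/socks4/socks5 proxies.

Warning! Free public proxies are not recommended for transmitting sensitive data.

See all examples

Installation

The recommended installation method is Composer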

composer require ilmlv/proxy-scraper

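Proxy scraping sources

Currently implemented proxy sources

Feel free to request more sources.

Proxy scrapers

Note that several types of scraper classes are already provided and can be used to simplify writing your own source scraper. Currently supported source data types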
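As a rough illustration of what a custom source scraper has to do, the sketch below parses a plain-text source that publishes one `ip:port` pair per line. It is a standalone example under stated assumptions: the function name `parseProxyList`, the line format, and the IPv4-only handling are hypothetical and do not use or mirror the library's own scraper classes.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of ilmlv/proxy-scraper).
 * Turns a plain-text "ip:port" list into unique proxy URLs.
 * Assumption: IPv4 only, one entry per line, anything else is ignored.
 */
function parseProxyList(string $body, string $scheme = 'http'): array
{
    $proxies = [];

    // \R matches any line-break sequence (\n, \r\n, \r).
    foreach (preg_split('/\R/', $body) as $line) {
        $line = trim($line);

        // Keep only lines that look like "a.b.c.d:port".
        if (!preg_match('/^(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})$/', $line, $m)) {
            continue;
        }

        // Reject out-of-range octets such as 999.1.1.1.
        if (filter_var($m[1], FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) === false) {
            continue;
        }

        $port = (int) $m[2];
        if ($port < 1 || $port > 65535) {
            continue;
        }

        // Use the URL as an array key so duplicates collapse.
        $proxies[$scheme . '://' . $m[1] . ':' . $port] = true;
    }

    return array_keys($proxies);
}
```

Each returned string has the same `scheme://ip:port` shape as the proxy URL used in the validation example in this README.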

Proxy validation

This library can also be used to validate proxy functionality

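  • Anonymity level:
    • elite (the origin IP is not exposed and no proxy-related headers are present),
    • anonymous (proxy-related headers are present),
    • exposed (the origin IP is exposed)
  • Whether the proxy server IP matches the server that executes the request
  • HTTPS request support
  • Various request methods: GET, POST, PUT, OPTIONS, HEAD, DELETE, PATCH
  • A large set of request headers, checked for modification by the proxy and tested with every request method
  • Multiple public domains (amazon.com, craigslist.org, example.com, google.com, ss.com)
  • Average latency calculation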
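The three anonymity levels listed above can be sketched as a small classifier over the headers that a test endpoint reports having received. This is a minimal illustration, not the library's implementation: the function name `classifyAnonymity`, the header list, the IPv4-only matching, and the `anonymous` and `exposed` return strings are assumptions (only `elite` appears in the sample result). It also assumes the client itself did not send any proxy-related headers.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of ilmlv/proxy-scraper).
 *
 * @param array  $echoedHeaders Headers the test endpoint says it received.
 * @param string $originIp      Real public IPv4 address of the requesting machine.
 */
function classifyAnonymity(array $echoedHeaders, string $originIp): string
{
    // Illustrative subset of headers that proxies commonly add.
    $proxyHeaders = [
        'via', 'forwarded', 'x-forwarded', 'x-forwarded-for',
        'x-real-ip', 'x-proxy-id', 'client-ip', 'x-client-ip',
    ];

    // Header names are case-insensitive, so normalise them first.
    $headers = array_change_key_case($echoedHeaders, CASE_LOWER);

    // Match the IP as a whole address, so 203.0.113.7 does not match 203.0.113.77.
    $ipPattern = '/(?<![0-9.])' . preg_quote($originIp, '/') . '(?![0-9])/';

    // Exposed: the origin IP leaks in any header value.
    foreach ($headers as $value) {
        $text = is_array($value) ? implode(', ', $value) : (string) $value;
        if (preg_match($ipPattern, $text) === 1) {
            return 'exposed';
        }
    }

    // Anonymous: no IP leak, but a proxy-related header is present.
    foreach ($proxyHeaders as $name) {
        if (array_key_exists($name, $headers)) {
            return 'anonymous';
        }
    }

    // Elite: neither the origin IP nor proxy-related headers are visible.
    return 'elite';
}
```

The IP check runs first because a leaked origin IP is the weakest level regardless of which header carries it.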

Validation example

$validation = new IlmLV\ProxyScraper\Validations\ProxyValidation('http://1.1.1.1:80');
dump($validation);

Result

{
  "valid": true,
  "anonymityLevel": "elite",
  "ip": {
    "valid": true,
    "countryIsoCode": "NL",
    "organisation": "NForce Entertainment B.V."
  },
  "http": {
    "latency": 0.54314708709717,
    "get": {
      "valid": true,
      "latency": 0.19053816795349,
      "headers": {
        "A-IM": true,
        "Accept": true,
        "Accept-Charset": true,
        "Accept-Encoding": true,
        "Accept-Language": true,
        "Accept-Datetime": true,
        "Access-Control-Request-Method": true,
        "Access-Control-Request-Headers": true,
        "Authorization": true,
        "Cache-Control": true,
        "Connection": true,
        "Cookie": true,
        "Date": true,
        "Expect": true,
        "Forwarded": true,
        "From": true,
        "If-Modified-Since": true,
        "If-None-Match": true,
        "If-Range": true,
        "Max-Forwards": true,
        "Origin": true,
        "Pragma": true,
        "Range": true,
        "Referer": true,
        "TE": true,
        "User-Agent": true,
        "Upgrade": true,
        "Via": true,
        "Warning": true,
        "DNT": true,
        "X-Requested-With": true,
        "X-CSRF-Token": true,
        "X-Real-Ip": true,
        "X-Proxy-Id": true,
        "X-Forwarded": true,
        "X-Forwarded-For": true,
        "Forwarded-For": true,
        "Forwarded-For-Ip": true,
        "Client-Ip": true,
        "X-Client-Ip": true
      }
    },
    "post": {
      "valid": false,
      "latency": null,
      "error": {
        "message": "Connection to proxy closed for \"http://whoami.serviss.it/?format=json\".",
        "file": "/proxy-scraper/vendor/symfony/http-client/Chunk/ErrorChunk.php",
        "line": "56"
      },
      "headers": {}
    },
    "put": {
      "valid": true,
      "latency": 2.1179740428925,
      "headers": {...}
    },
    "options": {
      "valid": true,
      "latency": 1.0257298946381,
      "headers": {...}
    },
    "head": {
      "valid": true,
      "latency": 1.9323780536652,
      "headers": {...}
    },
    "delete": {
      "valid": true,
      "latency": 0.52144622802734,
      "headers": {...}
    },
    "patch": {
      "valid": true,
      "latency": 0.42012906074524,
      "headers": {...}
    }
  },
  "https": {
    "latency": 0.54314708709717,
    "get": {
      "valid": true,
      "latency": 0.19053816795349,
      "headers": {...}
    },
    "post": {
      "valid": false,
      "latency": null,
      "error": {
        "message": "Connection to proxy closed for \"https://whoami.serviss.it/?format=json\".",
        "file": "/proxy-scraper/vendor/symfony/http-client/Chunk/ErrorChunk.php",
        "line": "56"
      },
      "headers": []
    },
    "put": {
      "valid": true,
      "latency": 2.1179740428925,
      "headers": {...}
    },
    "options": {
      "valid": true,
      "latency": 1.0257298946381,
      "headers": {...}
    },
    "head": {
      "valid": true,
      "latency": 1.9323780536652,
      "headers": {...}
    },
    "delete": {
      "valid": true,
      "latency": 0.52144622802734,
      "headers": {...}
    },
    "patch": {
      "valid": true,
      "latency": 0.42012906074524,
      "headers": {...}
    }
  },
  "domains": {
    "amazon.com": {
      "valid": true,
      "latency": 1.7253589630127
    },
    "craigslist.org": {
      "valid": true,
      "latency": 4.507395029068
    },
    "example.com": {
      "valid": true,
      "latency": 0.4618821144104
    },
    "google.com": {
      "valid": false,
      "latency": 0.41366505622864
    },
    "ss.com": {
      "valid": true,
      "latency": 0.44051098823547
    }
  },
  "validatedAt": {
    "date": "2022-12-12 23:09:03.938495",
    "timezone_type": 3,
    "timezone": "Europe/Riga"
  }
}
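A result of this shape can be reduced to the facts a caller usually wants: which request methods failed, and how fast the working ones were. The sketch below assumes the `http` or `https` section is available as a plain PHP array with the same keys as the sample above. The function name `summarizeProtocol` is hypothetical, and the mean it computes is only an illustration, not necessarily how the library derives its own `latency` field.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of ilmlv/proxy-scraper).
 * Expects one protocol section ("http" or "https") of the result as an array.
 */
function summarizeProtocol(array $protocolResult): array
{
    $failed = [];
    $latencies = [];

    foreach ($protocolResult as $method => $entry) {
        // Skip scalar fields such as the section-level "latency".
        if (!is_array($entry)) {
            continue;
        }

        if (empty($entry['valid'])) {
            $failed[] = $method;
            continue;
        }

        // Failed methods carry a null latency, so only valid ones are averaged.
        if (isset($entry['latency'])) {
            $latencies[] = (float) $entry['latency'];
        }
    }

    return [
        'failed' => $failed,
        'meanLatency' => $latencies ? array_sum($latencies) / count($latencies) : null,
    ];
}
```

For the sample above, `failed` would contain only `post` for both sections, since that is the only method reported with `"valid": false`.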

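To do

  • Add validation of custom domains
  • Reduce dependencies
  • Test and improve support for a wider range of PHP versions
  • Improve documentation
  • Tighten parameter strictness
  • Add more proxy sources
  • Create functional tests
  • Monitor test coverage
  • Extend PHP compatibility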