ilmlv / proxy-scraper
A library for scraping free proxy lists and validating proxy functionality
v0.1-beta.3
2022-12-18 16:47 UTC
Requires
- php: ^8.0
- ext-dom: *
- ext-json: *
- ext-simplexml: *
- dragonmantank/cron-expression: ^3.3
- symfony/css-selector: ^6.2
- symfony/dom-crawler: ^6.1
- symfony/http-client: ^6.1
Requires (Dev)
- symfony/var-dumper: ^6.1
This package is auto-updated.
Last update: 2024-09-18 20:19:49 UTC
README
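This library scrapes free proxy sources and separately validates how well each proxy works. HTTP, HTTPS, SOCKS4, and SOCKS5 proxies are supported.
Warning! Free public proxies are not recommended for transferring sensitive data.
See all examples.
Installation
The recommended installation method is Composer: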
composer require ilmlv/proxy-scraper
Proxy scraping sources
Currently implemented proxy sources
- blogspotproxy.blogspot.com
- checkerproxy.net
- clarketm/proxy-list
- free-proxy-list.net
- free-proxy-list.net/anonymous-proxy.html
- free-proxy-list.net/uk-proxy.html
- gimmeproxy.com
- multiproxy.org
- proxyserverlist24.top
- pubproxy.com
- ShiftyTR/Proxy-List
- ShiftyTR/Proxy-List/https.txt
- ShiftyTR/Proxy-List/socks4.txt
- ShiftyTR/Proxy-List/socks5.txt
- socks-proxy.net
- sslproxies.org
- TheSpeedX/PROXY-List/http.txt
- TheSpeedX/PROXY-List/socks4.txt
- TheSpeedX/PROXY-List/socks5.txt
- us-proxy.org
Feel free to request more sources.
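Some of the sources above (for example TheSpeedX/PROXY-List/http.txt) are plain-text files. The following is a minimal sketch of turning such a file into proxy URLs, assuming one `ip:port` entry per line. `parseProxyList` is a hypothetical helper written for illustration and is not part of this library's API.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of ilmlv/proxy-scraper.
 * Converts a plain-text list with one "ip:port" entry per line into proxy URLs.
 *
 * @return string[]
 */
function parseProxyList(string $body, string $scheme = 'http'): array
{
    $proxies = [];

    foreach (preg_split('/\R/', $body) as $line) {
        $line = trim($line);

        // Accept only "IPv4:port"; blank lines, comments and other text are skipped.
        if (!preg_match('/^(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})$/', $line, $m)) {
            continue;
        }

        $validIp = filter_var($m[1], FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
        $port = (int) $m[2];

        if ($validIp && $port >= 1 && $port <= 65535) {
            $proxies[] = sprintf('%s://%s:%d', $scheme, $m[1], $port);
        }
    }

    // Public lists often repeat entries, so duplicates are dropped.
    return array_values(array_unique($proxies));
}

$sample = "1.1.1.1:80\n# comment\n999.1.1.1:80\n1.1.1.1:80\n8.8.8.8:1080\n";
print_r(parseProxyList($sample, 'socks5'));
```

The scheme is passed in by the caller because a list of this kind carries no protocol information per entry; in the source names above it is implied by the file name (http.txt, socks4.txt, socks5.txt).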
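Proxy scrapers
Several types of scraper are already provided and can be used to simplify writing your own source scraper. Currently supported source data types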
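Proxy validation
The library can also be used to validate proxy functionality:
- Anonymity level:
  - elite (the original IP is not exposed and there are no proxy-related headers),
  - anonymous (proxy-related headers are present),
  - exposed (the original IP is exposed)
- Whether the proxy server IP matches the server that performed the request
- HTTPS request support
- Various request methods: GET, POST, PUT, OPTIONS, HEAD, DELETE, PATCH
- A large set of request headers, checking that the proxy leaves them unmodified; tested with every request method
- Multiple public domains (amazon.com, craigslist.org, example.com, google.com, ss.com)
- Average latency calculation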
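The three anonymity levels listed above can be sketched as a small classifier. This is an illustration under stated assumptions, not the library's implementation: `classifyAnonymity`, the `PROXY_HEADERS` list, and the idea of comparing against headers reported back by an echo endpoint are all hypothetical, and the IP matching assumes IPv4 addresses.

```php
<?php

declare(strict_types=1);

// Assumed set of proxy-related headers (lower-case); the library's real list may differ.
const PROXY_HEADERS = [
    'via', 'forwarded', 'x-forwarded', 'x-forwarded-for', 'forwarded-for',
    'client-ip', 'x-client-ip', 'x-real-ip', 'x-proxy-id',
];

/**
 * Hypothetical classifier, not part of ilmlv/proxy-scraper.
 *
 * @param array<string, string> $seenHeaders Headers as received by the target server
 * @param string                $seenIp      Remote address the target server saw
 * @param string                $realIp      Public IPv4 address of the machine running the check
 */
function classifyAnonymity(array $seenHeaders, string $seenIp, string $realIp): string
{
    $headers = array_change_key_case($seenHeaders, CASE_LOWER);

    // Exposed: the real IP is the remote address or appears in any header value.
    // The digit lookarounds stop 9.9.9.9 from matching inside 9.9.9.90.
    $pattern = '/(?<!\d)' . preg_quote($realIp, '/') . '(?!\d)/';
    if ($seenIp === $realIp) {
        return 'exposed';
    }
    foreach ($headers as $value) {
        if (preg_match($pattern, $value) === 1) {
            return 'exposed';
        }
    }

    // Anonymous: the IP is hidden, but the request is recognisably proxied.
    if (array_intersect(array_keys($headers), PROXY_HEADERS) !== []) {
        return 'anonymous';
    }

    // Elite: no exposed IP and no proxy-related headers.
    return 'elite';
}

echo classifyAnonymity(['Via' => '1.1 proxy'], '5.5.5.5', '9.9.9.9'), PHP_EOL;
```

The checks run from most to least revealing, so a proxy that both adds a `Via` header and leaks the original IP is reported as exposed rather than anonymous.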
Validation example
$validation = new IlmLV\ProxyScraper\Validations\ProxyValidation('http://1.1.1.1:80');
dump($validation);
Result
{
    "valid": true,
    "anonymityLevel": "elite",
    "ip": {
        "valid": true,
        "countryIsoCode": "NL",
        "organisation": "NForce Entertainment B.V."
    },
    "http": {
        "latency": 0.54314708709717,
        "get": {
            "valid": true,
            "latency": 0.19053816795349,
            "headers": {
                "A-IM": true,
                "Accept": true,
                "Accept-Charset": true,
                "Accept-Encoding": true,
                "Accept-Language": true,
                "Accept-Datetime": true,
                "Access-Control-Request-Method": true,
                "Access-Control-Request-Headers": true,
                "Authorization": true,
                "Cache-Control": true,
                "Connection": true,
                "Cookie": true,
                "Date": true,
                "Expect": true,
                "Forwarded": true,
                "From": true,
                "If-Modified-Since": true,
                "If-None-Match": true,
                "If-Range": true,
                "Max-Forwards": true,
                "Origin": true,
                "Pragma": true,
                "Range": true,
                "Referer": true,
                "TE": true,
                "User-Agent": true,
                "Upgrade": true,
                "Via": true,
                "Warning": true,
                "DNT": true,
                "X-Requested-With": true,
                "X-CSRF-Token": true,
                "X-Real-Ip": true,
                "X-Proxy-Id": true,
                "X-Forwarded": true,
                "X-Forwarded-For": true,
                "Forwarded-For": true,
                "Forwarded-For-Ip": true,
                "Client-Ip": true,
                "X-Client-Ip": true
            }
        },
        "post": {
            "valid": false,
            "latency": null,
            "error": {
                "message": "Connection to proxy closed for \"http://whoami.serviss.it/?format=json\".",
                "file": "/proxy-scraper/vendor/symfony/http-client/Chunk/ErrorChunk.php",
                "line": "56"
            },
            "headers": {}
        },
        "put": {
            "valid": true,
            "latency": 2.1179740428925,
            "headers": {...}
        },
        "options": {
            "valid": true,
            "latency": 1.0257298946381,
            "headers": {...}
        },
        "head": {
            "valid": true,
            "latency": 1.9323780536652,
            "headers": {...}
        },
        "delete": {
            "valid": true,
            "latency": 0.52144622802734,
            "headers": {...}
        },
        "patch": {
            "valid": true,
            "latency": 0.42012906074524,
            "headers": {...}
        }
    },
    "https": {
        "latency": 0.54314708709717,
        "get": {
            "valid": true,
            "latency": 0.19053816795349,
            "headers": {...}
        },
        "post": {
            "valid": false,
            "latency": null,
            "error": {
                "message": "Connection to proxy closed for \"https://whoami.serviss.it/?format=json\".",
                "file": "/proxy-scraper/vendor/symfony/http-client/Chunk/ErrorChunk.php",
                "line": "56"
            },
            "headers": []
        },
        "put": {
            "valid": true,
            "latency": 2.1179740428925,
            "headers": {...}
        },
        "options": {
            "valid": true,
            "latency": 1.0257298946381,
            "headers": {...}
        },
        "head": {
            "valid": true,
            "latency": 1.9323780536652,
            "headers": {...}
        },
        "delete": {
            "valid": true,
            "latency": 0.52144622802734,
            "headers": {...}
        },
        "patch": {
            "valid": true,
            "latency": 0.42012906074524,
            "headers": {...}
        }
    },
    "domains": {
        "amazon.com": {
            "valid": true,
            "latency": 1.7253589630127
        },
        "craigslist.org": {
            "valid": true,
            "latency": 4.507395029068
        },
        "example.com": {
            "valid": true,
            "latency": 0.4618821144104
        },
        "google.com": {
            "valid": false,
            "latency": 0.41366505622864
        },
        "ss.com": {
            "valid": true,
            "latency": 0.44051098823547
        }
    },
    "validatedAt": {
        "date": "2022-12-12 23:09:03.938495",
        "timezone_type": 3,
        "timezone": "Europe/Riga"
    }
}
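A result shaped like the one above can be summarised with a few lines of plain PHP. This is a sketch that assumes the result is available as a JSON string with the same keys; the sample is an abbreviated copy of the output above, and `validKeys` is a hypothetical helper rather than part of the library.

```php
<?php

declare(strict_types=1);

// Abbreviated copy of the result above; only the fields used below are kept.
$json = <<<'JSON'
{
    "valid": true,
    "anonymityLevel": "elite",
    "http": {
        "latency": 0.54314708709717,
        "get": {"valid": true, "latency": 0.19053816795349},
        "post": {"valid": false, "latency": null},
        "put": {"valid": true, "latency": 2.1179740428925}
    },
    "domains": {
        "example.com": {"valid": true, "latency": 0.4618821144104},
        "google.com": {"valid": false, "latency": 0.41366505622864}
    }
}
JSON;

$result = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

/**
 * Hypothetical helper: names of the entries whose "valid" flag is true.
 * Scalar siblings such as the section-level "latency" are skipped.
 *
 * @return string[]
 */
function validKeys(array $section): array
{
    $keys = [];
    foreach ($section as $name => $entry) {
        if (is_array($entry) && ($entry['valid'] ?? false) === true) {
            $keys[] = (string) $name;
        }
    }

    return $keys;
}

echo 'HTTP methods: ', implode(', ', validKeys($result['http'])), PHP_EOL;
echo 'Domains: ', implode(', ', validKeys($result['domains'])), PHP_EOL;
```

Filtering on the per-entry `valid` flag matters because, as the sample shows, a proxy can be valid overall while individual request methods or domains fail.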
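TODO
- Add custom domain validation
- Reduce dependencies
- Test and improve support for a wider range of PHP versions
- Improve documentation
- Tighten parameter strictness
- Add more proxy sources
- Create functional tests
- Monitor test coverage
- Extend PHP compatibility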