grithin/phpwebtools

PHP Curl, DOM, and web data parsing tools.

v1.0.0 2015-10-09 20:14 UTC

This package is not auto-updated.

Last update: 2024-09-14 18:04:16 UTC


README

Curl

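A basic curl class that wraps the various curl functions and avoids some common pitfalls of PHP curl (for example, POST encoding that is larger than what a browser would send).

Common Usage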
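The pitfall mentioned above presumably refers to how PHP's native curl extension encodes POST data: an array passed to CURLOPT_POSTFIELDS is sent as multipart/form-data, while a pre-encoded string is sent as application/x-www-form-urlencoded, which is what browsers use for ordinary forms and is usually smaller. The sketch below only shows the encoding step with the standard http_build_query function; the field names are made up for illustration, and it does not claim to mirror what this class does internally.

```php
<?php
// Hypothetical form fields, for illustration only.
$fields = ['s' => 'bob', 'q' => 'a b&c'];

// Passing $fields (an array) to CURLOPT_POSTFIELDS would produce a
// multipart/form-data body with a boundary and per-field headers.
// Encoding it first yields a compact urlencoded body instead.
$encoded = http_build_query($fields);

echo $encoded, "\n"; // s=bob&q=a+b%26c
```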

use \Grithin\Curl;

$curl = new Curl;

# can set headers either directly using 'options' or through instance attributes
$curl->options['CURLOPT_USERAGENT'] = 'harmless autobot';
$curl->user_agent = 'harmless autobot';

# can provide GET parameters as an array or as a string
$response = $curl->get('http://google.com/?s=bob');
$response = $curl->get('http://google.com/',['s'=>'bob']);


$response = $curl->post('http://google.com/',['s'=>'bob']);
$response = $curl->post('http://google.com/','s=bob');
$response = $curl->post('http://google.com/',json_encode(['bob'=>'s']));
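As a rough illustration of why GET parameters can be given either as an array or as a string, the sketch below normalizes both forms into a query string and appends it to the URL. The helper name appendQuery is hypothetical and is not part of this package; it only shows one plausible way to do it.

```php
<?php
// Hypothetical helper (not part of Grithin\Curl): append GET parameters,
// given as an array or as an already-encoded string, to a URL.
function appendQuery(string $url, $params): string
{
    if (is_array($params)) {
        // Arrays are encoded into "key=value&key2=value2" form.
        $params = http_build_query($params);
    }
    if ($params === null || $params === '') {
        return $url;
    }
    // Use '&' if the URL already carries a query string, '?' otherwise.
    $separator = strpos($url, '?') === false ? '?' : '&';
    return $url . $separator . $params;
}

$fromArray  = appendQuery('http://google.com/', ['s' => 'bob']);
$fromString = appendQuery('http://google.com/?a=1', 's=bob');
```

With this helper, both call styles end up as the same kind of URL, so the request code only has to deal with one form.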

File Upload

Uses the old-style '@' path prefix, but it works well enough.

$response = $curl->post('http://thoughtpush.com/', ['s'=>'bob'], ['file1'=>'@'.__FILE__]);
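For context, PHP's native curl extension deprecated the '@' prefix in PHP 5.5 in favor of the CURLFile class and no longer supports it as of PHP 7.0. Whether this class translates the prefix internally is not stated in this README. The sketch below shows how such a translation could look when working with native curl directly; toCurlFiles is a hypothetical helper, not part of this package.

```php
<?php
// Hypothetical helper (not part of Grithin\Curl): convert old-style
// '@/path/to/file' values into CURLFile objects for native curl.
function toCurlFiles(array $files): array
{
    $converted = [];
    foreach ($files as $name => $value) {
        if (is_string($value) && strncmp($value, '@', 1) === 0) {
            // Strip the leading '@' and wrap the path in a CURLFile.
            $converted[$name] = new CURLFile(substr($value, 1));
        } else {
            $converted[$name] = $value;
        }
    }
    return $converted;
}

$fields = toCurlFiles(['file1' => '@' . __FILE__, 's' => 'bob']);
// $fields can then be passed to curl_setopt($ch, CURLOPT_POSTFIELDS, $fields).
```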

Response Object

The Curl send methods return a CurlResponse object with a headers array attribute and a body string attribute.

use \Grithin\Curl;

$curl = new Curl;
$response = $curl->get('http://thoughtpush.com');

\Grithin\Debug::out($response->headers);
/*
[base:index.php:14] 1: [
	'Http-Version' : '1.1'
	'Status-Code' : '200'
	'Status' : '200 OK'
	'Server' : 'nginx/1.4.6 (Ubuntu)'
	'Date' : 'Fri, 09 Oct 2015 19:36:01 GMT'
	...
*/
\Grithin\Debug::out($response->body);
/*
[base:index.php:15] 2: '<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html lang="en" xmlns="http://www.w3.org/1999/xhtml">
	<head>
*/
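To make the shape of the headers array above concrete, here is a minimal sketch of how a raw HTTP header block could be turned into that structure. The function parseHeaders is hypothetical and written only from the output shown above; the actual CurlResponse parsing may differ.

```php
<?php
// Hypothetical parser (not the library's code): turn a raw header block
// into an array shaped like the $response->headers output shown above.
function parseHeaders(string $raw): array
{
    $lines = preg_split('/\r\n|\n/', trim($raw));
    $statusLine = array_shift($lines);
    $headers = [];

    // Status line, e.g. "HTTP/1.1 200 OK".
    if (preg_match('#^HTTP/(\d+(?:\.\d+)?)\s+(\d{3})\s*(.*)$#', $statusLine, $m)) {
        $headers['Http-Version'] = $m[1];
        $headers['Status-Code']  = $m[2];
        $headers['Status']       = trim($m[2] . ' ' . $m[3]);
    }

    // Remaining lines are "Name: value"; split on the first colon only.
    foreach ($lines as $line) {
        if (strpos($line, ':') === false) {
            continue;
        }
        [$name, $value] = explode(':', $line, 2);
        $headers[trim($name)] = trim($value);
    }
    return $headers;
}

$parsed = parseHeaders(
    "HTTP/1.1 200 OK\r\n" .
    "Server: nginx/1.4.6 (Ubuntu)\r\n" .
    "Date: Fri, 09 Oct 2015 19:36:01 GMT\r\n\r\n"
);
```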

Debugging

$curl->options[CURLOPT_VERBOSE] = true;

DomTools

Methods for working with the DOM and xpath more conveniently.

See the inline code comments for details.

list($dom, $xpath) = DomTools::loadHtml($response->body);
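For readers unfamiliar with the underlying PHP classes, the sketch below shows roughly what a loader returning a DOM document and an xpath object could look like, using only the standard DOMDocument and DOMXPath classes. The function loadHtmlSketch is hypothetical; it is not the implementation of DomTools::loadHtml, which may handle encodings and errors differently.

```php
<?php
// Hypothetical stand-in for DomTools::loadHtml, built on standard classes.
function loadHtmlSketch(string $html): array
{
    $dom = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    return [$dom, new DOMXPath($dom)];
}

[$dom, $xpath] = loadHtmlSketch(
    '<html><body><a href="/about">About</a></body></html>'
);

// Collect the href of every link in the document.
$links = [];
foreach ($xpath->query('//a') as $anchor) {
    $links[] = $anchor->getAttribute('href');
}
```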

WebData

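Various web data methods, including methods that satisfy ASP validation by parsing a response and setting the corresponding POST parameters.

See the inline code comments for details.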
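As an illustration of the ASP validation idea, the sketch below assumes it refers to ASP.NET's hidden form fields such as __VIEWSTATE and __EVENTVALIDATION, which the server expects to receive back on the next POST. The function extractHiddenInputs is hypothetical and is not a WebData method; the sample HTML and values are made up.

```php
<?php
// Hypothetical helper (not part of WebData): collect hidden form inputs
// from a response body so they can be echoed back in the next POST.
function extractHiddenInputs(string $html): array
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $fields = [];
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//input[@type="hidden"]') as $input) {
        $fields[$input->getAttribute('name')] = $input->getAttribute('value');
    }
    return $fields;
}

// Made-up response body for illustration.
$body = '<form>'
    . '<input type="hidden" name="__VIEWSTATE" value="abc123">'
    . '<input type="hidden" name="__EVENTVALIDATION" value="xyz789">'
    . '<input type="text" name="q">'
    . '</form>';

// Merge the server-issued tokens with the fields we want to submit.
$post = array_merge(extractHiddenInputs($body), ['q' => 'bob']);
```

The resulting $post array could then be passed as the second argument to $curl->post().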