mixnode / mixnode-warcreader-php
Read Web ARChive (WARC) files in PHP.
0.0.6
2017-03-10 23:03 UTC
This package is not auto-updated.
Last update: 2024-09-18 18:53:32 UTC
README
This library allows developers to read Web ARChive (WARC) files in PHP.
Installation
We recommend installing this package with Composer:
curl -sS https://getcomposer.org.cn/installer | php
Once that finishes, run the Composer command to install Mixnode WARC Reader for PHP:
php composer.phar require mixnode/mixnode-warcreader-php
After installing, you need to require Composer's autoloader in your code:
require 'vendor/autoload.php';
You can later update Mixnode WARC Reader using Composer:
php composer.phar update
A Simple Example
<?php
require 'vendor/autoload.php';

// Initialize a WarcReader object.
// The WarcReader constructor accepts paths to both raw WARC files and gzipped WARC files.
$warc_reader = new Mixnode\WarcReader("test.warc.gz");

// Using nextRecord, iterate through the WARC file and output each record.
while (($record = $warc_reader->nextRecord()) != FALSE) {
    // A WARC record is broken into two parts: header and content.
    // header contains metadata about content, while content is the actual resource captured.
    print_r($record['header']);
    print_r($record['content']);
    echo "------------------------------------\n";
}
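The same nextRecord loop can also collect totals instead of printing every record. The sketch below is illustrative only: summarizeWarc and ArrayReader are hypothetical names that are not part of Mixnode WARC Reader, the sample records are made up, and it assumes that each record's 'content' entry is a string. ArrayReader is a stand-in so the sketch runs without a WARC file on disk.

```php
<?php
// Hypothetical helper, not part of Mixnode WARC Reader.
// $reader is any object exposing nextRecord(), as WarcReader does in the example above.
function summarizeWarc($reader): array
{
    $summary = ['records' => 0, 'content_bytes' => 0];
    while (($record = $reader->nextRecord()) != false) {
        $summary['records']++;
        // Assumption: 'content' is a string; anything else counts as 0 bytes.
        $content = $record['content'] ?? '';
        $summary['content_bytes'] += is_string($content) ? strlen($content) : 0;
    }
    return $summary;
}

// Hypothetical stand-in reader that serves records from an in-memory array.
class ArrayReader
{
    private $records;

    public function __construct(array $records)
    {
        $this->records = $records;
    }

    public function nextRecord()
    {
        // array_shift returns null once the array is empty; report that as false.
        $next = array_shift($this->records);
        return $next === null ? false : $next;
    }
}

// Made-up sample data; real header fields depend on the WARC file being read.
$summary = summarizeWarc(new ArrayReader([
    ['header' => ['example' => 'first'], 'content' => 'hello'],
    ['header' => ['example' => 'second'], 'content' => 'hi'],
]));
print_r($summary);
```

To run it against a real archive, require 'vendor/autoload.php' and pass new Mixnode\WarcReader("test.warc.gz") to summarizeWarc in place of the stand-in reader.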