ihor/phadoop

A simple library, written in PHP, for writing map/reduce jobs for Hadoop

dev-master / 0.1.x-dev 2016-01-03 23:12 UTC

This package is auto-updated.

Last update: 2024-08-28 03:27:28 UTC


README

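Phadoop lets you write map/reduce tasks for Hadoop in PHP. I created it for a tech talk about Hadoop at my company. It is not ready for production use yet, but it can help you play with Hadoop in PHP.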

Installation

Define the following requirement in your composer.json file:

"require": {
    "ihor/phadoop": "0.1.x-dev"
}

Or simply run the following from the command line:

composer require ihor/phadoop:0.1.x-dev

Usage

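// Emits the number of words in each input value under the shared key 'wordsNumber'.
class Mapper extends \Phadoop\MapReduce\Job\Worker\Mapper
{
    protected function map($key, $value)
    {
        // Split the trimmed value on runs of whitespace and count the pieces.
        $this->emit('wordsNumber', count(preg_split('/\s+/', trim((string) $value))));
    }
}

// Sums all values emitted for a key and emits the total.
class Reducer extends \Phadoop\MapReduce\Job\Worker\Reducer
{
    protected function reduce($key, \Traversable $values)
    {
        $result = 0;
        foreach ($values as $value) {
            $result += (int) $value;
        }

        $this->emit($key, $result);
    }
}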

$mr = new \Phadoop\MapReduce('<path-to-hadoop>');

$job = $mr->createJob('WordCounter', 'Temp')
    ->setMapper(new Mapper())
    ->setReducer(new Reducer())
    ->clearData()
    ->addTask('Hello World')
    ->addTask('Hello Hadoop')
    ->putResultsTo('Temp/Results.txt')
    ->run();

echo $job->getLastResults();
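
To see what the job above computes without a Hadoop installation, the same map, group, and reduce steps can be sketched in plain PHP. This is only an illustration, not how Phadoop runs a job: the helper functions mapLine and reduceValues are hypothetical names that mirror the bodies of Mapper::map() and Reducer::reduce(), and the tab-separated output is an assumption rather than the format returned by getLastResults().

```php
<?php
// Standalone sketch: plain PHP only, no Phadoop or Hadoop involved.

// Hypothetical helper mirroring Mapper::map(): returns one [key, value] pair per input.
function mapLine(string $line): array
{
    return ['wordsNumber', count(preg_split('/\s+/', trim($line)))];
}

// Hypothetical helper mirroring Reducer::reduce(): sums all values for one key.
function reduceValues(iterable $values): int
{
    $result = 0;
    foreach ($values as $value) {
        $result += (int) $value;
    }
    return $result;
}

// Same inputs as the addTask() calls in the usage example.
$tasks = ['Hello World', 'Hello Hadoop'];

// Group the mapped values by key (the "shuffle" step between map and reduce).
$grouped = [];
foreach ($tasks as $task) {
    [$key, $value] = mapLine($task);
    $grouped[$key][] = $value;
}

// Reduce each group to a single total.
$results = [];
foreach ($grouped as $key => $values) {
    $results[$key] = reduceValues($values);
}

// Print one tab-separated line per key (output format is an assumption).
foreach ($results as $key => $total) {
    echo $key, "\t", $total, PHP_EOL;
}
```

Each of the two inputs contains two words, so the single key wordsNumber ends up with a total of 4.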

You can find more examples in the examples directory.