herilesmana / c45

0.3.2 2022-05-25 17:10 UTC

This package is auto-updated.

Last update: 2024-09-09 05:00:26 UTC


README

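A pure PHP implementation of the C4.5 algorithm.

Note: this implementation does not support numeric attribute values in the training data. I have no plans to implement that. Feel free to send a pull request.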
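
Because numeric values are not supported, one possible workaround is to turn each number into a category before training. The following is only a minimal sketch of that idea: the `discretize` helper, the `$humidityBins` table, and the 75.0 cut-off are all hypothetical and are not part of herilesmana/c45.

```php
<?php

/**
 * Maps a numeric value to a categorical label.
 * $bins is a list of [inclusive upper bound, label] pairs in ascending order.
 * Hypothetical helper; not part of herilesmana/c45.
 */
function discretize(float $value, array $bins): string
{
    foreach ($bins as [$upperBound, $label]) {
        if ($value <= $upperBound) {
            return $label;
        }
    }

    throw new InvalidArgumentException("No bin covers the value $value");
}

// Illustrative cut-off only; choose thresholds that suit your data.
$humidityBins = [
    [75.0, 'medium'],
    [INF, 'high'],
];

// A raw row with a numeric humidity reading (made-up value).
$row = ['outlook' => 'sunny', 'windy' => 'false', 'humidity' => 82.0, 'play' => 'no'];

// Replace the number with its label before handing the row to the library.
$row['humidity'] = discretize($row['humidity'], $humidityBins);

echo $row['humidity'], PHP_EOL;
```

After this step every attribute value is a plain string, which is the form the training examples in this README use.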

Installation

The preferred way to install this extension is through composer.

Either run:

php composer.phar require --prefer-dist herilesmana/c45 "*"

or add the following to the require section of your composer.json file:

"herilesmana/c45": "*"

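Requirements

This library needs no special configuration; you can use it directly. However, for it to work as expected, a few requirements must be met.

You can use a CSV file as the input for the training data.

  1. The first line of the CSV file must contain the attribute names, and each name must be enclosed in double quotes.
  2. You can also use array data.
  3. Use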
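
The header rule in item 1 can be checked before training. This is a rough sketch, assuming attribute names contain no commas; `hasQuotedHeader` is a hypothetical helper and not something the library provides.

```php
<?php

/**
 * Returns true when every comma-separated field on the header line
 * is wrapped in double quotes, e.g. "outlook", "windy".
 * Hypothetical helper; not part of herilesmana/c45.
 */
function hasQuotedHeader(string $headerLine): bool
{
    foreach (explode(',', trim($headerLine)) as $field) {
        // Each field must be a non-empty name between two double quotes.
        if (preg_match('/^"[^",]+"$/', trim($field)) !== 1) {
            return false;
        }
    }

    return true;
}

var_dump(hasQuotedHeader('"outlook", "windy", "humidity", "play"'));
var_dump(hasQuotedHeader('outlook, windy, humidity, play'));
```

In practice the header line would come from the first line of the CSV file, for example via `fgets()` on the opened file.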

Usage

Using a CSV file

Assume you have the following directory structure:

.
├── src
│   ├── index.php
│   └── data.csv
├── vendor
└── composer.json

Here is a valid example of CSV training data:

"outlook", "windy", "humidity", "play"
sunny, false, high, no
sunny, true, high, no
sunny, false, high, no
sunny, false, medium, yes
sunny, true, medium, yes
overcast, false, medium, yes
overcast, true, medium, yes
overcast, true, high, yes
overcast, false, medium, yes
rain, false, high, yes
rain, false, medium, yes
rain, true, medium, no
rain, false, medium, yes
rain, true, medium, no

Create a new file containing this data (for example, with Notepad) and save it as data.csv.

You can then use PHP-C45 in index.php as follows:

<?php

require_once __DIR__ . '/../vendor/autoload.php';

use C45\C45;

$filename = __DIR__ . '/data.csv';

$c45 = new C45([
    'targetAttribute' => 'play',
    'type' => 'file',
    'trainingData' => $filename,
    'splitCriterion' => C45::SPLIT_GAIN,
]);

$tree = $c45->buildTree();
$treeString = $tree->toString();

// print generated tree
echo '<pre>';
print_r($treeString);
echo '</pre>';

$testingData = [
    'outlook' => 'sunny',
    'windy' => 'false',
    'humidity' => 'high',
];

echo $tree->classify($testingData); // prints 'no'
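
To give a feel for what a gain-based split criterion measures, here is a small worked sketch that computes the textbook information gain of the outlook attribute on the 14 training rows above. It assumes `C45::SPLIT_GAIN` refers to this standard criterion; the functions `entropy` and `informationGain` are hypothetical illustrations and say nothing about how the library implements it internally.

```php
<?php

/**
 * Shannon entropy (in bits) of a list of class labels.
 */
function entropy(array $labels): float
{
    $total = count($labels);
    if ($total === 0) {
        return 0.0;
    }

    $entropy = 0.0;
    foreach (array_count_values($labels) as $count) {
        $p = $count / $total;
        $entropy -= $p * log($p, 2);
    }

    return $entropy;
}

/**
 * Information gain obtained by splitting $rows on $attribute:
 * entropy of the target minus the weighted entropy of each partition.
 */
function informationGain(array $rows, string $attribute, string $target): float
{
    // Group the target labels by the value of the chosen attribute.
    $partitions = [];
    foreach ($rows as $row) {
        $partitions[$row[$attribute]][] = $row[$target];
    }

    $remainder = 0.0;
    foreach ($partitions as $labels) {
        $remainder += count($labels) / count($rows) * entropy($labels);
    }

    return entropy(array_column($rows, $target)) - $remainder;
}

// The outlook and play columns of the 14 training rows shown above.
$rows = [];
$playByOutlook = [
    'sunny'    => ['no', 'no', 'no', 'yes', 'yes'],
    'overcast' => ['yes', 'yes', 'yes', 'yes'],
    'rain'     => ['yes', 'yes', 'no', 'yes', 'no'],
];
foreach ($playByOutlook as $outlook => $plays) {
    foreach ($plays as $play) {
        $rows[] = ['outlook' => $outlook, 'play' => $play];
    }
}

printf("%.3f\n", informationGain($rows, 'outlook', 'play'));
```

The overcast rows are all "yes", so that partition contributes zero entropy, which is why outlook scores well as a first split on this data.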

Using array data

You need two arrays.

The first holds the attribute names, for example:

[
    "outlook",
    "windy",
    "humidity",
    "play"
]

The second holds the data, one associative array per row, for example:

[
    [
        "outlook" => "sunny",
        "windy" => "false",
        "humidity" => "high",
        "play" => "no",
    ],
    [
        "outlook" => "sunny",
        "windy" => "true",
        "humidity" => "high",
        "play" => "no",
    ],
    [
        "outlook" => "sunny",
        "windy" => "false",
        "humidity" => "high",
        "play" => "no",
    ],
    [
        "outlook" => "sunny",
        "windy" => "false",
        "humidity" => "medium",
        "play" => "yes",
    ],
    [
        "outlook" => "sunny",
        "windy" => "true",
        "humidity" => "medium",
        "play" => "yes",
    ],
    [
        "outlook" => "overcast",
        "windy" => "false",
        "humidity" => "medium",
        "play" => "yes",
    ],
    [
        "outlook" => "overcast",
        "windy" => "true",
        "humidity" => "medium",
        "play" => "yes",
    ],
    [
        "outlook" => "overcast",
        "windy" => "true",
        "humidity" => "high",
        "play" => "yes",
    ],
    [
        "outlook" => "overcast",
        "windy" => "false",
        "humidity" => "medium",
        "play" => "yes",
    ],
    [
        "outlook" => "rain",
        "windy" => "false",
        "humidity" => "high",
        "play" => "yes",
    ],
    [
        "outlook" => "rain",
        "windy" => "false",
        "humidity" => "medium",
        "play" => "yes",
    ],
    [
        "outlook" => "rain",
        "windy" => "true",
        "humidity" => "medium",
        "play" => "no",
    ],
    [
        "outlook" => "rain",
        "windy" => "false",
        "humidity" => "medium",
        "play" => "yes",
    ],
    [
        "outlook" => "rain",
        "windy" => "true",
        "humidity" => "medium",
        "play" => "no",
    ]
]

You can then use PHP-C45 in index.php as follows:

<?php

require_once __DIR__ . '/../vendor/autoload.php';

use C45\C45;

$attributes = [...]; // the attribute names array shown above
$data = [...];       // the data rows array shown above

$c45 = new C45([
    'targetAttribute' => 'play',
    'type' => 'array',
    'trainingData' => [
        'attributes' => $attributes,
        'data' => $data,
    ],
    'splitCriterion' => C45::SPLIT_GAIN,
]);

$tree = $c45->buildTree();
$treeString = $tree->toString();

// print generated tree
echo '<pre>';
print_r($treeString);
echo '</pre>';

$testingData = [
    'outlook' => 'sunny',
    'windy' => 'false',
    'humidity' => 'high',
];

echo $tree->classify($testingData); // prints 'no'
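
If the rows already exist as CSV text, the two arrays could be derived from it rather than typed by hand. The sketch below assumes simple data in which no field contains a comma or a line break; `csvToTrainingArrays` is a hypothetical helper, not part of the library. It returns the same `attributes` and `data` keys that the `trainingData` option above expects.

```php
<?php

/**
 * Splits simple CSV text into the attributes and data arrays.
 * Assumes no field contains a comma or a line break.
 * Hypothetical helper; not part of herilesmana/c45.
 */
function csvToTrainingArrays(string $csv): array
{
    $lines = preg_split('/\R/', trim($csv));

    // Split a line on commas and strip surrounding spaces and double quotes.
    $parse = static fn (string $line): array => array_map(
        static fn (string $field): string => trim($field, " \t\""),
        explode(',', $line)
    );

    // The first line carries the attribute names.
    $attributes = $parse(array_shift($lines));

    $data = [];
    foreach ($lines as $line) {
        if (trim($line) === '') {
            continue; // skip blank lines
        }
        // Key each value by its attribute name.
        $data[] = array_combine($attributes, $parse($line));
    }

    return ['attributes' => $attributes, 'data' => $data];
}

$csv = <<<CSV
"outlook", "windy", "humidity", "play"
sunny, false, high, no
overcast, true, medium, yes
CSV;

$trainingData = csvToTrainingArrays($csv);

print_r($trainingData['data'][0]);
```

The returned array could then be passed as `'trainingData' => $trainingData` together with `'type' => 'array'`.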