byjg/anydataset-text

Anydataset Text File abstraction. Anydataset is an agnostic data source abstraction layer in PHP.

4.9.0 2024-01-04 02:23 UTC

This package is auto-updated.

Last update: 2024-09-12 01:18:10 UTC


README


Text file abstraction dataset. Anydataset is an agnostic data source abstraction layer in PHP.

See the Anydataset project documentation to learn more about Anydataset.

Examples

Text File Delimited (CSV)

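This type of file uses a delimiter to separate each field. The most common format is CSV, but you can define your own based on a regular expression. The TextFileDataset class has three constants with predefined formats:

  • TextFileDataset::CSVFILE - A generic file definition. It accepts both , and ; as delimiters.
  • TextFileDataset::CSVFILE_COMMA - The CSV file. It accepts only , as the delimiter.
  • TextFileDataset::CSVFILE_SEMICOLON - A CSV variation. It accepts only ; as the delimiter.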
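
To make the difference between the three formats concrete, here is a minimal sketch in plain PHP that splits one line with a regular expression. The pattern constants and the `splitLine` helper are hypothetical names chosen for illustration; they are not the library's actual constants or parser, and quoted fields are deliberately ignored.

```php
<?php
// Hypothetical patterns standing in for the three predefined formats.
const GENERIC_PATTERN   = '/[,;]/'; // accepts "," or ";"
const COMMA_PATTERN     = '/,/';    // accepts only ","
const SEMICOLON_PATTERN = '/;/';    // accepts only ";"

/**
 * Split a single line into fields using the given delimiter pattern.
 * Illustration only: there is no handling of quoted fields.
 */
function splitLine(string $line, string $pattern): array
{
    return preg_split($pattern, rtrim($line, "\r\n"));
}

print_r(splitLine("Joao;Magalhaes", GENERIC_PATTERN));
print_r(splitLine("John,Doe", GENERIC_PATTERN));
// Not split: ";" is not a delimiter for the comma-only pattern.
print_r(splitLine("Jane;Smith", COMMA_PATTERN));
```

A line that uses the "wrong" delimiter for a strict format comes back as a single field, which is usually the first symptom of picking a mismatched format.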

example1.csv

Joao;Magalhaes
John;Doe
Jane;Smith

example1.php

<?php
$file = file_get_contents("example1.csv");
    
$dataset = \ByJG\AnyDataset\Text\TextFileDataset::getInstance($file)
    ->withFields(["name", "surname"])
    ->withFieldParser(\ByJG\AnyDataset\Text\TextFileDataset::CSVFILE);
$iterator = $dataset->getIterator();

foreach ($iterator as $row) {
    echo $row->get('name');     // Print "Joao", "John", "Jane"
    echo $row->get('surname');  // Print "Magalhaes", "Doe", "Smith"
}

Text File Delimited (CSV) - Get field names from first line

example2.csv

firstname;lastname
John;Doe
Jane;Smith

example2.php

<?php
$file = file_get_contents("example2.csv");
    
// If `withFields` is omitted, the field names are taken from the first line of the file
$dataset = \ByJG\AnyDataset\Text\TextFileDataset::getInstance($file)
    ->withFieldParser(\ByJG\AnyDataset\Text\TextFileDataset::CSVFILE);
$iterator = $dataset->getIterator();

foreach ($iterator as $row) {
    echo $row->get('firstname');     // Print "John", "Jane"
    echo $row->get('lastname');  // Print "Doe", "Smith"
}

Text File Fixed sized columns

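In this type of file each field is defined by its position in the line. To parse the file you must define the name, type, start position and length of each field. The definition also lets you set required values and sub-types that depend on the field's value.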

The field definition is created with FixedTextDefinition and has the following fields:

$definition = new FixedTextDefinition(
    $fieldName,      # The field name
    $startPos,       # The start position of this field in the row
    $length,         # The number of characters of the field content
    $type,           # (optional) The type of the field content. FixedTextDefinition::TYPE_NUMBER or FixedTextDefinition::TYPE_STRING (default)
    $requiredValue,  # (optional) an array of valid values. E.g. ['Y', 'N']
    $subTypes = array(), # (optional) An associative array of FixedTextDefinition. If the value matches a key of the array, the corresponding subset
                         # of FixedTextDefinition is processed. e.g.
                         # [
                         #    "Y" => [
                         #      new FixedTextDefinition(...),
                         #      new FixedTextDefinition(...),
                         #    ],
                         #    "N" => new FixedTextDefinition(...)
                         # ]
);

Example

<?php
use ByJG\AnyDataset\Text\Enum\FixedTextDefinition;
use ByJG\AnyDataset\Text\FixedTextFileDataset;

$file = "".
    "001JOAO   S1520\n".
    "002GILBERTS1621\n";

$fieldDefinition = [
    new FixedTextDefinition('id', 0, 3, FixedTextDefinition::TYPE_NUMBER),
    new FixedTextDefinition('name', 3, 7, FixedTextDefinition::TYPE_STRING),
    new FixedTextDefinition('enable', 10, 1, FixedTextDefinition::TYPE_STRING, ['S', 'N']), // Required values --> S or N
    new FixedTextDefinition('code', 11, 4, FixedTextDefinition::TYPE_NUMBER),
];

// Wrap `new` in parentheses so the method call can be chained (required before PHP 8.4)
$dataset = (new FixedTextFileDataset($file))
    ->withFieldDefinition($fieldDefinition);

$iterator = $dataset->getIterator();
foreach ($iterator as $row) {
    echo $row->get('id');
    echo $row->get('name');
    echo $row->get('enable');   // The field is defined as 'enable'
    echo $row->get('code');
}
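
The sketch below shows, in plain PHP, how a zero-based start position and a length select each field from a row. It only illustrates the idea with `substr`; the `sliceRow` helper and the `[start, length]` layout array are hypothetical and do not describe how FixedTextFileDataset is implemented.

```php
<?php
/**
 * Cut a fixed-width row into named fields.
 * $layout maps a field name to [start position, length] (hypothetical format).
 */
function sliceRow(string $row, array $layout): array
{
    $fields = [];
    foreach ($layout as $name => [$start, $length]) {
        $fields[$name] = substr($row, $start, $length);
    }
    return $fields;
}

// Same positions and lengths as the field definition in the example above.
$layout = [
    'id'     => [0, 3],
    'name'   => [3, 7],
    'enable' => [10, 1],
    'code'   => [11, 4],
];

print_r(sliceRow("001JOAO   S1520", $layout));
print_r(sliceRow("002GILBERTS1621", $layout));
```

Note that the raw slice of `name` in the first row keeps its trailing spaces; whether a parser trims them is a separate design decision that this sketch leaves open.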

Text File Fixed sized columns with conditional type of fields

<?php
use ByJG\AnyDataset\Text\Enum\FixedTextDefinition;
use ByJG\AnyDataset\Text\FixedTextFileDataset;

$file = "".
    "001JOAO   S1520\n".
    "002GILBERTS1621\n";

$fieldDefinition = [
    new FixedTextDefinition('id', 0, 3),
    new FixedTextDefinition('name', 3, 7),
    new FixedTextDefinition(
        'enable',
        10,
        1,
        FixedTextDefinition::TYPE_STRING,
        null,
        [
            // Sub-definitions applied when `enable` == "S"
            "S" => [
                new FixedTextDefinition('first', 11, 1),
                new FixedTextDefinition('second', 12, 3),
            ],
            // Sub-definitions applied when `enable` == "N"
            "N" => [
                new FixedTextDefinition('reason', 11, 4),
            ]
        ]
    ),
];

// Wrap `new` in parentheses so the method call can be chained (required before PHP 8.4)
$dataset = (new FixedTextFileDataset($file))
    ->withFieldDefinition($fieldDefinition);

$iterator = $dataset->getIterator();
foreach ($iterator as $row) {
    echo $row->get('id');
    echo $row->get('name');
    echo $row->get('enable');
    echo $row->get('first');       // Not empty if `enable` == "S"
    echo $row->get('second');      // Not empty if `enable` == "S"
    echo $row->get('reason');      // Not empty if `enable` == "N"
}
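
Conceptually, the value read from `enable` picks which extra layout applies to the rest of the row. A minimal plain-PHP sketch of that selection follows. The `parseConditional` helper and its array-based layouts are hypothetical, and the second sample row (with `N`) is invented here purely for illustration; none of this is the library's own code.

```php
<?php
/**
 * Read the one-character flag at position 10, then slice the rest of the
 * row using the sub-layout registered for that flag value.
 */
function parseConditional(string $row, array $subLayouts): array
{
    $fields = ['enable' => substr($row, 10, 1)];
    // Unknown flag values simply yield no extra fields in this sketch.
    foreach ($subLayouts[$fields['enable']] ?? [] as $name => [$start, $length]) {
        $fields[$name] = substr($row, $start, $length);
    }
    return $fields;
}

// Field name => [start position, length], keyed by the flag value.
$subLayouts = [
    'S' => ['first' => [11, 1], 'second' => [12, 3]],
    'N' => ['reason' => [11, 4]],
];

print_r(parseConditional("001JOAO   S1520", $subLayouts)); // first = "1", second = "520"
print_r(parseConditional("003MARIA  N0042", $subLayouts)); // reason = "0042"
```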

Read from remote URL

Both TextFileDataset and FixedTextFileDataset support reading files from a remote http or https URL.

Formatters

This package implements two formatters:

  • CSVFormatter - Outputs the content as a CSV file (fields separated by a delimiter).
  • FixedSizeColumnFormatter - Outputs the content in columns of a defined length.

See the Anydataset formatters documentation for more information about formatters.

CSVFormatter

$formatter = new CSVFormatter($anydataset->getIterator());
$formatter->setDelimiter(string);  # Default: ,
$formatter->setQuote(string);  # Default: "
$formatter->setApplyQuote(APPLY_QUOTE_ALWAYS | APPLY_QUOTE_WHEN_REQUIRED | APPLY_QUOTE_ALL_STRINGS | NEVER_APPLY_QUOTE); # Default: APPLY_QUOTE_WHEN_REQUIRED
$formatter->setOutputHeader(true|false);  # Default: true
$formatter->toText();
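
To illustrate what a "quote when required" policy typically means, here is a minimal plain-PHP sketch. It assumes that a field needs quoting when it contains the delimiter, the quote character or a line break, and that embedded quotes are doubled (the common CSV convention). The `quoteWhenRequired` and `toCsvLine` helpers are hypothetical; the exact rule applied by CSVFormatter is not stated in this README.

```php
<?php
/**
 * Quote a field only when it contains the delimiter, the quote character
 * or a line break; embedded quote characters are doubled.
 */
function quoteWhenRequired(string $value, string $delimiter = ',', string $quote = '"'): string
{
    $needsQuote = strpbrk($value, $delimiter . $quote . "\r\n") !== false;
    if (!$needsQuote) {
        return $value;
    }
    return $quote . str_replace($quote, $quote . $quote, $value) . $quote;
}

/** Join already-stringable fields into one CSV line. */
function toCsvLine(array $fields, string $delimiter = ',', string $quote = '"'): string
{
    $quoted = array_map(
        fn($field) => quoteWhenRequired((string)$field, $delimiter, $quote),
        $fields
    );
    return implode($delimiter, $quoted) . "\n";
}

echo toCsvLine(['John', 'Doe, Jr.', 'said "hi"']);
```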

FixedSizeColumnFormatter

$fieldDefinition = [ ... ];  # See above about field definition

$formatter = new FixedSizeColumnFormatter($anydataset->getIterator(), $fieldDefinition);
$formatter->setPadNumber(string);  # Default: 0
$formatter->setPadString(string);  # Default: space character
$formatter->toText();
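
The two pad settings can be pictured with PHP's built-in `str_pad`. The sketch below is an assumption about the general technique, not the formatter's code: it left-pads numbers, right-pads strings, and truncates values that are too long. The `padField` helper is hypothetical.

```php
<?php
/**
 * Pad a value to a fixed width.
 * Assumptions of this sketch: numbers are padded on the left, strings on
 * the right, and values longer than the column are truncated.
 */
function padField($value, int $length, bool $isNumber, string $padNumber = '0', string $padString = ' '): string
{
    $text = substr((string)$value, 0, $length);
    return $isNumber
        ? str_pad($text, $length, $padNumber, STR_PAD_LEFT)
        : str_pad($text, $length, $padString, STR_PAD_RIGHT);
}

// Rebuild the first row of the fixed-width example from its field values.
echo padField(1, 3, true)
    . padField('JOAO', 7, false)
    . padField('S', 1, false)
    . padField(1520, 4, true)
    . "\n";
```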

Install

composer require "byjg/anydataset-text"

Running Unit tests

vendor/bin/phpunit

Dependencies

Open source ByJG