fiedsch / datamanagement
一个从文本文件中读取数据的辅助库
1.3.1
2024-08-25 06:51 UTC
Requires
- php: ^8.0
- ext-ctype: *
- ext-json: *
- ext-mbstring: *
- league/csv: ^9.6
- pimple/pimple: ~3.0
Requires (Dev)
- phpunit/phpunit: ^9.0
- symfony/var-dumper: ^v6.4
README
用于从文本文件中读取数据的PHP类和辅助函数
- Data\FileReader 读取文本文件
- Data\CsvFileReader 读取CSV文件
- Data\FixedWidthReader 读取包含固定宽度列数据的文本文件
- Data\Helper 包含诸如
SC()
的辅助函数,该函数将电子表格列名转换为(例如)CsvFileReader->getLine()
生成的数组索引
示例
处理CSV数据
<?php require __DIR__ . '/../vendor/autoload.php'; use Fiedsch\Data\File\CsvReader; try { $reader = new CsvReader("testdata.csv", ";"); // Read and handle all lines containing data. while (($line = $reader->getLine()) !== null) { // ignore empty lines (i.e. lines containing no data) if (!$reader->isEmpty($line)) { print_r($line); } } // $reader->close(); // not needed as it will be automatically called when there are no more lines } catch (Exception $e) { print $e->getMessage() . "\n"; }
特性
从v0.3.2版本开始,典型的样板代码 "打开文件、读取每行非空数据、关闭文件" 可以用更花哨的方式编写。使用 getLine()
的可选参数
<?php while (($line = $reader->getLine(Reader::SKIP_EMPTY_LINES)) !== null) { print_r($line); }
数据增强
<?php require __DIR__ . '/../vendor/autoload.php'; use Fiedsch\Data\File\CsvReader; use Fiedsch\Data\Augmentation\Augmentor; use Fiedsch\Data\Augmentation\Provider\TokenServiceProvider; use Fiedsch\Data\File\CsvWriter; try { $augmentor = new Augmentor(); $augmentor->register(new TokenServiceProvider()); $augmentor->addRule('token', function (Augmentor $augmentor, $data) { return [ 'token' => $augmentor['token']->getUniqueToken() ]; }); $reader = new CsvReader("testdata.csv", ";"); $writer = new CsvWriter("testdata.augmented.txt", "\t"); $header_written = false; while (($line = $reader->getLine(Reader::SKIP_EMPTY_LINES)) !== null) { $result = $augmentor->augment($line); if (!$header_written) { $writer->printLine(array_merge(['input_line'], array_keys($result), $reader->getHeader())); $header_written = true; } $writer->printLine(array_merge([$reader->getLineNumber()], $result, $line)); } $writer->close(); } catch (Exception $e) { print $e->getMessage() . "\n"; }
创建标记
方法一:让 TokenCreator
确保我们拥有唯一的标记
<?php require __DIR__ . '/../vendor/autoload.php'; use Fiedsch\Data\Utility\TokenCreator; use Fiedsch\Data\File\Writer; $creator = new TokenCreator(10, TokenCreator::UPPER); $output = new Writer('mytokens.txt'); $numTokens = 1000; while ($numTokens-- > 0) { $token = $creator->getUniqueToken(); $output->printLine([$token]); } $output->close();
方法二:先生成标记,然后检查它们是否唯一。对于大量标记,这可能会更快,并且消耗更少的资源
// same as above, exept // $token = $creator->getUniqueToken(); // becomes $token = $creator->cretateToken();
检查生成的标记是否唯一
echo " both lines show the same numbers, there were no duplicate tokens" wc -l mytokens.csv sort mytokens.csv | uniq | wc -l