frictionlessdata / datapackage
用于处理数据包的工具库
v1.0.0
2021-08-29 15:50 UTC
Requires
- php: >=7.1
- ext-zip: *
- frictionlessdata/tableschema: ^v1.0.0
- justinrainbow/json-schema: ^5.2
Requires (Dev)
- phpunit/phpunit: ^7.5.20
- psy/psysh: @stable
- satooshi/php-coveralls: ^1.0
- squizlabs/php_codesniffer: ^3.5
README
一个用于在PHP中处理数据包的工具库。
功能摘要和用法指南
安装
composer require frictionlessdata/datapackage
可选地,为了创建zip文件,您需要PHP zip扩展。在Ubuntu上,可以使用sudo apt-get install php-zip
启用。
包
加载符合规范的data package
use frictionlessdata\datapackage\Package; $package = Package::load("tests/fixtures/multi_data_datapackage.json");
遍历资源和数据
foreach ($package as $resource) { echo $resource->name(); foreach ($resource as $row) { echo $row; } }
获取所有数据作为一个数组(将所有数据加载到内存中,不推荐用于大数据集)
foreach ($package as $resource) { var_dump($resource->read()); }
所有数据和模式都会被验证,如果出现问题会抛出异常。
显式验证数据并获取错误列表
Package::validate("tests/fixtures/simple_invalid_datapackage.json"); // array of validation errors
加载zip文件
$package = Package::load('http://datahub.io/opendatafortaxjustice/eucountrydatawb/r/datapackage_zip.zip');
提供读取选项,这些选项将传递给tableschema-php Table::read方法
$package = Package::load('http://datahub.io/opendatafortaxjustice/eucountrydatawb/r/datapackage_zip.zip'); foreach ($package as $resource) { $resource->read(["cast" => false]); }
包对象有一些有用的方法来访问和操作资源
$package = Package::load("tests/fixtures/multi_data_datapackage.json"); $package->resources(); // array of resource name => Resource object (see below for Resource class reference) $package->getResource("first-resource"); // Resource object matching the given name $package->removeResource("first-resource"); // add a tabular resource $package->addResource("tabular-resource-name", [ "profile" => "tabular-data-resource", "schema" => [ "fields" => [ ["name" => "id", "type" => "integer"], ["name" => "name", "type" => "string"] ] ], "path" => [ "tests/fixtures/simple_tabular_data.csv", ] ]);
从头创建一个新的包
$package = Package::create([ "name" => "datapackage-name", "profile" => "tabular-data-package" ]); // add a resource $package->addResource("resource-name", [ "profile" => "tabular-data-resource", "schema" => [ "fields" => [ ["name" => "id", "type" => "integer"], ["name" => "name", "type" => "string"] ] ], "path" => "tests/fixtures/simple_tabular_data.csv" ]); // save the package descriptor to a file $package->saveDescriptor("datapackage.json");
将整个data package(包括任何本地数据)保存到zip文件中
$package->save("datapackage.zip");
资源
可以从Package中访问资源对象,如上所述
$resource = $package->getResource("resource-name")
或直接实例化
use frictionlessdata\datapackage\Resource; $resource = Resource::create([ "name" => "my-resource", "profile" => "tabular-data-resource", "path" => "tests/fixtures/simple_tabular_data.csv", "schema" => ["fields" => [["name" => "id", "type" => "integer"], ["name" => "name", "type" => "string"]]] ]);
遍历或读取资源将生成所有路径或数据元素的组合行
foreach ($resource as $row) {}; // iterating $resource->read(); // get all the data as an array
贡献
请阅读贡献指南:如何贡献