frictionlessdata/datapackage

用于处理数据包的工具库

v1.0.0 2021-08-29 15:50 UTC

This package is auto-updated.

Last update: 2024-08-29 04:27:33 UTC


README

Build Coveralls Scrutinizer-ci Packagist SemVer Codebase Support

一个用于在PHP中处理数据包的工具库。

功能摘要和用法指南

安装

composer require frictionlessdata/datapackage

可选地,为了创建zip文件,您需要PHP zip扩展。在Ubuntu上,可以使用sudo apt-get install php-zip启用。

加载符合规范的data package

use frictionlessdata\datapackage\Package;
$package = Package::load("tests/fixtures/multi_data_datapackage.json");

遍历资源和数据

foreach ($package as $resource) {
    echo $resource->name();
    foreach ($resource as $row) {
        echo $row;
    }
}

获取所有数据作为一个数组(将所有数据加载到内存中,不推荐用于大数据集)

foreach ($package as $resource) {
    var_dump($resource->read());
}

所有数据和模式都会被验证,如果出现问题会抛出异常。

显式验证数据并获取错误列表

Package::validate("tests/fixtures/simple_invalid_datapackage.json");  // array of validation errors

加载zip文件

$package = Package::load('http://datahub.io/opendatafortaxjustice/eucountrydatawb/r/datapackage_zip.zip');

提供读取选项,这些选项将传递给tableschema-php Table::read方法

$package = Package::load('http://datahub.io/opendatafortaxjustice/eucountrydatawb/r/datapackage_zip.zip');
foreach ($package as $resource) {
    $resource->read(["cast" => false]);
}

包对象有一些有用的方法来访问和操作资源

$package = Package::load("tests/fixtures/multi_data_datapackage.json");
$package->resources();  // array of resource name => Resource object (see below for Resource class reference)
$package->getResource("first-resource");  // Resource object matching the given name
$package->removeResource("first-resource");
// add a tabular resource
$package->addResource("tabular-resource-name", [
    "profile" => "tabular-data-resource",
    "schema" => [
        "fields" => [
            ["name" => "id", "type" => "integer"],
            ["name" => "name", "type" => "string"]
        ]
    ],
    "path" => [
        "tests/fixtures/simple_tabular_data.csv",
    ]
]);

从头创建一个新的包

$package = Package::create([
    "name" => "datapackage-name",
    "profile" => "tabular-data-package"
]);
// add a resource
$package->addResource("resource-name", [
    "profile" => "tabular-data-resource", 
    "schema" => [
        "fields" => [
            ["name" => "id", "type" => "integer"],
            ["name" => "name", "type" => "string"]
        ]
    ],
    "path" => "tests/fixtures/simple_tabular_data.csv"
]);
// save the package descriptor to a file
$package->saveDescriptor("datapackage.json");

将整个data package(包括任何本地数据)保存到zip文件中

$package->save("datapackage.zip");

资源

可以从Package中访问资源对象,如上所述

$resource = $package->getResource("resource-name")

或直接实例化

use frictionlessdata\datapackage\Resource;
$resource = Resource::create([
    "name" => "my-resource",
    "profile" => "tabular-data-resource",
    "path" => "tests/fixtures/simple_tabular_data.csv",
    "schema" => ["fields" => [["name" => "id", "type" => "integer"], ["name" => "name", "type" => "string"]]]
]);

遍历或读取资源将生成所有路径或数据元素的组合行

foreach ($resource as $row) {};  // iterating
$resource->read();  // get all the data as an array

贡献

请阅读贡献指南:如何贡献