harish81/laravel-duckdb

DuckDB CLI包装器,通过Laravel查询构建器与DuckDB数据库交互。

v1.0.3 2023-10-20 03:09 UTC

This package is auto-updated.

Last update: 2024-09-20 05:12:19 UTC


README

Latest Version on Packagist Total Downloads

https://github.com/duckdb/duckdb

支持我们

安装

您可以通过composer安装此软件包

composer require harish81/laravel-duckdb

使用方法

  • 连接
'connections' => [
    'my_duckdb' => [
        'driver' => 'duckdb',
        'cli_path' => env('DUCKDB_CLI_PATH', base_path('vendor/bin/duckdb')),
        //'dbfile' => env('DUCKDB_DB_FILE', '/tmp/duck_main.db'),
    ],
...
  • 示例
# Using DB facade
DB::connection('my_duckdb')
    ->table(base_path('genderdata.csv'))
    ->where('Gender', '=', 'M')
    ->limit(10)
    ->get();
# Using Raw queries
DB::connection('my_duckdb')
    ->select("select * from '".base_path('genderdata.csv')."' limit 5")
# Using Eloquent Model
class GenderDataModel extends \Harish\LaravelDuckdb\LaravelDuckdbModel
{
    protected $connection = 'my_duckdb';
    public function __construct()
    {
        $this->table = base_path('genderdata.csv');
    }
}
...
GenderDataModel::where('Gender','M')->first()

高级使用

您还可以安装duckdb扩展。

直接从s3文件查询数据。

  • database.php
'connections' => [
    'my_duckdb' => [
        'driver' => 'duckdb',
        'cli_path' => env('DUCKDB_CLI_PATH', base_path('vendor/bin/duckdb')),
        'cli_timeout' => 0, //0 to disable timeout, default to 1 Minute (60s)
        'dbfile' => env('DUCKDB_DB_FILE', storage_path('app/duckdb/duck_main.db')),
        'pre_queries' => [
            "SET s3_region='".env('AWS_DEFAULT_REGION')."'",
            "SET s3_access_key_id='".env('AWS_ACCESS_KEY_ID')."'",
            "SET s3_secret_access_key='".env('AWS_SECRET_ACCESS_KEY')."'",
        ],
        'extensions' => ['httpfs'],
    ],
    ...
  • 查询数据
DB::connection('my_duckdb')
  ->select("SELECT * FROM read_csv_auto('s3://my-bucket/test-datasets/example1/us-gender-data-2022.csv') LIMIT 10")

编写迁移

return new class extends Migration {
    protected $connection = 'my_duckdb';
    public function up(): void
    {
        DB::connection('my_duckdb')->statement('CREATE SEQUENCE people_sequence');
        Schema::create('people', function (Blueprint $table) {
            $table->id()->default(new \Illuminate\Database\Query\Expression("nextval('people_sequence')"));
            $table->string('name');
            $table->integer('age');
            $table->integer('rank');
            $table->timestamps();
        });
    }

    public function down(): void
    {
        Schema::dropIfExists('people');
        DB::connection('my_duckdb')->statement('DROP SEQUENCE people_sequence');
    }
};

只读连接 - 解决并发查询的方法。

  • database.php
    'connections' => [
        'my_duckdb' => [
            'driver' => 'duckdb',
            'cli_path' => env('DUCKDB_CLI_PATH', base_path('vendor/bin/duckdb')),
            'cli_timeout' => 0,
            'dbfile' => env('DUCKDB_DB_FILE', storage_path('app/duckdb/duck_main.db')),
            'schema' => 'main',
            'read_only' => true,
            'pre_queries' => [
                "SET s3_region='".env('AWS_DEFAULT_REGION')."'",
                "SET s3_access_key_id='".env('AWS_ACCESS_KEY_ID')."'",
                "SET s3_secret_access_key='".env('AWS_SECRET_ACCESS_KEY')."'",
            ],
            'extensions' => ['httpfs', 'postgres_scanner'],
        ],
        ...

测试

  • 生成测试数据
# Syntax: ./data-generator.sh <lines> <file-to-save.csv>
./data-generator.sh 100 _test-data/test.csv
./data-generator.sh 90000000 _test-data/test_big_file.csv
  • 运行测试用例
composer test

限制 & 常见问题解答

变更日志

请参阅CHANGELOG获取最近更改的更多信息。

贡献

请参阅CONTRIBUTING以获取详细信息。

安全漏洞

请查看我们的安全策略了解如何报告安全漏洞。

鸣谢

许可

MIT许可(MIT)。请参阅许可文件获取更多信息。