helgesverre/milvus

Milvus Rest API 的 PHP 客户端

v0.1.0 2024-04-15 19:09 UTC

This package is auto-updated.

Last update: 2024-09-01 10:19:19 UTC


README

Milvus.io PHP API 客户端

Latest Version on Packagist Total Downloads

Milvus 是一个高度灵活、可靠且速度极快的开源向量数据库。它支持在万亿字节规模上添加、删除、更新和近似实时搜索向量。

此包是 Milvus v2.3.3 Restful API 的 API 客户端,基于 Saloon 包构建。

有关 Restful API 的文档可在 Milvus 网站 上找到,并且 OpenAPI 规范可在 此处 找到。

版本

(*) 但主要兼容,它们之间(据我所知)唯一的区别是新的向量 Upsert 端点,以及在向量搜索端点中的新参数(params.range_filterparams.radius)。

安装

您可以通过 composer 安装此包

composer require helgesverre/milvus

您可以使用以下命令发布配置文件

php artisan vendor:publish --tag="milvus-config"

这是已发布 config/milvus.php 文件的 内容

return [
    'token' => env('MILVUS_TOKEN'),
    'username' => env('MILVUS_USERNAME'),
    'password' => env('MILVUS_PASSWORD'),
    'host' => env('MILVUS_HOST', 'localhost'),
    'port' => env('MILVUS_PORT', '19530'),
];

用法

与 Laravel 一起使用

对于 Laravel 用户,您可以使用 Milvus 门面对象与 Milvus API 交互

use HelgeSverre\Milvus\Facades\Milvus;

// NOTE: dbName is optional and defaults to 'default', this is only relevant if you have multiple databases.
// List all collections in the 'default' database
Milvus::collections()->list(
    dbName: 'default'
);

// Create a new collection named 'documents' in the 'default' database with a specified dimension
Milvus::collections()->create(
    collectionName: 'documents',
    dimension: 128,
    dbname: 'default',
);

// Describe the structure and properties of the 'documents' collection in the 'default' database
Milvus::collections()->describe(
    collectionName: 'documents',
    dbname: 'default',
);

// Drop or delete the 'documents' collection from the 'default' database
Milvus::collections()->drop(
    collectionName: 'documents',
    dbname: 'default',
);


// Insert a new vector into the 'documents' collection with additional fields like title and link
// Note "vector" is a reserved field name and must be used for the vector data
Milvus::vector()->insert(
    collectionName: 'documents',
    data: [
        'vector' => [0.1, 0.2, 0.3 /* etc... */],
        "title" => "Document name here",
        "link" => "https://example.com/document-name-here",
    ]
);

// Search for similar vectors in the 'documents' collection using a provided vector
Milvus::vector()->search(
    collectionName: 'documents',
    vector: [0.1, 0.2, 0.3 /* etc... */],
);

// Delete a vector from the 'documents' collection using its ID
Milvus::vector()->delete(
    id: '123129471497',
    collectionName: 'documents'
);

// Query the 'documents' collection for specific documents using a filter condition and select specific output fields
Milvus::vector()->query(
    collectionName: 'documents',
    filter: "id in [443300716234671427, 443300716234671426]",
    outputFields: ["id", "title", "link"],
);

// Retrieve a specific vector from the 'documents' collection using its ID
Milvus::vector()->get(
    id: '123129471497',
    collectionName: 'documents'
);

// Update or insert a vector in the 'documents' collection. If the ID exists, it's updated; if not, a new entry is created
Milvus::vector()->upsert(
    collectionName: 'documents',
    data: [
        'id' => 123129471497,
        'vector' => [0.1, 0.2, 0.3 /* etc... */],
        "title" => "Document name here",
        "link" => "https://example.com/document-name-here",
    ]
);

不使用 Laravel

如果您不使用 Laravel,您需要创建 Milvus 类的新实例,并提供令牌或用户/密码、主机和端口。

<?php
// use HelgeSverre\Milvus\Facades\Milvus;
use HelgeSverre\Milvus\Milvus;

$milvus = new Milvus(
    token: "your-token",
    host: "localhost",
    port: "19530"
);


// Import the Milvus facade for easier access to Milvus functions

// NOTE: dbName is optional and defaults to 'default', this is only relevant if you have multiple databases.
// List all collections in the 'default' database
$milvus->collections()->list(
    dbName: 'default'
);

// Create a new collection named 'documents' in the 'default' database with a specified dimension
$milvus->collections()->create(
    collectionName: 'documents',
    dimension: 128,
    dbName: 'default',
);

// Describe the structure and properties of the 'documents' collection in the 'default' database
$milvus->collections()->describe(
    collectionName: 'documents',
    dbName: 'default',
);

// Drop or delete the 'documents' collection from the 'default' database
$milvus->collections()->drop(
    collectionName: 'documents',
    dbName: 'default',
);


// Insert a new vector into the 'documents' collection with additional fields like title and link
// Note "vector" is a reserved field name and must be used for the vector data
$milvus->vector()->insert(
    collectionName: 'documents',
    data: [
        'vector' => [0.1, 0.2, 0.3 /* etc... */],
        "title" => "Document name here",
        "link" => "https://example.com/document-name-here",
    ]
);

// Search for similar vectors in the 'documents' collection using a provided vector
$milvus->vector()->search(
    collectionName: 'documents',
    vector: [0.1, 0.2, 0.3 /* etc... */],
);

// Delete a vector from the 'documents' collection using its ID
$milvus->vector()->delete(
    id: '123129471497',
    collectionName: 'documents'
);

// Query the 'documents' collection for specific documents using a filter condition and select specific output fields
$milvus->vector()->query(
    collectionName: 'documents',
    filter: "id in [443300716234671427, 443300716234671426]",
    outputFields: ["id", "title", "link"],
);

// Retrieve a specific vector from the 'documents' collection using its ID
$milvus->vector()->get(
    id: '123129471497',
    collectionName: 'documents'
);

// Update or insert a vector in the 'documents' collection. If the ID exists, it's updated; if not, a new entry is created
$milvus->vector()->upsert(
    collectionName: 'documents',
    data: [
        'id' => 123129471497,
        'vector' => [0.1, 0.2, 0.3 /* etc... */],
        "title" => "Document name here",
        "link" => "https://example.com/document-name-here",
    ]
);

与 Zilliz Cloud 一起使用

如果您使用的是 Milvus 的托管版本,您需要指定以下主机和端口,以及您的 API 令牌

use HelgeSverre\Milvus\Milvus;

$milvus = new Milvus(
    token: "db_randomstringhere:passwordhere",
    host: 'https://in03-somerandomstring.api.gcp-us-west1.zillizcloud.com',
    port: '443'
);

示例:使用 Milvus 和 OpenAI 嵌入式进行语义搜索

此示例演示了如何在 Milvus 中使用由 OpenAI 生成的嵌入式执行语义搜索。

准备您的数据

首先,创建一个要索引的数据数组。在此示例中,我们将使用具有标题、摘要和标签的博客文章。

$blogPosts = [
    [
        'title' => 'Exploring Laravel',
        'summary' => 'A deep dive into Laravel frameworks...',
        'tags' => ['PHP', 'Laravel', 'Web Development']
    ],
       [
        'title' => 'Exploring Laravel',
        'summary' => 'A deep dive into Laravel frameworks, exploring its features and benefits for modern web development.',
        'tags' => ['PHP', 'Laravel', 'Web Development']
    ],
    [
        'title' => 'Introduction to React',
        'summary' => 'Understanding the basics of React and how it revolutionizes frontend development.',
        'tags' => ['JavaScript', 'React', 'Frontend']
    ],
    [
        'title' => 'Getting Started with Vue.js',
        'summary' => 'A beginner’s guide to building interactive web interfaces with Vue.js.',
        'tags' => ['JavaScript', 'Vue.js', 'Frontend']
    ],
];

生成嵌入式

使用 OpenAI 的嵌入式 API 将您的博客文章摘要转换为向量嵌入式。

$summaries = array_column($blogPosts, 'summary');
$embeddingsResponse = OpenAI::client('sk-your-openai-api-key')
    ->embeddings()
    ->create([
        'model' => 'text-embedding-ada-002',
        'input' => $summaries,
    ]);

foreach ($embeddingsResponse->embeddings as $embedding) {
    $blogPosts[$embedding->index]['vector'] = $embedding->embedding;
}

创建 Milvus 集合

在 Milvus 中创建一个集合以存储您的博客文章嵌入式,请注意,嵌入式的维度必须与 OpenAI 生成的嵌入式的维度匹配(如果您使用的是 text-embedding-ada-002 模型,则为 1536)。

$milvus = new Milvus(
    token: "your-token",
    host: "localhost",
    port: "19530"
);


$milvus->collections()->create(
    collectionName: 'blog_posts',
    dimension: 1536,
);

插入到 Milvus

将这些嵌入式以及其他博客文章数据插入到您的 Milvus 集合中。

$insertResponse = $milvus->vector()->insert('blog_posts', $blogPosts);

使用 OpenAI 创建搜索向量

为您的查询生成一个搜索向量,类似于您处理博客文章的方式。

$searchVectorResponse = OpenAI::client('sk-your-openai-api-key')
    ->embeddings()
    ->create([
        'model' => 'text-embedding-ada-002',
        'input' => 'laravel framework',
    ]);

$searchEmbedding = $searchVectorResponse->embeddings[0]->embedding;

使用嵌入式在 Milvus 中搜索

使用 Milvus 客户端使用生成的嵌入式进行搜索。

$searchResponse = $milvus->vector()->search(
    collectionName: 'blog_posts',
    vector: $searchEmbedding,
    limit: 3,
    outputFields: ['title', 'summary', 'tags']
);

// Output the search results
foreach ($searchResponse as $result) {
    echo "Title: " . $result['title'] . "\n";
    echo "Summary: " . $result['summary'] . "\n";
    echo "Tags: " . implode(', ', $result['tags']) . "\n\n";
}

在 Docker 中运行 Milvus

要快速开始使用 Milvus,您可以使用以下命令在 Docker 中运行它

# Download the docker-compose.yml file
wget https://github.com/milvus-io/milvus/releases/download/v2.3.3/milvus-standalone-docker-compose.yml -O docker-compose.yml

# Start Milvus
docker compose up -d

现在将在 https://:9091/healthz 上提供健康检查端点,Milvus API 将在 https://:19530 上提供。

要停止 Milvus,运行 docker compose down,要擦除所有数据,运行 docker compose down -v

有关更多详细信息,请参阅 使用 Docker Compose 安装 Milvus 单独版

对于生产负载,建议查看Zilliz.com,这是Milvus的开发者,并提供了Milvus的云托管版本 ☁️。

测试

cp .env.example .env

## Start a local Milvus instance, it takes awhile to boot up
docker compose up -d
 
composer test
composer analyse src

许可证

MIT许可证(MIT)。有关更多信息,请参阅许可证文件

免责声明

"Milvus®"和Milvus标志是Linux基金会(LF Projects, LLC)的注册商标。本包与Linux基金会无关联、未经授权或赞助。它是独立开发的,并在合理使用的范围内使用"Milvus"名称进行标识。所有商标和注册商标,包括"Milvus®",均为各自所有者的财产。"Milvus®"是Linux基金会的注册商标