codeinc/pdf2txt-client

PHP client for the pdf2txt service

v1.5 2024-02-24 01:28 UTC


README

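This repository contains a PHP 8.2+ library for converting PDF files to text using the pdf2txt service.

Installation

The recommended way to install this library is through Composer: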

composer require codeinc/pdf2txt-client

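Usage

This client requires a running instance of the pdf2txt service. The service can be run locally with Docker or deployed on a server.

Examples

Extracting text from a local file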

use CodeInc\Pdf2TxtClient\Pdf2TxtClient;
use CodeInc\Pdf2TxtClient\Exception;

$apiBaseUri = 'http://localhost:3000/';
$localPdfPath = '/path/to/local/file.pdf';

try {
    // convert
    $client = new Pdf2TxtClient($apiBaseUri);
    $stream = $client->extract(
        $client->createStreamFromFile($localPdfPath)
    );
    
    // display the text
    echo (string)$stream;
}
catch (Exception $e) {
    // handle exception
}

Using additional options

use CodeInc\Pdf2TxtClient\Pdf2TxtClient;
use CodeInc\Pdf2TxtClient\ConvertOptions;
use CodeInc\Pdf2TxtClient\Format;
use CodeInc\Pdf2TxtClient\Exception;

$apiBaseUri = 'http://localhost:3000/';
$localPdfPath = '/path/to/local/file.pdf';
$convertOption = new ConvertOptions(
    firstPage: 2,
    lastPage: 3,
    format: Format::json
);

try {
    $client = new Pdf2TxtClient($apiBaseUri);

    // convert 
    $jsonResponse = $client->extract(
        $client->createStreamFromFile($localPdfPath),
        $convertOption
    );
    
    // decode the JSON response and display it
    $decodedJson = $client->processJsonResponse($jsonResponse);
    var_dump($decodedJson);
}
catch (Exception $e) {
    // handle exception
}

Saving the extracted text to a file

use CodeInc\Pdf2TxtClient\Pdf2TxtClient;
use CodeInc\Pdf2TxtClient\Exception;

$apiBaseUri = 'http://localhost:3000/';
$localPdfPath = '/path/to/local/file.pdf';
$destinationTextPath = '/path/to/local/file.txt';

try {
    $client = new Pdf2TxtClient($apiBaseUri);

    // convert
    $stream = $client->extract(
        $client->createStreamFromFile($localPdfPath)
    );
    
    // save the text to a file
    $client->saveStreamToFile($stream, $destinationTextPath);
}
catch (Exception $e) {
    // handle exception
}
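
Converting several files

The calls shown above can be combined to convert a whole directory of PDFs. The following is only a sketch under stated assumptions: `textPathFor()` and `convertDirectory()` are hypothetical helpers that are not part of the library, the directory layout is made up for illustration, and the client methods are assumed to behave exactly as in the earlier examples.

```php
<?php
use CodeInc\Pdf2TxtClient\Pdf2TxtClient;
use CodeInc\Pdf2TxtClient\Exception;

// Hypothetical helper (not part of the library): maps a PDF path to a
// .txt path with the same base name inside $outputDir.
function textPathFor(string $pdfPath, string $outputDir): string
{
    return rtrim($outputDir, '/') . '/' . pathinfo($pdfPath, PATHINFO_FILENAME) . '.txt';
}

// Hypothetical helper: converts every *.pdf found directly in $inputDir and
// returns the failures as an array of [PDF path => error message].
function convertDirectory(Pdf2TxtClient $client, string $inputDir, string $outputDir): array
{
    $failures = [];
    // glob() may return false on error, so fall back to an empty list
    foreach (glob(rtrim($inputDir, '/') . '/*.pdf') ?: [] as $pdfPath) {
        try {
            // same calls as in the single-file examples
            $stream = $client->extract($client->createStreamFromFile($pdfPath));
            $client->saveStreamToFile($stream, textPathFor($pdfPath, $outputDir));
        }
        catch (Exception $e) {
            // anything caught implements \Throwable, so getMessage() is available
            $failures[$pdfPath] = $e->getMessage();
        }
    }
    return $failures;
}
```

With the Composer autoloader loaded and the service running, calling convertDirectory(new Pdf2TxtClient('http://localhost:3000/'), '/path/to/pdfs', '/path/to/texts') would return an empty array when every file converts. Collecting failures instead of stopping at the first one lets a single bad PDF be reported without aborting the rest of the batch.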

License

This library is published under the MIT license (see the LICENSE file).