codeinc / pdf2txt-client
PHP client for the pdf2txt service
v1.5
2024-02-24 01:28 UTC
Requires
- php: >=8.3
- php-http/discovery: ^1.19
- php-http/multipart-stream-builder: ^1.3
- psr/http-client: ^1.0
Requires (Dev)
- php-http/guzzle7-adapter: ^1.0
- phpunit/phpunit: ^11
- spatie/ray: ^1.41
README
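This repository contains a PHP 8.3+ library for converting PDF files to text using the pdf2txt service.
Installation
The recommended way to install this library is through Composer: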
composer require codeinc/pdf2txt-client
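Usage
This client requires a running instance of the pdf2txt service. The service can be run locally with Docker or deployed to a server.
Examples
Extracting text from a local file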
    use CodeInc\Pdf2TxtClient\Pdf2TxtClient;
    use CodeInc\Pdf2TxtClient\Exception;

    $apiBaseUri = 'http://localhost:3000/';
    $localPdfPath = '/path/to/local/file.pdf';

    try {
        // convert
        $client = new Pdf2TxtClient($apiBaseUri);
        $stream = $client->extract(
            $client->createStreamFromFile($localPdfPath)
        );

        // display the text
        echo (string)$stream;
    } catch (Exception $e) {
        // handle exception
    }
Using additional options
    use CodeInc\Pdf2TxtClient\Pdf2TxtClient;
    use CodeInc\Pdf2TxtClient\ConvertOptions;
    use CodeInc\Pdf2TxtClient\Format;

    $apiBaseUri = 'http://localhost:3000/';
    $localPdfPath = '/path/to/local/file.pdf';

    // extract pages 2 to 3 only and request JSON output
    $convertOption = new ConvertOptions(
        firstPage: 2,
        lastPage: 3,
        format: Format::json
    );

    try {
        $client = new Pdf2TxtClient($apiBaseUri);

        // convert
        $jsonResponse = $client->extract(
            $client->createStreamFromFile($localPdfPath),
            $convertOption
        );

        // display the text in a JSON format
        $decodedJson = $client->processJsonResponse($jsonResponse);
        var_dump($decodedJson);
    } catch (Exception $e) {
        // handle exception
    }
Saving the extracted text to a file
    use CodeInc\Pdf2TxtClient\Pdf2TxtClient;

    $apiBaseUri = 'http://localhost:3000/';
    $localPdfPath = '/path/to/local/file.pdf';
    $destinationTextPath = '/path/to/local/file.txt';

    try {
        $client = new Pdf2TxtClient($apiBaseUri);

        // convert
        $stream = $client->extract(
            $client->createStreamFromFile($localPdfPath)
        );

        // save the text to a file
        $client->saveStreamToFile($stream, $destinationTextPath);
    } catch (Exception $e) {
        // handle exception
    }
License
This library is published under the MIT license (see the LICENSE file).