landrok / language-detector
A fast and reliable PHP library for detecting languages
1.4.0
2023-12-18 21:52 UTC
Requires
- php: >=7.4
- ext-mbstring: *
- webmozart/assert: ^1.2
Requires (Dev)
- phpunit/phpunit: >=6
This package is auto-updated.
Last update: 2024-09-21 14:39:07 UTC
README
LanguageDetector is a PHP library that detects the language of a text string.
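Features
- Supports more than 50 languages, including Klingon
- Very fast, no database needed
- Packaged with a 2MB dataset
- The learning step has already been done, so the library is ready to use
- Small code, small footprint
- N-gram algorithm
- Supports PHP 5.4+, 7+ and 8+, and HHVM. The latest release line, 1.4.x, only supports PHP >= 7.4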
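To give a rough idea of what an N-gram approach works with, the sketch below builds a character trigram frequency table for a short text. It is an illustration only: the helper name `ngramProfile()` is hypothetical, and the normalization and counting choices are assumptions, not the library's actual implementation.

```php
<?php

declare(strict_types=1);

/**
 * Builds a character n-gram frequency table for a text.
 * Hypothetical helper for illustration; not part of LanguageDetector.
 */
function ngramProfile(string $text, int $n = 3): array
{
    // Collapse whitespace runs and lowercase so spacing and case do not matter.
    $collapsed = preg_replace('/\s+/u', ' ', $text) ?? $text;
    $normalized = mb_strtolower(trim($collapsed), 'UTF-8');
    $length = mb_strlen($normalized, 'UTF-8');

    $counts = [];
    for ($i = 0; $i + $n <= $length; $i++) {
        // Slide a window of $n characters over the text.
        $gram = mb_substr($normalized, $i, $n, 'UTF-8');
        $counts[$gram] = ($counts[$gram] ?? 0) + 1;
    }

    // Most frequent n-grams first.
    arsort($counts);

    return $counts;
}

$profile = ngramProfile('the cat and the hat');
```

A detector built on this idea would typically compare such a profile against precomputed per-language profiles; how LanguageDetector scores that comparison is not described here.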
Installation
composer require landrok/language-detector
Quick usage
Detect a language
Instantiate a detector, pass it a text, and get the detected language.
require_once 'vendor/autoload.php';

$text = 'My tailor is rich and Alison is in the kitchen with Bob.';

$detector = new LanguageDetector\LanguageDetector();

$language = $detector->evaluate($text)->getLanguage();

echo $language; // Prints something like 'en'
Once the detector is instantiated, you can evaluate several texts with it.
require_once 'vendor/autoload.php';

// An array of texts to evaluate
$texts = [
    'My tailor is rich and Alison is in the kitchen with Bob.',
    'Mon tailleur est riche et Alison est dans la cuisine avec Bob'
];

$detector = new LanguageDetector\LanguageDetector();

foreach ($texts as $key => $text) {
    $language = $detector->evaluate($text)->getLanguage();

    echo sprintf(
        "Text %d, language=%s\n",
        $key,
        $language
    );
}
The output might look like this:
Text 0, language=en
Text 1, language=fr
In addition, a LanguageDetector instance can be used as a string.
require_once 'vendor/autoload.php';

$text = 'My tailor is rich and Alison is in the kitchen with Bob.';

$detector = new LanguageDetector\LanguageDetector();

echo $detector->evaluate($text); // Prints something like 'en'
echo $detector; // Prints something like 'en' after an evaluate()
API methods
evaluate()
Type: \LanguageDetector\LanguageDetector
Evaluates the given text.
Example
After a call to evaluate(), the result is stored and remains available for later use.
$detector->evaluate('My tailor is rich and Alison is in the kitchen with Bob.');

// Then you have access to the detected language
$detector->getLanguage(); // Returns 'en'
The calls can also be chained on one line.
$detector->evaluate('My tailor is rich and Alison is in the kitchen with Bob.')
    ->getLanguage(); // Returns 'en'
The output of evaluate() can be printed directly.
// Prints 'en'
echo $detector->evaluate('My tailor is rich and Alison is in the kitchen with Bob.');
getLanguage()
Type: string
The detected language.
Example
$detector->getLanguage(); // Returns 'en'
getLanguages()
Type: array
The list of loaded models that will be evaluated.
Example
$detector->getLanguages(); // Returns something like ['de', 'en', 'fr']
getScores()
Type: array
The list of scores for all evaluated languages.
Example
$detector->getScores();

// Returns something like
Array
(
    [en] => 0.43950135722745
    [nl] => 0.40898789832569
    [...]
    [ja] => 0
    [fa] => 0
)
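The scores can also help judge how clear-cut a detection is. The sketch below ranks a score array and flags the result as ambiguous when the two best candidates are close. The `$scores` values are hard-coded stand-ins shaped like the sample output above, the 0.05 threshold is an arbitrary assumption, and it assumes that a higher score means a better match, as the sample output suggests.

```php
<?php

declare(strict_types=1);

// Hypothetical scores, shaped like the array getScores() returns.
$scores = [
    'nl' => 0.40898789832569,
    'ja' => 0.0,
    'en' => 0.43950135722745,
    'fa' => 0.0,
];

// Sort from highest to lowest score, keeping the language codes as keys.
arsort($scores);

// Keep the two best candidates.
$top = array_slice($scores, 0, 2, true);
$codes = array_keys($top);

// Treat the detection as ambiguous when the two best scores are close.
// The 0.05 threshold is an arbitrary choice for this sketch.
$margin = $scores[$codes[0]] - $scores[$codes[1]];
$isAmbiguous = $margin < 0.05;

printf("best=%s, runner-up=%s, margin=%.4f\n", $codes[0], $codes[1], $margin);
```

With the library installed, `$scores` would come from `$detector->getScores()` after a call to `evaluate()`.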
getSupportedLanguages()
Type: array
The list of supported languages that can be evaluated.
Example
$detector->getSupportedLanguages();

// Returns something like
Array
(
    [0] => af
    [1] => ar
    [...]
    [51] => zh-cn
    [52] => zh-tw
)
getText()
Type: string
Returns the last evaluated string.
Example
$detector->getText(); // Returns 'My tailor is rich and Alison is in the kitchen with Bob.'
Options
Type: \LanguageDetector\LanguageDetector
For better performance, you can explicitly specify which models are loaded.
Example
$text = 'My tailor is rich and Alison is in the kitchen with Bob.';

// The class name is fully qualified here, as in the other examples
$detector = new LanguageDetector\LanguageDetector(null, ['en', 'fr', 'de']);

$language = $detector->evaluate($text);

echo $language; // Prints something like 'en'
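When the list of languages comes from configuration or user input, it may be worth filtering it against the supported codes first. How the library reacts to an unknown code is not documented here, so the sketch below simply drops such codes. The `$supported` list is a hard-coded, hypothetical subset standing in for the result of `getSupportedLanguages()`.

```php
<?php

declare(strict_types=1);

// Hypothetical subset of what getSupportedLanguages() might return.
$supported = ['af', 'ar', 'de', 'en', 'fr', 'zh-cn', 'zh-tw'];

// Codes requested by the application; 'xx' is deliberately invalid.
$requested = ['en', 'fr', 'xx'];

// Keep only the supported codes, reindexed from 0.
$languages = array_values(array_intersect($requested, $supported));

// With the library installed, the filtered list could then be passed on:
// $detector = new LanguageDetector\LanguageDetector(null, $languages);

print_r($languages);
```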
For one-liners only
Type: \LanguageDetector\LanguageDetector
With a static call to the detect() method, you can evaluate a given text in a single line.
Example
echo LanguageDetector\LanguageDetector::detect(
    'My tailor is rich and Alison is in the kitchen with Bob.'
); // Prints 'en'
All API methods remain available on the returned instance.
$detector = LanguageDetector\LanguageDetector::detect(
    'My tailor is rich and Alison is in the kitchen with Bob.'
);

// en
echo $detector;

// en
echo $detector->getLanguage();

// An array of all scores, see API method
print_r($detector->getScores());

// An array of all supported languages, see API method
print_r($detector->getSupportedLanguages());

// The last evaluated string
echo $detector->getText();

// Limit loaded languages for even better performance
echo LanguageDetector\LanguageDetector::detect(
    'My tailor is rich and Alison is in the kitchen with Bob.',
    ['en', 'de', 'fr', 'es']
); // en