mbeurel / php-lemmatizer
PHP wrapper for the TreeTagger library
0.2.0
2022-10-20 05:01 UTC
Requires
- php: ^7.4 | ^8.0
- ext-json: *
- symfony/process: ^5.4
Requires (Dev)
- symfony/debug-bundle: ^5.4
- symfony/var-dumper: ^5.4
README
A simple PHP lemmatizer based on TreeTagger.
Install the TreeTagger library
See the TreeTagger website.
Install php-lemmatizer
You can install it with Composer:
composer require mbeurel/php-lemmatizer
Example
Example scripts are in a separate repository, php-lemmatizer/example.
Example code
<?php

include "vendor/autoload.php";

use PhpTreeTagger\TreeTagger;

$treeTaggerPath = __DIR__."/treeTagger"; // Path to the TreeTagger library
$nbProcess = 1;                          // Assumed value; the original example leaves $nbProcess undefined

try {
  // Initialize the library
  $treeTagger = new TreeTagger("french", array(
      "treeTaggerPath"    => $treeTaggerPath, // Path to the TreeTagger library
      "debug"             => false,           // Show debug output
      "wordUnique"        => true,            // Keep only one occurrence of each word
      "wordRemoveAccent"  => true,            // Remove all accents from words
      "nbProcess"         => $nbProcess       // Number of processes run at the same time
    )
  );

  // Drop words tagged with these types
  $treeTagger->setCleanTypeWords(
    array(
      "PRO:PER",
      "DET:ART",
      "DET:POS",
      "SENT",
      "PRP"
    )
  );

  // lemmatizer() accepts a string or an array of strings, for example:
  // ["La lemmatisation désigne un traitement lexical", "apporté à un texte en vue de son analyse"]
  $result = $treeTagger->lemmatizer("La lemmatisation désigne un traitement lexical apporté à un texte en vue de son analyse.");

  // View the result:
  var_dump($result);
  // $result = array(
  //   0 => array(
  //     "value" => "lemmatisation designer traitement lexical apporter texte vue analyse",
  //     "detail" => array(
  //       1  => array( "source" => "lemmatisation", "type" => "NOM",      "dest" => "lemmatisation" ),
  //       2  => array( "source" => "désigne",       "type" => "VER:pres", "dest" => "désigner" ),
  //       4  => array( "source" => "traitement",    "type" => "NOM",      "dest" => "traitement" ),
  //       6  => array( "source" => "apporté",       "type" => "VER:pper", "dest" => "apporter" ),
  //       7  => array( "source" => "à",             "type" => "PRP",      "dest" => "à" ),
  //       9  => array( "source" => "texte",         "type" => "NOM",      "dest" => "texte" ),
  //       10 => array( "source" => "en",            "type" => "PRP",      "dest" => "en" ),
  //       11 => array( "source" => "vue",           "type" => "NOM",      "dest" => "vue" ),
  //       12 => array( "source" => "de",            "type" => "PRP",      "dest" => "de" ),
  //       13 => array( "source" => "son",           "type" => "DET:POS",  "dest" => "son" ),
  //       14 => array( "source" => "analyse",       "type" => "NOM",      "dest" => "analyse" ),
  //       15 => array( "source" => ".",             "type" => "SENT",     "dest" => "." )
  //     )
  //   )
  // )
} catch(\Exception $e) {
  echo $e;
}
Credits
Created by Matthieu Beurel. Sponsored by Yipikai.