wyndow/fuzzywuzzy

Fuzzy string matching based on FuzzyWuzzy from Seatgeek

v0.6.0 2017-04-25 21:20 UTC

This package is not auto-updated.

Last update: 2024-09-18 13:30:03 UTC


README


Fuzzy string matching for PHP, based on the Python library of the same name.

Requirements

  • PHP 5.4 or higher

Installation

Using Composer

composer require wyndow/fuzzywuzzy

Usage

use FuzzyWuzzy\Fuzz;
use FuzzyWuzzy\Process;

$fuzz = new Fuzz();
$process = new Process($fuzz); // $fuzz is optional here, and can be omitted.

Simple Ratio

>>> $fuzz->ratio('this is a test', 'this is a test!')
=> 96

Partial Ratio

>>> $fuzz->partialRatio('this is a test', 'this is a test!')
=> 100

Token Sort Ratio

>>> $fuzz->ratio('fuzzy wuzzy was a bear', 'wuzzy fuzzy was a bear')
=> 90
>>> $fuzz->tokenSortRatio('fuzzy wuzzy was a bear', 'wuzzy fuzzy was a bear')
=> 100
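
The jump from 90 to 100 above is consistent with how the Python FuzzyWuzzy library describes token sorting: split each string into tokens, sort them, rejoin, then compare. The following is a minimal sketch of that normalization step only. `sortTokens` is a hypothetical helper written for illustration, not a function of wyndow/fuzzywuzzy, and the library's real preprocessing may differ.

```php
<?php
// Hypothetical helper, not part of wyndow/fuzzywuzzy.
// Lowercases the text, splits it on whitespace, sorts the tokens
// as strings, and joins them back with single spaces.
function sortTokens($text)
{
    $tokens = preg_split('/\s+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    sort($tokens, SORT_STRING);

    return implode(' ', $tokens);
}

$a = sortTokens('fuzzy wuzzy was a bear');
$b = sortTokens('wuzzy fuzzy was a bear');

// Both inputs normalize to the same string, so word order no longer matters.
var_dump($a === $b);
```

Once both strings are normalized this way, a plain ratio comparison sees identical input, which would explain the score of 100.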

Token Set Ratio

>>> $fuzz->tokenSortRatio('fuzzy was a bear', 'fuzzy fuzzy was a bear')
=> 84
>>> $fuzz->tokenSetRatio('fuzzy was a bear', 'fuzzy fuzzy was a bear')
=> 100

Process

>>> $choices = ['Atlanta Falcons', 'New York Jets', 'New York Giants', 'Dallas Cowboys']
>>> $c = $process->extract('new york jets', $choices, null, null, 2)
=> FuzzyWuzzy\Collection {#205}
>>> $c->toArray()
=> [
     [
       "New York Jets",
       100,
     ],
     [
       "New York Giants",
       78,
     ],
   ]
>>> $process->extractOne('cowboys', $choices)
=> [
     "Dallas Cowboys",
     90,
   ]

You can also pass additional parameters to extractOne to make it use a specific scorer.

>>> $process->extractOne('cowbell', $choices, null, [$fuzz, 'ratio'])
=> [
     "Dallas Cowboys",
     38,
   ]
>>> $process->extractOne('cowbell', $choices, null, [$fuzz, 'tokenSetRatio'])
=> [
     "Dallas Cowboys",
     57,
   ]
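
Because the scorer in the calls above is given as a PHP callable (`[$fuzz, 'ratio']`), a closure may work in the same position. This is an assumption, not something the README confirms. The sketch below defines a hypothetical scorer, `$caseInsensitiveScorer`, built only on PHP's built-in `similar_text`. It assumes the scorer receives two strings and should return an integer from 0 to 100.

```php
<?php
// Hypothetical custom scorer, not part of wyndow/fuzzywuzzy.
// Assumption: a scorer takes two strings and returns an integer 0-100.
$caseInsensitiveScorer = function ($s1, $s2) {
    // similar_text() writes the similarity percentage into $percent.
    similar_text(strtolower($s1), strtolower($s2), $percent);

    return (int) round($percent);
};

// Identical strings apart from case score 100.
var_dump($caseInsensitiveScorer('Dallas Cowboys', 'DALLAS COWBOYS'));
```

If the library does accept arbitrary callables, the closure could be passed as the fourth argument, for example `$process->extractOne('cowbell', $choices, null, $caseInsensitiveScorer)`. Check the `Process` source before relying on this.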

Caveats

Unicode strings may produce unexpected results. We intend to correct this in future versions.

Further Reading
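
One plausible source of such surprises is that many PHP string functions count bytes rather than characters. This is an assumption about the cause; the README does not say which functions the library uses. The sketch below uses only PHP built-ins and requires the mbstring extension for `mb_strlen`.

```php
<?php
// "café" written with explicit UTF-8 bytes so the example does not
// depend on the encoding of this source file.
$word = "caf\xC3\xA9";

// strlen() counts bytes: "é" occupies two bytes in UTF-8.
$byteLength = strlen($word);

// mb_strlen() counts characters when told the encoding.
$charLength = mb_strlen($word, 'UTF-8');

// levenshtein() also compares bytes, so turning "é" into "e"
// costs two edits (one substitution, one deletion) instead of one.
$byteDistance = levenshtein($word, 'cafe');

var_dump($byteLength, $charLength, $byteDistance);
```

Any similarity score derived from byte counts like these would rate accented or non-Latin text lower than a character-based comparison would.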