apemsel / attributed-string
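A collection of fast, easy-to-use attributed string classes for PHP. An attributed string can assign multiple attributes to each character of a string, as used for example in word processing software and natural language processing.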
v3.0.0
2024-03-29 16:02 UTC
Requires
- php: >=8.1
Requires (Dev)
- phpunit/phpunit: ^10.5
This package is auto-updated.
Last update: 2024-08-30 09:04:35 UTC
README
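A collection of classes for working with attributed strings in PHP. An attributed string is a string in which every character can carry multiple attributes. Each attribute is a bitmap or boolean array with the same length as the string. This simple data structure can be used to implement many interesting features, for example:
- Text decoration, color, font, etc. in a word processor (for example, setting the "bold" attribute on a range of the string)
- Semantic text analysis systems (with attributes such as "verb" and "noun")
- Core text extraction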
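To make the "one boolean array per attribute" idea concrete, here is a minimal, self-contained sketch. It is not the library's implementation: the class `BitmapAttributes` and its methods `setRange`, `has`, and `combineAnd` are hypothetical names chosen for illustration, and it assumes byte offsets (single-byte characters).

```php
<?php
declare(strict_types=1);

// Hypothetical sketch, NOT the library's code: each attribute is stored
// as a boolean array with one cell per byte of the string.
final class BitmapAttributes
{
    /** @var array<string, bool[]> one boolean array per attribute name */
    private array $attributes = [];

    public function __construct(private string $text) {}

    // Mark $length characters starting at offset $from with $attribute.
    public function setRange(int $from, int $length, string $attribute): void
    {
        $n = strlen($this->text);
        $this->attributes[$attribute] ??= array_fill(0, $n, false);
        $end = min($from + $length, $n); // never grow past the string
        for ($i = max(0, $from); $i < $end; $i++) {
            $this->attributes[$attribute][$i] = true;
        }
    }

    // Does the character at offset $pos carry $attribute?
    public function has(string $attribute, int $pos): bool
    {
        return $this->attributes[$attribute][$pos] ?? false;
    }

    // Store the logical AND of attributes $a and $b as attribute $to.
    public function combineAnd(string $a, string $b, string $to): void
    {
        $n = strlen($this->text);
        $result = array_fill(0, $n, false);
        for ($i = 0; $i < $n; $i++) {
            $result[$i] = $this->has($a, $i) && $this->has($b, $i);
        }
        $this->attributes[$to] = $result;
    }
}

$s = new BitmapAttributes("The quick brown fox");
$s->setRange(10, 5, "color");  // "brown"
$s->setRange(12, 1, "vowel");  // the "o" in "brown"
$s->combineAnd("color", "vowel", "colored-vowel");
var_dump($s->has("colored-vowel", 12)); // bool(true)
var_dump($s->has("colored-vowel", 10)); // bool(false)
```

Because every attribute has the same length as the string, combining attributes reduces to an element-wise boolean operation, which is what keeps this structure simple.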
Example

    use apemsel\AttributedString\AttributedString;
    // The two classes below are assumed to live in the same namespace.
    use apemsel\AttributedString\MutableAttributedString;
    use apemsel\AttributedString\TokenizedAttributedString;

    // ...

    $as = new AttributedString("The quick brown fox");
    $as->setLength(10, 5, "color"); // "brown" has attribute "color"
    $as->is("color", 12); // == true
    $as->toHtml(); // "The quick <span class=\"color\">brown</span> fox"

    $as->setPattern("/[aeiou]/", "vowel"); // vowels have attribute "vowel"
    $as->getAttributes(12); // char at offset 12 has attributes ["color", "vowel"]

    // also use "or", "not", "xor" to combine attributes
    $as->combineAttributes("and", "color", "vowel", "colored-vowel");
    $as->is("colored-vowel", 12); // "o" of "brown" is a colored vowel ;-)

    $as->setSubstring("fox", "noun"); // all instances of "fox" have attribute "noun"
    $as->is("noun", 16); // true, char at offset 16 is part of a noun

    $as->searchAttribute("vowel"); // 2, first vowel starts at offset 2
    $as->searchAttribute("vowel", 0, true); // [2, 1], first vowel starting at offset 0 is at offset 2 with length 1

    // MutableAttributedString can be modified after creation and tries to be
    // smart about the attributes
    $mas = new MutableAttributedString("The brown fox");
    $mas->setLength(0, 13, "bold");
    $mas->insert(4, "quick "); // "The quick brown fox"
    $mas->is("bold", 6); // true, "quick" is now also bold since the inserted text was inside the "bold" attribute
    $mas->delete(10, 6); // "The quick fox"

    // TokenizedAttributedString tokenizes the given string, can set attributes
    // by token and maintains the tokens' offsets in the original string.
    $tas = new TokenizedAttributedString("The quick brown fox"); // tokenize using the default whitespace tokenizer
    $tas->getToken(2); // "brown"
    $tas->setTokenAttribute(2, "bold"); // "brown" is "bold"
    $tas->getTokenOffset(2); // 10, "brown" starts at offset 10
    $tas->getTokenOffsets(); // [0, 4, 10, 16], start offsets of the tokens in the string
    $tas->setTokenRangeAttribute(2, 3, "underlined"); // set tokens 2 to 3 to "underlined"
    $tas->getAttributesAtToken(2); // ["bold", "underlined"]
    $tas->lowercaseTokens(); // convert tokens to lowercase
    $tas->setTokenDictionaryAttribute(["a", "an", "the"], "article"); // set all tokens contained in given dictionary to an attribute
    $tas->getAttributesAtToken(0); // "article"
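The example above mentions two behaviours that are easy to illustrate without the library: a whitespace tokenizer that remembers token offsets, and an insert that keeps an attribute when the new text lands inside it. The sketch below uses only PHP built-ins. The function names `tokenizeWithOffsets` and `insertIntoBitmap` are hypothetical, and the rule "inserted cells are set only if both neighbours are set" is an assumption about what "smart" could mean, not the library's documented behaviour.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: split on whitespace and keep each token's byte offset.
function tokenizeWithOffsets(string $text): array
{
    $parts = preg_split(
        '/\s+/',
        $text,
        -1,
        PREG_SPLIT_NO_EMPTY | PREG_SPLIT_OFFSET_CAPTURE
    );
    if ($parts === false) {
        $parts = []; // regex error: treat as no tokens
    }
    return [
        'tokens'  => array_column($parts, 0), // token text
        'offsets' => array_column($parts, 1), // start offset of each token
    ];
}

// Hypothetical helper: make room for $length new characters at $pos.
// ASSUMED rule: the new cells inherit the attribute only when the
// characters on both sides of the insertion point already have it.
function insertIntoBitmap(array $bitmap, int $pos, int $length): array
{
    $inside = ($bitmap[$pos - 1] ?? false) && ($bitmap[$pos] ?? false);
    array_splice($bitmap, $pos, 0, array_fill(0, $length, $inside));
    return $bitmap;
}

$result = tokenizeWithOffsets("The quick brown fox");
echo json_encode($result['offsets']), "\n"; // [0,4,10,16]

$bold = array_fill(0, strlen("The brown fox"), true); // whole string is bold
$bold = insertIntoBitmap($bold, 4, strlen("quick "));
var_dump(count($bold)); // int(19)
var_dump($bold[6]);     // bool(true)
```

Keeping the offsets alongside the tokens is what lets a token-level attribute be mapped back to a character range in the original string.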
Installation
Using Composer (recommended)
composer require apemsel/attributed-string
Documentation
See the generated phpdoc API documentation in the doc/ directory, or try http://htmlpreview.github.io/?https://raw.githubusercontent.com/apemsel/AttributedString/master/doc/index.html