tanuel / tokenizer
Tokenizer for PHP
v0.1.1
2019-10-06 08:56 UTC
Requires
- php: >=7.1
Requires (Dev)
- friendsofphp/php-cs-fixer: ^2.15
- phpunit/phpunit: ^8.3
This package is auto-updated.
Last update: 2024-09-09 07:37:51 UTC
README
A lightweight, zero-dependency tokenizer for PHP
This is a simple yet powerful tokenizer written in PHP. You pass it your own token definitions.
Setup
composer require tanuel/tokenizer
Usage
Hint: see the unit tests for examples.
1. Create a token definition using regular expression patterns
<?php

// Token Definitions must start with T_, otherwise they won't be interpreted as Tokens
use \Tanuel\Tokenizer\AbstractTokenDefinition;
// The BaseTokenInterface is optional, but will provide some basic utilities
use \Tanuel\Tokenizer\BaseTokenInterface;

class T extends AbstractTokenDefinition implements BaseTokenInterface
{
    /**
     * A single dollar sign
     * @pattern \$
     */
    const T_DOLLAR = 'T_DOLLAR';

    /**
     * A range of digits
     * @pattern \d+
     */
    const T_DIGITS = 'T_DIGITS';
}
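The definition style above attaches each regular expression to a constant through an `@pattern` docblock tag. As a rough illustration of how such annotations can be read in plain PHP, here is a minimal sketch based on reflection. It does not use the library and is not taken from its source; `ExampleTokens` and `readPatterns()` are hypothetical names introduced only for this example.

```php
<?php

/**
 * Hypothetical example class, mirroring the definition style shown above
 * without depending on the library.
 */
class ExampleTokens
{
    /**
     * A single dollar sign
     * @pattern \$
     */
    const T_DOLLAR = 'T_DOLLAR';

    /**
     * A range of digits
     * @pattern \d+
     */
    const T_DIGITS = 'T_DIGITS';

    /** Not a token: the name lacks the T_ prefix. */
    const VERSION = '1';
}

/**
 * Collect "@pattern" annotations from the T_-prefixed constants of a class.
 * Assumption: one @pattern tag per docblock, on a line of its own.
 *
 * @return array<string, string> map of constant name to raw pattern
 */
function readPatterns(string $class): array
{
    $patterns = [];
    $reflection = new ReflectionClass($class);

    foreach ($reflection->getReflectionConstants() as $constant) {
        $name = $constant->getName();
        if (strpos($name, 'T_') !== 0) {
            continue; // only T_ constants are treated as tokens
        }

        $doc = $constant->getDocComment();
        // Capture everything after "@pattern" up to the end of that line.
        if ($doc !== false && preg_match('/@pattern\s+(\S.*)$/m', $doc, $m)) {
            $patterns[$name] = trim($m[1]);
        }
    }

    return $patterns;
}

$patterns = readPatterns(ExampleTokens::class);
print_r($patterns);
```

Reading docblocks this way requires PHP 7.1 or later (for `ReflectionClassConstant`) and a runtime that keeps doc comments.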
2. Create the tokenizer and fetch tokens
<?php

// using the token definition from above
$tokenizer = new \Tanuel\Tokenizer\Tokenizer('string to tokenize $ 123', T::class);

// get all tokens
$t = $tokenizer->getAll();

// reset internal pointer
$tokenizer->reset();

// get next token, ignoring leading whitespaces and linebreaks (T_WHITESPACE => \s+)
$token = $tokenizer->next();

// get next token, don't ignore leading whitespaces or linebreaks
$token = $tokenizer->next(false);

// expect a certain set of tokens, else throw an exception
try {
    $token = $tokenizer->nextOf([T::T_DOLLAR, T::T_DIGITS]);
} catch (\Tanuel\Tokenizer\TokenizerException $e) {
    // do something with exception
}

// forecast next token without moving the pointer forward
$forecast = $tokenizer->forecast();

// also works with expecting a certain token
$forecast = $tokenizer->forecastOf([T::T_DOLLAR, T::T_DIGITS]);
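To make the difference between `next()` and `forecast()` concrete, the following is a small standalone sketch of a pointer-based regex tokenizer. It is an illustration under stated assumptions, not the library's implementation: `MiniTokenizer` and `matchAt()` are hypothetical, tokens are returned as plain `[name, value]` arrays, and the pattern list (including `T_WHITESPACE` and `T_STRING`) is supplied by hand.

```php
<?php

/**
 * Hypothetical miniature tokenizer, for illustration only.
 */
final class MiniTokenizer
{
    /** @var string */
    private $input;
    /** @var array<string, string> token name => pattern without delimiters */
    private $patterns;
    /** @var int current byte offset into the input */
    private $offset = 0;

    public function __construct(string $input, array $patterns)
    {
        $this->input = $input;
        $this->patterns = $patterns;
    }

    /** Return [name, value] of the next token and advance; null at end of input. */
    public function next(bool $skipWhitespace = true): ?array
    {
        $match = $this->matchAt($this->offset, $skipWhitespace);
        if ($match === null) {
            return null;
        }
        $this->offset = $match[2];
        return [$match[0], $match[1]];
    }

    /** Same lookup as next(), but the pointer stays where it was. */
    public function forecast(bool $skipWhitespace = true): ?array
    {
        $match = $this->matchAt($this->offset, $skipWhitespace);
        return $match === null ? null : [$match[0], $match[1]];
    }

    public function reset(): void
    {
        $this->offset = 0;
    }

    /** Try each pattern at the given offset; returns [name, value, newOffset]. */
    private function matchAt(int $offset, bool $skipWhitespace): ?array
    {
        // \G anchors the match at the offset passed to preg_match().
        if ($skipWhitespace && preg_match('/\G\s+/', $this->input, $m, 0, $offset)) {
            $offset += strlen($m[0]);
        }
        if ($offset >= strlen($this->input)) {
            return null;
        }
        foreach ($this->patterns as $name => $pattern) {
            // Assumption: patterns never contain the "~" delimiter.
            $regex = '~\G(?:' . $pattern . ')~';
            if (preg_match($regex, $this->input, $m, 0, $offset) && $m[0] !== '') {
                return [$name, $m[0], $offset + strlen($m[0])];
            }
        }
        throw new RuntimeException("No token matches at offset $offset");
    }
}

// The first matching pattern wins, so \d+ is listed before \w+.
$tokenizer = new MiniTokenizer('price $ 42', [
    'T_WHITESPACE' => '\s+',
    'T_DOLLAR'     => '\$',
    'T_DIGITS'     => '\d+',
    'T_STRING'     => '\w+',
]);

$peeked = $tokenizer->forecast();  // looks ahead, pointer unchanged
$first  = $tokenizer->next();      // same token, pointer advances
$second = $tokenizer->next();      // leading whitespace skipped
$third  = $tokenizer->next(false); // whitespace returned as a token
$fourth = $tokenizer->next();
$end    = $tokenizer->next();      // null once the input is exhausted
```

Keeping the lookup in one private method means peeking and consuming can never disagree about what the next token is; only `next()` writes the new offset back.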
3. Working with tokens
<?php

/** @var $token \Tanuel\Tokenizer\Token */

// get the matched value
$token->getValue();

// check if the token matches a certain name
$token->eq('T_STRING', 'T_DOLLAR'); // true if it is a T_STRING or T_DOLLAR

// get the token definition info
$token->getDefinition()->getName(); // e.g. T_STRING
$token->getDefinition()->getPattern(); // e.g. \w+ for T_STRING

// get metainfo
$token->getLine();
$token->getEndLine();
$token->getColumn();
$token->getEndColumn();
$token->getLineCount();
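The README does not say how the position metadata is defined. As one plausible reading, the sketch below derives line and column values from a byte offset, assuming 1-based lines and columns, an inclusive end column, and `\n` line breaks. `describePosition()` is a hypothetical helper and not part of the library; the library's own conventions may differ.

```php
<?php

/**
 * Hypothetical helper: derive position metadata for a matched value.
 * Assumptions: 1-based lines and columns, inclusive end column,
 * "\n" line breaks, and byte (not character) offsets.
 *
 * @return array<string, int>
 */
function describePosition(string $input, int $offset, string $value): array
{
    $before = substr($input, 0, $offset);

    // Line number: count the line breaks that precede the token.
    $line = substr_count($before, "\n") + 1;

    // Column: distance from the last preceding line break.
    $lastBreak = strrpos($before, "\n");
    $column = $lastBreak === false ? $offset + 1 : $offset - $lastBreak;

    // A token may span several lines, e.g. a block of whitespace.
    $lineCount = substr_count($value, "\n") + 1;
    $endLine = $line + $lineCount - 1;

    $lastBreakInValue = strrpos($value, "\n");
    $endColumn = $lastBreakInValue === false
        ? $column + strlen($value) - 1
        : strlen($value) - $lastBreakInValue - 1;

    return [
        'line'      => $line,
        'endLine'   => $endLine,
        'column'    => $column,
        'endColumn' => $endColumn,
        'lineCount' => $lineCount,
    ];
}

// "ef" starts at byte offset 6, on the second line of the input.
$single = describePosition("ab\ncd ef", 6, 'ef');

// A value that itself contains a line break.
$multi = describePosition("x\nyz", 0, "x\nyz");
```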