caco / un-markdown
A PHP library for removing Markdown "formatting" from text.
0.0.1
2020-10-03 20:56 UTC
Requires (Dev)
- devster/ubench: ^2.1
- erusev/parsedown: ^1.7
- phpunit/phpunit: ^8.5
This package is auto-updated.
Last update: 2024-09-08 08:45:55 UTC
README
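A simple PHP library for converting Markdown back to plain text.
The library is intended for turning Markdown into plain text for uses such as chat notifications.
- It does not lose the whole text structure the way a call like the following does:
  strip_tags(Parsedown::instance()->text('…'))
- It performs better than a full-featured Markdown parser with AST support.
- It decorates some content with prefixes, e.g.
  - 🔗 for links
  - 💬 for comments
  - • for unordered list items.
- 🏍️ It uses only regular expressions for the text conversion.
- It offers good compatibility with the GitHub Flavored Markdown spec (not 100% 🤷‍♂️).
- It is test-driven, with more than 65 unit tests 💪 and more than 370 assertions.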
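To make the regex-only approach and the prefix decoration concrete, here is a minimal sketch. The function stripMarkdownSketch() and its rules are hypothetical illustrations and are not the rules un-markdown ships with; it handles only inline links, bold, italics, and unordered list markers.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, NOT part of un-markdown: strips a small subset of
 * Markdown with regular expressions and decorates links and list items.
 * Assumes the prefixes contain no "$" or "\" (they are used in replacements).
 */
function stripMarkdownSketch(string $markdown, string $linkPrefix = '🔗 ', string $bullet = '• '): string
{
    // Patterns are applied in order; list markers go first so that a
    // leading "* " is not mistaken for the start of an italic span.
    $rules = [
        // "* item", "+ item", "- item" at line start become "• item"
        '/^[*+-][ \t]+/mu' => $bullet,
        // [text](url) becomes "text 🔗 url"
        '/\[([^\]]+)\]\(([^)\s]+)\)/u' => '$1 ' . $linkPrefix . '$2',
        // **bold** and __bold__ keep only the inner text
        '/(\*\*|__)(.+?)\1/u' => '$2',
        // *italic* and _italic_ keep only the inner text
        '/(\*|_)(.+?)\1/u' => '$2',
    ];

    // preg_replace() returns null on a regex error; fall back to the input.
    return preg_replace(array_keys($rules), array_values($rules), $markdown) ?? $markdown;
}

echo stripMarkdownSketch('- Hello **World**, see [example.com](https://example.com/)'), PHP_EOL;
```

A sketch this small mangles edge cases that a real rule set has to handle, such as underscores inside words or URLs, escaped characters, and code spans.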
Usage
Basic usage is as simple as with any common Markdown parsing library.
    $markdownRemover = new MarkdownRemover();
    echo $markdownRemover->strip('Hello **World**');
This produces: Hello World
The prefixes can easily be changed when constructing the instance.
    $markdownRemover = new MarkdownRemover('"Link prefix" ', '"Image prefix" ', '"Comment prefix" ', '… ');
    // The apostrophe must be escaped inside a single-quoted PHP string.
    echo $markdownRemover->strip('Wow look at this link [example.com](https://example.com/) isn\'t it **awesome**?');
This produces: Wow look at this link example.com "Link prefix" https://example.com/ isn't it awesome?
Specific rules can easily be changed, removed, or replaced.
    $classUnderTest = new MarkdownRemover();
    $classUnderTest
        ->getReplacements()[8]
        ->setReplace(function ($matches) {
            return ReEmphasis::toBold($matches[2]);
        });
    $classUnderTest
        ->getReplacements()[9]
        ->setReplace(function ($matches) {
            return ReEmphasis::toItalic($matches[2]);
        });
    $classUnderTest
        ->getReplacements()[16]
        ->setReplace(function ($matches) {
            return ReEmphasis::toMonospaced($matches[1]);
        });
    echo $classUnderTest->strip('**Test** *italic* `replacement`');
This produces: 𝗧𝗲𝘀𝘁 𝘪𝘵𝘢𝘭𝘪𝘤 𝚛𝚎𝚙𝚕𝚊𝚌𝚎𝚖𝚎𝚗𝚝
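The bold output above uses characters from the Unicode Mathematical Alphanumeric Symbols block instead of markup. How ReEmphasis does this internally is not shown here, so the following is only a sketch of one plausible mapping: the function toSansBoldSketch() is hypothetical and shifts ASCII letters and digits to their Mathematical Sans-Serif Bold code points. It assumes the mbstring extension is available for mb_chr().

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical stand-in for a helper like ReEmphasis::toBold(); the real
 * implementation may differ. Maps A-Z, a-z and 0-9 to the Mathematical
 * Sans-Serif Bold range and leaves every other character untouched.
 */
function toSansBoldSketch(string $text): string
{
    return preg_replace_callback(
        '/[A-Za-z0-9]/',
        static function (array $m): string {
            $c = ord($m[0]);
            if ($c >= ord('A') && $c <= ord('Z')) {
                $offset = 0x1D5D4 - ord('A'); // U+1D5D4 is sans-serif bold "A"
            } elseif ($c >= ord('a') && $c <= ord('z')) {
                $offset = 0x1D5EE - ord('a'); // U+1D5EE is sans-serif bold "a"
            } else {
                $offset = 0x1D7EC - ord('0'); // U+1D7EC is sans-serif bold "0"
            }
            return mb_chr($c + $offset, 'UTF-8');
        },
        $text
    ) ?? $text;
}

echo toSansBoldSketch('Test'), PHP_EOL; // prints 𝗧𝗲𝘀𝘁
```

Italic and monospaced variants would follow the same pattern with different base code points. Such characters render as styled text in most chat clients, but screen readers and search may not treat them as ordinary letters.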
Conversion example
The following Markdown
# Headings

Heading with `#` or as setext are supported.

Alt-H1 (Setext)
======

Alt-H2 (Setext)
------

## Emphasis, Strong emphasis & Strikethrough

Emphasis, aka italics, with *asterisks* or _underscores_.

Strong emphasis, aka bold, with **asterisks** or __underscores__.

Combined emphasis with **asterisks and _underscores_**.

Strikethrough uses two tildes. ~~Scratch this.~~

### Lists

1. Ordered lists gets
2. passed as they are
4. As you can see the numbering
5. is not correct

* Unordered
+ lists
- gets
+ converted
- to the bullet UTF-8 char

- [ ] Task
- [x] List
- [ ] are
- [X] supported!

#### Links and images

[I'm an inline-style link](https://www.google.com)

[I'm an inline-style link with title](https://www.google.com "Google's Homepage")

[I'm a reference-style link][Arbitrary case-insensitive reference text]

[I'm a relative reference to a repository file](../blob/master/LICENSE)

[You can use numbers for reference-style link definitions][1]

![alt text][logo]

##### Code

Inline `code` and block code is supported, too.

\`\`\`no-highlight
This is a code block, **MD** is ~~not~~ *interpreted*.
\`\`\`

###### Blockquotes

> Blockquotes are very handy in **email** to emulate reply text.
>> This line is part of the same quote.

###### Escaping

You can use the \\ character to escape MD. So you can escape the asterisk in strong e.g. \\\* to archive this \*\*Not strong\*\*.

###### Thematic breaks aka <hr>

All hr gets stripped, you should not see any chars below this line:

---
***
___

[arbitrary case-insensitive reference text]: https://www.mozilla.org
[1]: http://slashdot.org
[logo]: https://github.com/adam-p/markdown-here/raw/master/src/common/images/icon48.png "Logo Title Text 2"
is converted to the following plain text
Headings
Heading with # or as setext are supported.
Alt-H1 (Setext)
Alt-H2 (Setext)
Emphasis, Strong emphasis & Strikethrough
Emphasis, aka italics, with asterisks or underscores.
Strong emphasis, aka bold, with asterisks or underscores.
Combined emphasis with asterisks and underscores.
Strikethrough uses two tildes. Scratch this.
Lists
1. Ordered lists gets
2. passed as they are
4. As you can see the numbering
5. is not correct
• Unordered
• lists
• gets
• converted
• to the bullet UTF-8 char
• ⭕ Task
• ❌ List
• ⭕ are
• ❌ supported!
Links and images
I'm an inline-style link 🔗 https://www.google.com
I'm an inline-style link with title 🔗 https://www.google.com "Google's Homepage"
I'm a reference-style link 🔗 https://www.mozilla.org
I'm a relative reference to a repository file 🔗 ../blob/master/LICENSE
You can use numbers for reference-style link definitions 🔗 http://slashdot.org
🖼️ alt text
🖼️ alt text
Code
Inline code and block code is supported, too.
This is a code block, **MD** is ~~not~~ *interpreted*.
Blockquotes
💬 Blockquotes are very handy in email to emulate reply text.
💬 This line is part of the same quote.
Escaping
You can use the \ character to escape MD. So you can escape the asterisk in strong e.g. \* to archive this **Not strong**.
Thematic breaks aka <hr>
All hr gets stripped, you should not see any chars below this line:
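
In the conversion example, reference-style links are emitted with the URL from their definition, and the definition lines themselves disappear. One way this could be done with regular expressions alone is a two-pass replacement, sketched below. The function resolveReferenceLinksSketch() is hypothetical and does not reflect the library's actual rules; it assumes the mbstring extension for mb_strtolower() and ignores link titles and image syntax.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, NOT part of un-markdown: resolves "[text][label]"
 * links against "[label]: url" definitions using two regex passes.
 */
function resolveReferenceLinksSketch(string $markdown, string $linkPrefix = '🔗 '): string
{
    $definitions = [];

    // Pass 1: collect "[label]: url" lines (labels are case-insensitive)
    // and remove them, including any trailing title and the line break.
    $body = preg_replace_callback(
        '/^[ ]{0,3}\[([^\]]+)\]:[ \t]*(\S+).*$\R?/mu',
        static function (array $m) use (&$definitions): string {
            $definitions[mb_strtolower($m[1])] = $m[2];
            return '';
        },
        $markdown
    ) ?? $markdown;

    // Pass 2: rewrite "[text][label]" as "text 🔗 url" when the label is
    // known; unknown labels are left exactly as they were.
    return preg_replace_callback(
        '/\[([^\]]+)\]\[([^\]]+)\]/u',
        static function (array $m) use ($definitions, $linkPrefix): string {
            $key = mb_strtolower($m[2]);
            return isset($definitions[$key])
                ? $m[1] . ' ' . $linkPrefix . $definitions[$key]
                : $m[0];
        },
        $body
    ) ?? $body;
}

$input = "[Numbered reference][1]\n\n[1]: http://slashdot.org\n";
echo trim(resolveReferenceLinksSketch($input)), PHP_EOL;
```

Collecting the definitions first means a link can appear before its definition, which is the usual layout in Markdown documents.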