hegland / text-parser
A text parser. It lets you extract the parts you need from a given text.
2.0.1
2021-04-07 21:25 UTC
Requires
- php: >=7.0
Requires (Dev)
- phpunit/phpunit: ~5.3.0
README
It lets you cut out the parts you need from a given text.
Examples
Find one
<div class="id1">
    <table>
        <thead>
            <tr>
                <th>company</th>
                <th>urls</th>
                <th>zipcode & city</th>
            </tr>
        </thead>
        <tbody>
            <tr>
                <td>Hegland GmbH</td>
                <td>
                    <ul>
                        <li>http://www.companylink1.ch</li>
                        <li>http://www.companylink2.ch</li>
                        <li>http://www.companylink3.ch</li>
                    </ul>
                </td>
                <td>8400 Winterthur</td>
            </tr>
        </tbody>
    </table>
</div>
<ul>
    <li>http://www.link1.ch</li>
    <li>http://www.link2.ch</li>
    <li>http://www.link3.ch</li>
</ul>
<div class="id2">
           ^^^^^^
    <table>
        <thead>
            <tr>
                <th>name</th>
                <th>street</th>
                <th>zipcode & city</th>
            </tr>
        </thead>
        <tbody>
        ^^^^^^^
            <tr>
                <td>Roger Hegland</td>
                ^^^^=============^^^^^
                <td>Châtelstrasse 13</td>
                <td>8355 Aadorf</td>
            </tr>
        </tbody>
    </table>
</div>
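In the following example we get the name "Roger Hegland". The markers (^) are matched in order, and the text between the last two markers (=) is returned.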
$name = Parser::findOne($text, '"id2">', '<tbody>', '<td>', '</td>');
/* result = (string) 'Roger Hegland' */
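To make the marker logic concrete, here is a minimal sketch of how such a lookup could work with plain string functions. It is not the library's implementation: `find_one_sketch` is a hypothetical name, and returning `null` when a marker is missing is an assumption made for this illustration.

```php
<?php
// Hypothetical helper, NOT the library's code: walk through the start
// markers in order, then cut the text that precedes the end marker.
function find_one_sketch(string $text, string ...$markers)
{
    if (count($markers) < 2) {
        return null; // need at least one start marker and one end marker
    }
    $end = array_pop($markers); // the last marker closes the match
    $offset = 0;
    foreach ($markers as $marker) {
        $pos = strpos($text, $marker, $offset);
        if ($pos === false) {
            return null; // assumption: a missing marker yields null
        }
        $offset = $pos + strlen($marker); // continue right after this marker
    }
    $stop = strpos($text, $end, $offset);
    if ($stop === false) {
        return null;
    }
    return substr($text, $offset, $stop - $offset);
}

// Shortened version of the sample markup above.
$html = '<div class="id1"><table><tbody><tr><td>Hegland GmbH</td></tr></tbody></table></div>'
      . '<div class="id2"><table><tbody><tr><td>Roger Hegland</td></tr></tbody></table></div>';

echo find_one_sketch($html, '"id2">', '<tbody>', '<td>', '</td>'), PHP_EOL; // Roger Hegland
```

Each start marker narrows the search window, so the final `<td>` is only looked for after `"id2">` and `<tbody>` have been passed. That is why the first table's cells are skipped.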
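Find many
Note that here the first search string marks where each match ends; the strings that follow are the start markers, matched in order.
In the following example we get all link names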
<div class="id1">
    <table>
        <thead>
            <tr>
                <th>company</th>
                <th>urls</th>
                <th>zipcode & city</th>
            </tr>
        </thead>
        <tbody>
            <tr>
                <td>Hegland GmbH</td>
                <td>
                    <ul>
                        <li><a href="http://www.companylink1.ch">companylink1</a></li>
                        ^^^^                                   ^^============^^^^
                        <li><a href="http://www.companylink2.ch">companylink2</a></li>
                        ^^^^                                   ^^============^^^^
                        <li><a href="http://www.companylink3.ch">companylink3</a></li>
                        ^^^^                                   ^^============^^^^
                    </ul>
                </td>
                <td>8400 Winterthur</td>
            </tr>
        </tbody>
    </table>
</div>
<ul>
    <li><a href="http://www.link1.ch">link1</a></li>
    ^^^^                            ^^=====^^^^
    <li><a href="http://www.link2.ch">link2</a></li>
    ^^^^                            ^^=====^^^^
    <li><a href="http://www.link3.ch">link3</a></li>
    ^^^^                            ^^=====^^^^
</ul>
<div class="id2">
    <table>
        <thead>
            <tr>
                <th>name</th>
                <th>street</th>
                <th>zipcode & city</th>
            </tr>
        </thead>
        <tbody>
            <tr>
                <td>Roger Hegland</td>
                <td>Châtelstrasse 13</td>
                <td>8355 Aadorf</td>
            </tr>
        </tbody>
    </table>
</div>
Parser::findMany($text, '</a>', '<li>', '">');
/*
result = array [
    'companylink1',
    'companylink2',
    'companylink3',
    'link1',
    'link2',
    'link3'
]
*/
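The argument order of the call above (end delimiter first, then the start markers) can be illustrated with a small loop over `strpos`. This is only a sketch under assumptions: `find_many_sketch` is a hypothetical name, and returning an empty array for empty input markers is a choice made here, not documented library behaviour.

```php
<?php
// Hypothetical helper, NOT the library's code: $end is the end delimiter,
// $starts are the start markers that must appear in order before each match.
function find_many_sketch(string $text, string $end, string ...$starts): array
{
    $results = [];
    if ($end === '' || count($starts) === 0) {
        return $results; // assumption: nothing to match without both kinds of marker
    }
    $offset = 0;
    while (true) {
        foreach ($starts as $start) {
            $pos = strpos($text, $start, $offset);
            if ($pos === false) {
                return $results; // no further match
            }
            $offset = $pos + strlen($start);
        }
        $stop = strpos($text, $end, $offset);
        if ($stop === false) {
            return $results;
        }
        $results[] = substr($text, $offset, $stop - $offset);
        $offset = $stop + strlen($end); // resume after the end delimiter
    }
}

// Shortened version of the sample markup above.
$html = '<ul>'
      . '<li><a href="http://www.link1.ch">link1</a></li>'
      . '<li><a href="http://www.link2.ch">link2</a></li>'
      . '</ul>';

print_r(find_many_sketch($html, '</a>', '<li>', '">'));
```

Putting the end delimiter first lets the function accept any number of start markers as trailing arguments.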
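If you only need the link names inside the table, first narrow the text down to the table body with findOne, then run findMany on the result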
<div class="id1">
    <table>
        <thead>
            <tr>
                <th>company</th>
                <th>urls</th>
                <th>zipcode & city</th>
            </tr>
        </thead>
        <tbody>
            <tr>
                <td>Hegland GmbH</td>
                <td>
                    <ul>
                        <li><a href="http://www.companylink1.ch">companylink1</a></li>
                        ^^^^                                   ^^============^^^^
                        <li><a href="http://www.companylink2.ch">companylink2</a></li>
                        ^^^^                                   ^^============^^^^
                        <li><a href="http://www.companylink3.ch">companylink3</a></li>
                        ^^^^                                   ^^============^^^^
                    </ul>
                </td>
                <td>8400 Winterthur</td>
            </tr>
        </tbody>
    </table>
</div>
<ul>
    <li><a href="http://www.link1.ch">link1</a></li>
    <li><a href="http://www.link2.ch">link2</a></li>
    <li><a href="http://www.link3.ch">link3</a></li>
</ul>
<div class="id2">
    <table>
        <thead>
            <tr>
                <th>name</th>
                <th>street</th>
                <th>zipcode & city</th>
            </tr>
        </thead>
        <tbody>
            <tr>
                <td>Roger Hegland</td>
                <td>Châtelstrasse 13</td>
                <td>8355 Aadorf</td>
            </tr>
        </tbody>
    </table>
</div>
$text = Parser::findOne($text, '<tbody>', '</tbody>');
Parser::findMany($text, '</a>', '<li>', '">');
/*
result = array [
    'companylink1',
    'companylink2',
    'companylink3',
]
*/