bramus/mixed-content-scan

Scan your HTTPS-enabled website for Mixed Content


2.9 2019-02-27 13:49 UTC

This package is auto-updated.

Last update: 2024-08-28 01:43:12 UTC




Built by Bramus! (https://www.bram.us/) and contributors

About

Mixed Content Scan is a CLI script that crawls and scans HTTPS-enabled websites for Mixed Content.

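The script starts at a given URL and processes it:

  • All included img[src|srcset|data-src], iframe[src], script[src], link[href][rel="stylesheet"], object[data], form[action], embed[src], video[src], audio[src], source[src|srcset], and params[name="movie"][value] elements are checked for Mixed Content.
  • All a[href] elements that link to the same or a deeper level are processed in turn and checked for Mixed Content.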
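
The two steps above can be sketched roughly as follows. This is an illustration in Python, not the tool's actual (PHP) implementation: the class name MixedContentFinder, the helper scope_prefix, and the example.com URLs are hypothetical, srcset and params handling is left out for brevity, and "same or deeper level" is assumed to mean "starts with the directory of the root URL".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# (tag, attribute) pairs from the list above; srcset and params are omitted.
CHECKED = {
    ("img", "src"), ("img", "data-src"), ("iframe", "src"), ("script", "src"),
    ("object", "data"), ("form", "action"), ("embed", "src"),
    ("video", "src"), ("audio", "src"), ("source", "src"),
}


def scope_prefix(root_url):
    """Return scheme://host/directory/ of the root URL (assumed crawl scope)."""
    parsed = urlparse(root_url)
    directory = (parsed.path or "/").rsplit("/", 1)[0] + "/"
    return f"{parsed.scheme}://{parsed.netloc}{directory}"


class MixedContentFinder(HTMLParser):
    """Collect insecure resources and in-scope links for a single page."""

    def __init__(self, root_url, page_url):
        super().__init__()
        self.prefix = scope_prefix(root_url)
        self.page_url = page_url
        self.mixed = []  # resources that resolve to plain http://
        self.links = []  # a[href] targets at the same or a deeper level

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        for name, value in attrs.items():
            if value is None:
                continue
            is_stylesheet = (
                tag == "link" and name == "href"
                and (attrs.get("rel") or "").lower() == "stylesheet"
            )
            if (tag, name) in CHECKED or is_stylesheet:
                # Resolve relative URLs against the page before checking the scheme.
                absolute = urljoin(self.page_url, value)
                if urlparse(absolute).scheme == "http":
                    self.mixed.append(absolute)
        if tag == "a" and attrs.get("href"):
            target = urljoin(self.page_url, attrs["href"])
            if target.startswith(self.prefix):
                self.links.append(target)
```

Feeding a page's HTML to finder.feed(...) fills finder.mixed with the insecure resources and finder.links with the candidates to crawl next.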

Installation

Installation is possible using Composer:

composer global require bramus/mixed-content-scan:~2.9

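New to Composer? It is a command-line tool for dependency management in PHP. On Linux/Unix/OSX you will need to download and run the installer script (recommended), and then move composer.phar to a global location. On Windows you will need to run the installer.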

Usage

Run the script from the CLI, for example:

$ mixed-content-scan https://www.bram.us/

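The script starts scanning and gives feedback while it runs. When Mixed Content is found, the URLs that cause the mixed content warnings are shown on screen: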

$ mixed-content-scan https://www.bram.us/
[2015-01-07 12:54:20] MCS.NOTICE: Scanning https://www.bram.us/ [] []
[2015-01-07 12:54:21] MCS.INFO: 00000 - https://www.bram.us/ [] []
[2015-01-07 12:54:22] MCS.INFO: 00001 - https://www.bram.us/projects/ [] []
[2015-01-07 12:54:22] MCS.INFO: 00002 - https://www.bram.us/projects/mint-custom-title/ [] []
[2015-01-07 12:54:23] MCS.INFO: 00003 - https://www.bram.us/projects/bramusicq/ [] []
[2015-01-07 12:54:24] MCS.INFO: 00004 - https://www.bram.us/projects/gm_bramus/ [] []
[2015-01-07 12:54:24] MCS.INFO: 00005 - https://www.bram.us/projects/js_bramus/ [] []
[2015-01-07 12:54:26] MCS.INFO: 00006 - https://www.bram.us/projects/js_bramus/jsprogressbarhandler/ [] []
[2015-01-07 12:54:27] MCS.INFO: 00007 - https://www.bram.us/projects/js_bramus/lazierload/ [] []
[2015-01-07 12:54:27] MCS.INFO: 00008 - https://www.bram.us/projects/the-box-office/ [] []
[2015-01-07 12:54:28] MCS.INFO: 00009 - https://www.bram.us/projects/tinymce-plugins/ [] []
[2015-01-07 12:54:29] MCS.INFO: 00010 - https://www.bram.us/projects/tinymce-plugins/tinymce-classes-and-ids-plugin-bramus_cssextras/ [] []
[2015-01-07 12:54:30] MCS.INFO: 00011 - https://www.bram.us/projects/flashlightboxinjector/ [] []

...

[2015-01-07 12:54:45] MCS.INFO: 00036 - https://www.bram.us/2007/06/04/accessible-expanding-and-collapsing-menu/ [] []
[2015-01-07 12:54:45] MCS.ERROR: 00037 - https://www.bram.us/demo/projects/jsprogressbarhandler/ [] []
[2015-01-07 12:54:45] MCS.WARNING: https://#/urchin.js [] []
[2015-01-07 12:54:46] MCS.INFO: 00038 - https://www.bram.us/2008/07/11/ror-progress-bar-helper/ [] []
[2015-01-07 12:54:46] MCS.INFO: 00039 - https://www.bram.us/2008/11/10/jsprogressbarhandler-033/ [] []
[2015-01-07 12:54:47] MCS.ERROR: 00040 - https://www.bram.us/demo/projects/lazierload/ [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1212/1285026452_0aeb38b6e6.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1074/1273115418_a77357040a.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1096/1273106588_91f7a736c6.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1324/1216309045_31ca82f9d9.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1262/1217169586_e4b2bfa7df.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1149/1216304291_63fd48d9c4.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1366/1216301505_51b3c590ff.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1184/1216299847_c57975bed2.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1085/1217158084_a9b059d25b.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1040/1216293529_3b7c044815.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1029/1084232736_5b8c023f46.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1318/1043062251_17071a8cc7.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: http://farm2.static.flickr.com/1221/1043059543_05713e6156.jpg [] []
[2015-01-07 12:54:47] MCS.WARNING: https://#/urchin.js [] []
[2015-01-07 12:54:47] MCS.INFO: 00041 - https://www.bram.us/2011/09/30/css-regions-and-css-exclusions/ [] []
[2015-01-07 12:54:47] MCS.INFO: 00042 - https://www.bram.us/2014/06/04/good-looking-shapes-gallery/ [] []

...

Mixed Content Scan uses ANSI coloring, provided by bramus/ansi-php, so errors are easy to spot by their color.
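
For post-processing, the log excerpt above can be grouped per page. The following Python sketch is not part of the tool; it assumes uncolored output in exactly the line layout shown above, and it assumes that each WARNING line belongs to the most recent numbered page line. The function name group_warnings is hypothetical.

```python
import re
from collections import defaultdict

# "[timestamp] MCS.LEVEL: message [] []", as in the log excerpt above.
LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\] MCS\.(?P<level>[A-Z]+): (?P<message>.*?) \[\] \[\]$"
)
# Page lines look like "00037 - https://...".
PAGE = re.compile(r"^\d+ - (?P<url>\S+)$")


def group_warnings(lines):
    """Map each scanned page URL to the insecure URLs logged after it."""
    by_page = defaultdict(list)
    current_page = None
    for line in lines:
        match = LINE.match(line.strip())
        if not match:
            continue  # skip "..." separators and anything unexpected
        page = PAGE.match(match["message"])
        if page:
            current_page = page["url"]
        elif match["level"] == "WARNING" and current_page:
            by_page[current_page].append(match["message"])
    return dict(by_page)
```

Pages without warnings do not appear in the result, so the returned dictionary lists only the pages that need fixing.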

Advanced Usage / CLI Options

Mixed Content Scan supports several CLI options that modify its behavior:

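  • --output=path/to/file: File to write the results to. Defaults to php://stdout (= shown on screen).
  • --format=ansi|no-ansi|json: The formatter used to output the results
    • ansi (default): ANSI-colored line formatter
    • no-ansi: Monolog line formatter
    • json: Monolog JSON formatter
  • --no-crawl: Do not crawl scanned pages for new pages.
  • --no-check-certificate: Do not check the validity of certificates (e.g. allow self-signed or missing certificates).
  • --timeout=value-in-milliseconds: How long to wait for each request to complete. Defaults to 10000ms.
  • --delay=value-in-seconds: How long to wait between requests. Defaults to 0s.
  • --input=path/to/file: Use a file containing a list of links as the source, instead of parsing the passed-in URL. Automatically enables --no-crawl.
  • --ignore=path/to/file: File containing URL patterns to ignore. See "Ignoring links" further down for how to build this file.
  • --loglevel=level: The Monolog log level used for logging. Defaults to 200 (= info). Both numeric values and (lowercase) string values are accepted. See the Monolog log levels for more information.
  • --user-agent='user-agent': The user agent to use when crawling.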

Example: mixed-content-scan https://www.bram.us/ --ignore=./wordpress.txt --output=./results.txt --format=no-ansi

Handling errors

Mixed Content Scan uses cURL internally to perform its requests. If an error is encountered (e.g. a lost connection), the error is shown on screen:

...
[2015-01-07 12:56:43] MCS.INFO: 00003 - https://www.bram.us/projects/bramusicq/ [] []
[2015-01-07 12:56:53] MCS.CRITICAL: cURL Error (28): SSL connection timeout [] []
...

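Ignoring links

It is possible to define a list of patterns to ignore. To do so, create a text file with one PCRE pattern per line, and pass the path to that file using the --ignore option. Lines starting with # are treated as comments and are skipped.

For a WordPress installation, the ignore pattern file (distributed with Mixed Content Scan as ignorepattens/wordpress.txt) looks like this: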

# Paginated Overview Links
^{$rootUrl}/page/(\d+)/$

# Single Post Links
# ^{$rootUrl}/(\d+)/(\d+)/

# Tag Overview Links
^{$rootUrl}/tag/

# Author Overview Links
^{$rootUrl}/author/

# Category Overview Links
^{$rootUrl}/category/

# Monthly Overview Links
^{$rootUrl}/(\d+)/(\d+)/$

# Year Overview Links
^{$rootUrl}/(\d+)/$

# Comment Subscription Link
^{$rootUrl}/comment-subscriptions

# Wordpress Core File Links
^{$rootUrl}/(.*)?wp\-(.*)\.php

# Archive Links
^{$rootUrl}/archive/

# Replyto Links
\?replytocom\=

The {$rootUrl} token in each pattern is replaced with the (root) URL passed into the script.

Note: the PHP PCRE cheat sheet may come in handy.
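
To make the substitution concrete, here is a minimal Python sketch of how such a file could be interpreted. It is not the tool's own code: the function names are hypothetical, Python's re module stands in for PCRE (the patterns shown above behave the same in both), and it assumes that the trailing slash of the root URL is dropped and that the URL is regex-escaped before substitution.

```python
import re


def load_ignore_patterns(text, root_url):
    """Compile the non-comment lines of an ignore file into regex objects."""
    # Assumption: "https://www.bram.us/" becomes "https://www\.bram\.us".
    root = re.escape(root_url.rstrip("/"))
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no pattern
        patterns.append(re.compile(line.replace("{$rootUrl}", root)))
    return patterns


def is_ignored(url, patterns):
    """Return True if any ignore pattern matches somewhere in the URL."""
    return any(pattern.search(url) for pattern in patterns)
```

With the WordPress patterns above and a root URL of https://www.bram.us/, a URL such as https://www.bram.us/tag/css/ would be skipped, while https://www.bram.us/projects/ would still be scanned.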

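Known issues

Mixed Content Scan:

  • Does not take <base href="..."> tags into account (but who uses that, right?)
  • Does not scan linked .css or .js files themselves for Mixed Content
  • Does not scan inline <script> or <style> elements for Mixed Content

If you run into a problem, please file an issue (or fix it and send a pull request ;))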