motaword/active-laravel

Laravel middleware for MotaWord Active

1.0.0-rc18 2024-05-02 17:16 UTC

README


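This package was developed to give you a quick way to integrate the MotaWord Active service and localize your Laravel application.

🙇 Credits

This package was cloned from codebar-ag/laravel-prerender.

💡 What is MotaWord Active?

🛠️ Requirements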

  • PHP: ^7.2 to ^8.2
  • Laravel: ^6 to ^10
  • MotaWord Active access

⚙️ Installation

You can install the package via composer

composer require motaword/active

Then add the following to your .env file

MOTAWORD_ACTIVE_TOKEN=active token from your MotaWord dashboard
MOTAWORD_ACTIVE_PROJECT_ID=project ID from your MotaWord dashboard
MOTAWORD_ACTIVE_WIDGET_ID=widget ID from your MotaWord dashboard 

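That's it. Every GET request from a crawler will be forwarded to MotaWord's Active service.

✋ Disabling the Active service (SEO + CDN)

You can disable the Active service by adding the following to your .env file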

MOTAWORD_ACTIVE_SERVE_ENABLE=false

This may be useful for your local development environment.

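✏️ How it works

  1. The middleware checks whether we should show a prerendered page
    1. The middleware checks whether the request comes from a crawler (user agent string or _escaped_fragment_)
    2. The middleware checks that we are not requesting a resource (js, css, etc.)
    3. (Optional) The middleware checks that the URL is in the whitelist
    4. (Optional) The middleware checks that the URL is not in the blacklist
  2. The middleware makes a GET request to the Active service for the page's HTML
  3. The HTML is returned to the crawler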
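
The checks in step 1 can be sketched in plain PHP. This is an illustrative sketch, not the package's actual code: the function names `matchesAny` and `shouldServeFromActive` are hypothetical, the patterns are assumed to be matched against the full request URI, and the resource check is assumed to be covered by the asset patterns in the blacklist. Steps 2 and 3 (the request to the Active service) are left out because the README does not describe the request format.

```php
<?php

declare(strict_types=1);

/**
 * Returns true when $uri matches at least one asterisk pattern.
 * Assumption: each pattern is anchored to the whole request URI.
 */
function matchesAny(array $patterns, string $uri): bool
{
    foreach ($patterns as $pattern) {
        // Escape regex metacharacters, then turn the escaped "*" into ".*".
        $regex = '#^' . str_replace('\*', '.*', preg_quote($pattern, '#')) . '$#i';
        if (preg_match($regex, $uri) === 1) {
            return true;
        }
    }

    return false;
}

/**
 * Hypothetical helper mirroring checks 1.1 to 1.4 from the list above.
 * $config uses the key names of the published config file.
 */
function shouldServeFromActive(string $method, string $uri, string $userAgent, array $config): bool
{
    // Only GET requests are forwarded, and only while the service is enabled.
    if (!$config['serve_enable'] || strtoupper($method) !== 'GET') {
        return false;
    }

    // 1.1: crawler detection via user agent substring or _escaped_fragment_.
    parse_str((string) parse_url($uri, PHP_URL_QUERY), $query);
    $isCrawler = array_key_exists('_escaped_fragment_', $query);
    foreach ($config['crawler_user_agents'] as $crawler) {
        if (stripos($userAgent, $crawler) !== false) {
            $isCrawler = true;
            break;
        }
    }
    if (!$isCrawler) {
        return false;
    }

    // 1.3: an empty whitelist lets every URI through.
    if ($config['whitelist'] !== [] && !matchesAny($config['whitelist'], $uri)) {
        return false;
    }

    // 1.2 and 1.4: asset extensions and excluded paths both live in the blacklist.
    return !matchesAny($config['blacklist'], $uri);
}

$config = [
    'serve_enable' => true,
    'whitelist' => [],
    'blacklist' => ['*.js', '/api/*'],
    'crawler_user_agents' => ['googlebot'],
];

var_dump(shouldServeFromActive('GET', '/pricing', 'Googlebot/2.1', $config));       // bool(true)
var_dump(shouldServeFromActive('GET', '/assets/app.js', 'Googlebot/2.1', $config)); // bool(false)
```

Keeping the decision in one pure function, separate from the HTTP call, makes the filter rules easy to check without a running application.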

🔧 Configuration file

You can publish the configuration file with the following command

php artisan vendor:publish --provider="MotaWord\Active\MotaWordActiveServiceProvider"

Afterwards you can customize the Whitelist/Blacklist.

This is the content of the published configuration file

<?php

return [
    'active' => [
        /*
        |--------------------------------------------------------------------------
        | MotaWord Active Project Token
        |--------------------------------------------------------------------------
        |
        | Set your MotaWord Active token here. It will be sent via the X-MotaWord-Token header.
        |
        */
        'token' => env('MOTAWORD_ACTIVE_TOKEN'),
        /*
        |--------------------------------------------------------------------------
        | MotaWord Active Project ID
        |--------------------------------------------------------------------------
        |
        | Set your MotaWord Active project ID here. You can find this ID on your MotaWord dashboard, under Active > Configuration.
        |
        */
        'project_id' => env('MOTAWORD_ACTIVE_PROJECT_ID'),
        /*
        |--------------------------------------------------------------------------
        | MotaWord Active - Widget ID
        |--------------------------------------------------------------------------
        |
        | Set your MotaWord Active widget ID here.  You can find this ID on your MotaWord dashboard, under Active > Configuration.
        |
        */
        'widget_id' => env('MOTAWORD_ACTIVE_WIDGET_ID'),

        /*
        |--------------------------------------------------------------------------
        | Enable MotaWord Active
        |--------------------------------------------------------------------------
        |
        | Set this field to false to fully disable the MotaWord Active service.
        |
        */
        'serve_enable' => env('MOTAWORD_ACTIVE_SERVE_ENABLE', true),

        /*
        |--------------------------------------------------------------------------
        | Serve URL
        |--------------------------------------------------------------------------
        |
        | This is the base URL for our localization-specific CDN service, Active Serve.
        |
        */
        'serve_url' => env('MOTAWORD_ACTIVE_SERVE_URL', 'https://serve.motaword.com'),

        /*
        |--------------------------------------------------------------------------
        | Return soft HTTP status codes
        |--------------------------------------------------------------------------
        |
        | By default MotaWord Active returns soft HTTP codes. If you would like it to
        | return the real ones in case of Redirection (3xx) or status Not Found (404),
        | set this parameter to false.
        | Keep in mind that returning real HTTP codes requires appropriate meta tags
        | to be set. For more details, see github.com/motaword/active-laravel#httpheaders
        |
        */
        'soft_http_codes' => env('MOTAWORD_ACTIVE_SOFT_HTTP_STATUS_CODES', true),

        /*
        |--------------------------------------------------------------------------
        | MotaWord Active Whitelist
        |--------------------------------------------------------------------------
        |
        | Whitelist paths or patterns. You can use asterisk syntax, or regular
        | expressions (without start and end markers). If a whitelist is supplied,
        | only URLs containing a whitelist path will be prerendered. An empty
        | array means that all URIs will pass this filter. Note that this is the
        | full request URI, so including starting slash and query parameter string.
        | See github.com/JeroenNoten/Laravel-Prerender for an example.
        |
        */
        'whitelist' => [],

        /*
        |--------------------------------------------------------------------------
        | MotaWord Active Blacklist
        |--------------------------------------------------------------------------
        |
        | Blacklist paths to exclude. You can use asterisk syntax, or regular
        | expressions (without start and end markers). If a blacklist is supplied,
        | all URLs will be prerendered except ones containing a blacklist path.
        | By default, a set of asset extensions are included (this is actually only
        | necessary when you dynamically provide assets via routes). Note that this
        | is the full request URI, so including starting slash and query parameter
        | string. See github.com/JeroenNoten/Laravel-Prerender for an example.
        |
        */
        'blacklist' => [
            '*.js',
            '*.css',
            '*.xml',
            '*.less',
            '*.png',
            '*.jpg',
            '*.jpeg',
            '*.svg',
            '*.gif',
            '*.pdf',
            '*.doc',
            '*.txt',
            '*.ico',
            '*.rss',
            '*.zip',
            '*.mp3',
            '*.rar',
            '*.exe',
            '*.wmv',
            '*.doc',
            '*.avi',
            '*.ppt',
            '*.mpg',
            '*.mpeg',
            '*.tif',
            '*.wav',
            '*.mov',
            '*.psd',
            '*.ai',
            '*.xls',
            '*.mp4',
            '*.m4a',
            '*.swf',
            '*.dat',
            '*.dmg',
            '*.iso',
            '*.flv',
            '*.m4v',
            '*.torrent',
            '*.eot',
            '*.ttf',
            '*.otf',
            '*.woff',
            '*.woff2'
        ],

        /*
        |--------------------------------------------------------------------------
        | Crawler User Agents
        |--------------------------------------------------------------------------
        |
        | Requests from crawlers that do not support _escaped_fragment_ will
        | nevertheless be served with prerendered pages. You can customize
        | the list of crawlers here.
        |
        */
        'crawler_user_agents' => [
            'googlebot',
            'yahoo',
            'bingbot',
            'yandex',
            'baiduspider',
            'facebookexternalhit',
            'twitterbot',
            'rogerbot',
            'linkedinbot',
            'embedly',
            'bufferbot',
            'quora link preview',
            'showyoubot',
            'outbrain',
            'pinterest',
            'pinterest/0.',
            'developers.google.com/+/web/snippet',
            'www.google.com/webmasters/tools/richsnippets',
            'slackbot',
            'vkShare',
            'W3C_Validator',
            'redditbot',
            'Applebot',
            'WhatsApp',
            'flipboard',
            'tumblr',
            'bitlybot',
            'SkypeUriPreview',
            'nuzzel',
            'Discordbot',
            'Google Page Speed',
            'Qwantify'
        ],
    ],
];

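🤍 Whitelist

Whitelist paths or patterns. You can use asterisk (wildcard) syntax. If a whitelist is supplied, only URLs containing a whitelist path will be sent to the Active service. An empty array means that all URIs pass this filter. Note that the match is made against the full request URI, including the leading slash and the query parameter string.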

// motaword.php:
'whitelist' => [
    '/frontend/*' // only Serve pages starting with '/frontend/'
],

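🖤 Blacklist

Blacklist paths to exclude. You can use asterisk (wildcard) syntax. If a blacklist is supplied, all URLs will be sent to the Active service except those containing a blacklist path. By default, a set of asset extensions is included (this is only necessary when you serve assets dynamically via routes). Note that the match is made against the full request URI, including the leading slash and the query parameter string.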

// motaword.php:
'blacklist' => [
    '/api/*' // do not Serve pages starting with '/api/'
],
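
The two filters can be combined. The sketch below assumes the file lives at config/motaword.php and keeps the 'active' nesting of the published file; the '/frontend/admin/*' path is a made-up example. Since the published file defines the whole array, a custom blacklist most likely replaces the default asset patterns rather than extending them, so the sketch repeats the ones it still needs.

```php
<?php

// config/motaword.php (assumed location; only the two filter keys are shown)
return [
    'active' => [
        // ...token, project_id, widget_id and the other published keys go here...

        // Only URIs starting with '/frontend/' are considered at all.
        'whitelist' => [
            '/frontend/*',
        ],

        // Within those, skip a hypothetical admin area and dynamically served assets.
        'blacklist' => [
            '/frontend/admin/*',
            '*.js',
            '*.css',
            '*.png',
        ],
    ],
];
```

Following the order described under "How it works", a URI has to pass the whitelist first and is then still rejected if it matches any blacklist pattern.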

🚧 Testing locally

  1. Configure MotaWord Active via environment variables

MOTAWORD_ACTIVE_TOKEN=active token from your MotaWord dashboard
MOTAWORD_ACTIVE_PROJECT_ID=project ID from your MotaWord dashboard
MOTAWORD_ACTIVE_WIDGET_ID=widget ID from your MotaWord dashboard

  2. Test your page as a search engine crawler would. Make sure to change the URL to your local application URL

curl -A Googlebot http://127.0.0.1

  3. 🎉 That's it. You should see the HTML output from the Active service!

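📝 Changelog

Please see the CHANGELOG for more information on recent changes.

✏️ Contributing

Please see CONTRIBUTING for details.

🧑‍💻 Security Vulnerabilities

Please review our security policy on how to report security vulnerabilities.

🎭 License

The MIT License (MIT). Please see the License File for more information.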