codebar-ag/laravel-prerender

Laravel middleware that prerenders JavaScript-rendered pages on the fly for SEO

v11.2 2024-08-09 04:19 UTC



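This package was developed to give you a quick way to integrate the Prerender.io service into your Laravel application.

🙇 Credits

This package was copied from jeroennoten/Laravel-Prerender, originally created by jeroennoten. CasperLaiTW contributed Laravel 6, 7 and 8 compatibility in a pull request dated 14 September 2020 that was never merged.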

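💡 What is Prerender.io?

The Prerender.io middleware checks every request to see whether it comes from a crawler. If it does, the middleware sends a request to Prerender.io to fetch the static HTML of that page. If it does not, the request continues on to your normal server routes. The crawler never knows that you are using Prerender.io, because the response always goes through your server.

Google now recommends Prerender.io in its Dynamic Rendering documentation!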

🛠 Requirements

⚙️ Installation

You can install the package via Composer:

composer require codebar-ag/laravel-prerender

If you want to use the Prerender.io service, add the following to your .env file:

PRERENDER_TOKEN=token

Or, if you are using a self-hosted service, add the server address to your .env file:

PRERENDER_URL=https://prerender.services

That's it. Every GET request from a crawler will now be prerendered.

✋ Disable the service

You can disable the service by adding the following to your .env file:

PRERENDER_ENABLE=false

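This can be useful in your local development environment.

✏️ How it works

  1. The middleware checks whether a prerendered page should be served:
    1. It checks whether the request comes from a crawler (user agent string or _escaped_fragment_).
    2. It checks that the request is not for a resource (js, css, etc.).
    3. (Optional) It checks that the URL is in the whitelist.
    4. (Optional) It checks that the URL is not in the blacklist.
  2. The middleware sends a GET request to the prerender service (PhantomJS server) to fetch the prerendered HTML of the page.
  3. The HTML is returned to the crawler.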
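
The crawler check in step 1 can be pictured with a small sketch. This is an illustration only, not the package's actual code: the function name `looksLikeCrawler` and its parameters are hypothetical, and it assumes that user agents are compared as case-insensitive substrings against a list like `crawler_user_agents`.

```php
<?php

/**
 * Hypothetical helper (not part of the package's API): decide whether a
 * request looks like it comes from a crawler.
 */
function looksLikeCrawler(string $userAgent, array $query, array $crawlerUserAgents): bool
{
    // Crawlers using the legacy AJAX crawling scheme add this query parameter.
    if (array_key_exists('_escaped_fragment_', $query)) {
        return true;
    }

    // Otherwise, look for a known crawler name anywhere in the user agent,
    // ignoring case (an assumption made for this sketch).
    foreach ($crawlerUserAgents as $crawler) {
        if (stripos($userAgent, $crawler) !== false) {
            return true;
        }
    }

    return false;
}

$crawlers = ['googlebot', 'bingbot', 'Applebot'];
$browser  = 'Mozilla/5.0 (Windows NT 10.0) Firefox/128.0';

var_dump(looksLikeCrawler('Mozilla/5.0 (compatible; Googlebot/2.1)', [], $crawlers));
var_dump(looksLikeCrawler($browser, [], $crawlers));
var_dump(looksLikeCrawler($browser, ['_escaped_fragment_' => ''], $crawlers));
```

Only requests that pass this check, and the resource and whitelist/blacklist checks that follow, would be forwarded to the prerender service.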

🔧 Configuration file

You can publish the configuration file with the following command:

php artisan vendor:publish --provider="CodebarAg\LaravelPrerender\LaravelPrerenderServiceProvider"

Afterwards, you can customize the whitelist and blacklist yourself.

This is the content of the published configuration file:

<?php

return [

    /*
    |--------------------------------------------------------------------------
    | Enable Prerender
    |--------------------------------------------------------------------------
    |
    | Set this field to false to fully disable the prerender service. You
    | would probably override this in a local configuration, to disable
    | prerender on your local machine.
    |
    */

    'enable' => env('PRERENDER_ENABLE', true),

    /*
    |--------------------------------------------------------------------------
    | Prerender URL
    |--------------------------------------------------------------------------
    |
    | This is the prerender URL to the service that prerenders the pages.
    | By default, Prerender's hosted service on prerender.io is used
    | (https://service.prerender.io). But you can also set it to your
    | own server address.
    |
    */

    'prerender_url' => env('PRERENDER_URL', 'https://service.prerender.io'),

    /*
    |--------------------------------------------------------------------------
    | Return soft HTTP status codes
    |--------------------------------------------------------------------------
    |
    | By default Prerender returns soft HTTP codes. If you would like it to
    | return the real ones in case of Redirection (3xx) or status Not Found (404),
    | set this parameter to false.
    | Keep in mind that returning real HTTP codes requires appropriate meta tags
    | to be set. For more details, see github.com/prerender/prerender#httpheaders
    |
    */

    'prerender_soft_http_codes' => env('PRERENDER_SOFT_HTTP_STATUS_CODES', true),

    /*
    |--------------------------------------------------------------------------
    | Prerender Token
    |--------------------------------------------------------------------------
    |
    | If you use prerender.io as service, you need to set your prerender.io
    | token here. It will be sent via the X-Prerender-Token header. If
    | you do not provide a token, the header will not be added.
    |
    */

    'prerender_token' => env('PRERENDER_TOKEN'),

    /*
    |--------------------------------------------------------------------------
    | Prerender Whitelist
    |--------------------------------------------------------------------------
    |
    | Whitelist paths or patterns. You can use asterisk syntax, or regular
    | expressions (without start and end markers). If a whitelist is supplied,
    | only URLs containing a whitelist path will be prerendered. An empty
    | array means that all URIs will pass this filter. Note that this is the
    | full request URI, so including starting slash and query parameter string.
    | See github.com/JeroenNoten/Laravel-Prerender for an example.
    |
    */

    'whitelist' => [],

    /*
    |--------------------------------------------------------------------------
    | Prerender Blacklist
    |--------------------------------------------------------------------------
    |
    | Blacklist paths to exclude. You can use asterisk syntax, or regular
    | expressions (without start and end markers). If a blacklist is supplied,
    | all URLs will be prerendered except ones containing a blacklist path.
    | By default, a set of asset extensions is included (this is actually only
    | necessary when you dynamically provide assets via routes). Note that this
    | is the full request URI, so including starting slash and query parameter
    | string. See github.com/JeroenNoten/Laravel-Prerender for an example.
    |
    */

    'blacklist' => [
        '*.js',
        '*.css',
        '*.xml',
        '*.less',
        '*.png',
        '*.jpg',
        '*.jpeg',
        '*.svg',
        '*.gif',
        '*.pdf',
        '*.doc',
        '*.txt',
        '*.ico',
        '*.rss',
        '*.zip',
        '*.mp3',
        '*.rar',
        '*.exe',
        '*.wmv',
        '*.doc',
        '*.avi',
        '*.ppt',
        '*.mpg',
        '*.mpeg',
        '*.tif',
        '*.wav',
        '*.mov',
        '*.psd',
        '*.ai',
        '*.xls',
        '*.mp4',
        '*.m4a',
        '*.swf',
        '*.dat',
        '*.dmg',
        '*.iso',
        '*.flv',
        '*.m4v',
        '*.torrent',
        '*.eot',
        '*.ttf',
        '*.otf',
        '*.woff',
        '*.woff2'
    ],

    /*
    |--------------------------------------------------------------------------
    | Crawler User Agents
    |--------------------------------------------------------------------------
    |
    | Requests from crawlers that do not support _escaped_fragment_ will
    | nevertheless be served with prerendered pages. You can customize
    | the list of crawlers here.
    |
    */

    'crawler_user_agents' => [
        'googlebot',
        'yahoo',
        'bingbot',
        'yandex',
        'baiduspider',
        'facebookexternalhit',
        'twitterbot',
        'rogerbot',
        'linkedinbot',
        'embedly',
        'bufferbot',
        'quora link preview',
        'showyoubot',
        'outbrain',
        'pinterest',
        'pinterest/0.',
        'developers.google.com/+/web/snippet',
        'www.google.com/webmasters/tools/richsnippets',
        'slackbot',
        'vkShare',
        'W3C_Validator',
        'redditbot',
        'Applebot',
        'WhatsApp',
        'flipboard',
        'tumblr',
        'bitlybot',
        'SkypeUriPreview',
        'nuzzel',
        'Discordbot',
        'Google Page Speed',
        'Qwantify'
    ],

    /*
    |--------------------------------------------------------------------------
    | Timeout
    |--------------------------------------------------------------------------
    |
    | Specifies the Guzzle request timeout in seconds. If the request for a
    | prerendered page takes longer than this, the request will be terminated
    | and the page will be loaded without prerender. A value of 0 means no
    | timeout.
    |
    | See: https://docs.guzzlephp.org/en/stable/request-options.html#timeout
    |
    */

    'timeout' => env('PRERENDER_TIMEOUT', 0),

    /*
    |--------------------------------------------------------------------------
    | Query Parameters
    |--------------------------------------------------------------------------
    |
    | By default, request query parameters are not sent to prerender when
    | requesting the prerendered page. Setting this to true will cause the full
    | URL, including query parameters, to be sent to prerender.
    |
    */

    'full_url' => env('PRERENDER_FULL_URL', false),

];
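
As a rough illustration of the `full_url` option described in the configuration above, the following sketch shows the difference between the two settings. The helper name `urlSentToPrerender` is hypothetical and the package may build the URL differently; the sketch only assumes that the query string is everything after the first `?`.

```php
<?php

/**
 * Hypothetical helper (not part of the package's API): pick the URL that
 * would be handed to the prerender service.
 */
function urlSentToPrerender(string $requestUrl, bool $fullUrl): string
{
    if ($fullUrl) {
        // full_url = true: keep the query parameters.
        return $requestUrl;
    }

    // full_url = false (the default): drop everything from the first "?".
    return explode('?', $requestUrl, 2)[0];
}

echo urlSentToPrerender('https://domain.test/products?page=2', false), PHP_EOL;
echo urlSentToPrerender('https://domain.test/products?page=2', true), PHP_EOL;
```

With the default setting, pages that differ only by query parameters would be prerendered as the same URL, which is worth keeping in mind for paginated or filtered pages.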

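🤍 Whitelist

Whitelist paths or patterns. You can use wildcard (asterisk) syntax. If a whitelist is supplied, only URLs containing a whitelist path will be prerendered. An empty array means that all URIs pass this filter. Note that the pattern is matched against the full request URI, including the leading slash and the query parameter string.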

// prerender.php:
'whitelist' => [
    '/frontend/*' // only prerender pages starting with '/frontend/'
],

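🖤 Blacklist

Blacklist paths to exclude. You can use wildcard (asterisk) syntax. If a blacklist is supplied, all URLs are prerendered except those containing a blacklist path. By default, a set of asset extensions is included (this is only necessary when you serve assets dynamically via routes). Note that the pattern is matched against the full request URI, including the leading slash and the query parameter string.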

// prerender.php:
'blacklist' => [
    '/api/*' // do not prerender pages starting with '/api/'
],
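
To show how the two lists combine, here is a minimal sketch of wildcard matching. It is an assumption-laden illustration rather than the package's implementation: `matchesAny` and `passesLists` are hypothetical names, and each pattern is treated as anchored to the whole URI, with `*` matching any run of characters.

```php
<?php

/**
 * Hypothetical helper: true if $uri matches at least one wildcard pattern.
 */
function matchesAny(string $uri, array $patterns): bool
{
    foreach ($patterns as $pattern) {
        // Escape regex metacharacters, then turn the escaped "*" into ".*".
        $regex = '#^' . str_replace('\*', '.*', preg_quote($pattern, '#')) . '$#u';

        if (preg_match($regex, $uri) === 1) {
            return true;
        }
    }

    return false;
}

/**
 * Hypothetical helper: apply the whitelist first, then the blacklist.
 */
function passesLists(string $uri, array $whitelist, array $blacklist): bool
{
    // An empty whitelist lets every URI through this filter.
    if ($whitelist !== [] && !matchesAny($uri, $whitelist)) {
        return false;
    }

    // Anything on the blacklist is never prerendered.
    return !matchesAny($uri, $blacklist);
}

var_dump(passesLists('/frontend/about', ['/frontend/*'], ['/api/*']));
var_dump(passesLists('/api/users', [], ['/api/*']));
var_dump(passesLists('/build/app.js', [], ['*.js']));
```

Because the URI includes the query string, a pattern such as `*.js` in this sketch would not match `/build/app.js?v=3`; a pattern like `*.js*` would be needed to cover that case.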

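🚧 Testing locally

Based on the getting-started guide in the Prerender.io documentation.

  1. Download and run the prerender server locally
git clone https://github.com/prerender/prerender.git
cd prerender
npm clean-install
node server.js

The default port is 3000. You can start the Node server on a different port with

PORT=3333 node server.js
  2. Set the prerender URL
PRERENDER_URL=http://localhost:3000
  3. (Optional) Open the following URL in your browser. Make sure to change domain.test to your local domain
http://localhost:3000/render?url=https://domain.test
  4. Test your page as a crawler. Make sure to change domain.test to your local domain
curl -A Googlebot https://domain.test
  5. 🎉 Done. You should see the prerendered HTML!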

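📝 Changelog

Please see the changelog for more information on recent changes.

✏️ Contributing

Please see the contributing guide for details.

🧑‍💻 Security vulnerabilities

Please review our security policy to learn how to report security vulnerabilities.

🎭 License

The MIT License (MIT). Please see the license file for more information.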