kvz / elasticsearch
CakePHP plugin for Elasticsearch
This package is not auto-updated.
Last update: 2024-09-24 07:12:01 UTC
README
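Elasticsearch was recently used to index the Firefox 4 Twitter stream and make it searchable ([link](http://pedroalves-bi.blogspot.com/2011/03/firefox-4-twitter-and-nosql.html)). It is based on Lucene and has a simple JSON-based interface that you can use to store objects and search through them (even with plain curl, for example).
This also makes it easy to update the search index in real time whenever your CakePHP models change data. Since all we really need to do is issue curl PUT, DELETE, and similar requests, every afterSave and afterDelete can apply the same change to Elasticsearch.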
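The mapping from model callbacks to HTTP requests can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: `esRequestFor()` is a hypothetical helper, and the `main` index and `ticket` type are borrowed from the example commands at the end of this README.

```php
<?php
// Hypothetical helper (not part of the plugin): maps a model callback to the
// HTTP request that would mirror the change in Elasticsearch.
function esRequestFor($callback, $index, $type, $id, array $doc = array())
{
    $url = sprintf(
        'http://127.0.0.1:9200/%s/%s/%s',
        $index,
        $type,
        rawurlencode((string) $id)
    );

    if ($callback === 'afterDelete') {
        // A deleted row becomes a DELETE on the document URL.
        return array('method' => 'DELETE', 'url' => $url, 'body' => null);
    }

    // A saved row (created or updated) becomes a PUT carrying the JSON document.
    return array('method' => 'PUT', 'url' => $url, 'body' => json_encode($doc));
}

$request = esRequestFor('afterSave', 'main', 'ticket', 42, array('subject' => 'Hello'));
```

The returned array could then be sent with PHP's curl extension, for instance by passing `method` to `CURLOPT_CUSTOMREQUEST` and `body` to `CURLOPT_POSTFIELDS`.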
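This plugin provides
- A behavior that automatically updates the index
- A shell task for doing a full index fill
- A generic search component that you can attach to your AppController. It intercepts search actions for every model that has search enabled and returns the results as JSON for easy AJAX integration.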
Installation
You need the PHP curl library installed.
Server
CakePHP plugin
As a fake submodule
cd ${YOURAPP}/Plugin
git clone git://github.com/kvz/cakephp-elasticsearch-plugin.git Elasticsearch
As a real submodule
cd ${REPO_ROOT}
git submodule add git://github.com/kvz/cakephp-elasticsearch-plugin.git ${YOURAPP}/Plugin/Elasticsearch
Using Composer
"require": { "kvz/elasticsearch": "dev-master" }
Integration
Database
Config/database.php
<?php
class DATABASE_CONFIG {
    public $elastic = array(
        'host' => '127.0.0.1',
        'port' => '9200',
    );
    // ... etc
}
?>
Model
Models/Ticket.php
(minimal example)
<?php
public $actsAs = array(
    'Elasticsearch.Searchable' => array(
    ),
    // ... etc
);
?>
Models/Ticket.php
(with raw SQL for large datasets)
<?php
public $actsAs = array(
    'Elasticsearch.Searchable' => array(
        'index_chunksize' => 1000, // per row, not per parent object anymore..
        'index_find_params' => '
            SELECT
                `tickets`.`cc` AS \'Ticket/cc\',
                `tickets`.`id` AS \'Ticket/id\',
                `tickets`.`subject` AS \'Ticket/subject\',
                `tickets`.`from` AS \'Ticket/from\',
                `tickets`.`created` AS \'Ticket/created\',
                `customers`.`customer_id` AS \'Customer/customer_id\',
                `customers`.`name` AS \'Customer/name\',
                `ticket_responses`.`id` AS \'TicketResponse/{n}/id\',
                `ticket_responses`.`from` AS \'TicketResponse/{n}/from\',
                `ticket_responses`.`created` AS \'TicketResponse/{n}/created\'
            FROM `tickets`
            LEFT JOIN `ticket_responses` ON `ticket_responses`.`ticket_id` = `tickets`.`id`
            LEFT JOIN `customers` ON `customers`.`customer_id` = `tickets`.`customer_id`
            WHERE 1=1
                {single_placeholder}
            {offset_limit_placeholder}
        ',
    ),
    // ... etc
);
?>
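The column aliases in the query above read like slash-separated paths into a nested document, with `{n}` standing in for the position of a joined row. How the plugin expands these aliases internally is not described here, so the following is only a sketch under that assumption. `setByPath()` and `foldRows()` are hypothetical helpers, not plugin functions.

```php
<?php
// Hypothetical helper: writes $value into $doc at a slash-separated path.
function setByPath(array &$doc, $path, $value)
{
    $ref = &$doc;
    foreach (explode('/', $path) as $key) {
        if (!isset($ref[$key]) || !is_array($ref[$key])) {
            $ref[$key] = array();
        }
        $ref = &$ref[$key];
    }
    $ref = $value;
}

// Hypothetical helper: folds JOINed rows that share a Ticket/id into one
// nested document per ticket. ASSUMPTION: {n} is the 0-based row position
// within the same ticket.
function foldRows(array $rows)
{
    $docs = array();
    $counts = array();
    foreach ($rows as $row) {
        $id = $row['Ticket/id'];
        $n = isset($counts[$id]) ? $counts[$id] : 0;
        $counts[$id] = $n + 1;
        if (!isset($docs[$id])) {
            $docs[$id] = array();
        }
        foreach ($row as $alias => $value) {
            setByPath($docs[$id], str_replace('{n}', (string) $n, $alias), $value);
        }
    }
    return $docs;
}

$rows = array(
    array('Ticket/id' => 1, 'Ticket/subject' => 'Hello', 'TicketResponse/{n}/id' => 10),
    array('Ticket/id' => 1, 'Ticket/subject' => 'Hello', 'TicketResponse/{n}/id' => 11),
);
$docs = foldRows($rows);
```

Under this reading, the two result rows for ticket 1 collapse into a single document with two `TicketResponse` entries, which would explain why `index_chunksize` counts rows rather than parent objects.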
Models/Ticket.php
(full example)
<?php
public $actsAs = array(
    'Elasticsearch.Searchable' => array(
        'debug_traces' => false,
        'searcher_enabled' => false,
        'searcher_action' => 'searcher',
        'searcher_param' => 'q',
        'searcher_serializer' => 'json_encode',
        'fake_fields' => array(
            '_label' => array('Product/description', 'BasketItem/description'),
        ),
        'index_name' => 'main',
        'index_chunksize' => 10000,
        'index_find_params' => array(
            'limit' => 1,
            'fields' => array(
                // It's important you name your fields.
                'subject',
                'from',
            ),
            'contain' => array(
                'Customer' => array(
                    // It's important you name your fields.
                    'fields' => array(
                        'id',
                        'name',
                    ),
                ),
                'TicketResponse' => array(
                    // It's important you name your fields.
                    'fields' => array(
                        'id',
                        'content',
                        'created',
                    ),
                ),
                'TicketObjectLink' => array(
                    // It's important you name your fields.
                    'fields' => array(
                        'foreign_model',
                        'foreign_id',
                    ),
                ),
                'TicketPriority' => array(
                    // It's important you name your fields.
                    'fields' => array(
                        'code',
                        'from',
                    ),
                ),
                'TicketQueue' => array(
                    // It's important you name your fields.
                    'fields' => array(
                        'name',
                    ),
                ),
            ),
            'order' => array(
                'Ticket.id' => 'DESC',
            ),
        ),
        'highlight' => array(
            'pre_tags' => array('<em class="highlight">'),
            'post_tags' => array('</em>'),
            'fields' => array(
                '_all' => array(
                    'fragment_size' => 200,
                    'number_of_fragments' => 1,
                ),
            ),
        ),
        'realtime_update' => false,
        'error_handler' => 'php',
        'static_url_generator' => array('{model}', 'url'),
        'enforce' => array(
            'Customer/id' => 123,
            // or a callback: '#Customer/id' => array('LiveUser', 'id'),
        ),
        'highlight_excludes' => array(
            // if you're always restricting results by customer, that
            // query should probably not be part of your highlight
            // instead of dumping _all and going over all fields except Customer/id,
            // you can also exclude it:
            'Customer/id',
        ),
    ),
);
?>
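The `highlight` array in the full example has the same shape as the `highlight` section of an Elasticsearch search request. As a rough illustration of where it would end up, here is a sketch that builds a request body from a search term. `buildSearchBody()` is a hypothetical helper, and the use of a `query_string` query is an assumption: the plugin may construct its query differently. Note too that the `_all` field only exists on older Elasticsearch versions.

```php
<?php
// Hypothetical helper: combines a search term with a highlight definition
// into an Elasticsearch search request body.
function buildSearchBody($term, array $highlight)
{
    return array(
        // ASSUMPTION: a query_string query; the plugin's real query may differ.
        'query' => array(
            'query_string' => array('query' => $term),
        ),
        'highlight' => $highlight,
    );
}

// Same values as the 'highlight' setting in the full example.
$highlight = array(
    'pre_tags' => array('<em class="highlight">'),
    'post_tags' => array('</em>'),
    'fields' => array(
        '_all' => array(
            'fragment_size' => 200,
            'number_of_fragments' => 1,
        ),
    ),
);

$body = buildSearchBody('*kevin*', $highlight);
$json = json_encode($body);
```

The resulting `$json` string is what would be posted to a `_search` endpoint such as the ones listed under the useful commands.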
Controller
To automatically enable a /<controller>/searcher URL on every model that has Elasticsearch enabled, use the following:
Controller/AppController.php
<?php
public $components = array(
    'Elasticsearch.Searcher',
    // ... etc
);
?>
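This component only fires when the controller's Controller->modelClass actually has the Searchable behavior attached.
I chose this approach (rather than a dedicated SearchesController) because it makes ACL setup easier. For example, you may already have ACLs in place for /tickets/*, so /tickets/searcher is automatically restricted in the same way.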
Generic search
If you want to search across all your models, you can create a dedicated search controller and tell it to search everything, like so:
<?php
class SearchersController extends AppController {
    public $components = array(
        'Elasticsearch.Searcher' => array(
            'model' => '_all',
            'leading_model' => 'Ticket',
        ),
        // ... etc
    );

    public function searcher () {
        $this->Searcher->searchAction($this->RequestHandler->isAjax());
    }
}
?>
A known limitation is that the Elasticsearch plugin only looks at the first configured model for configuration parameters such as searcher_param and searcher_action.
Try it
From your shell
# Fill all indexes
./cake Elasticsearch.indexer fill _all

# Fill index with tickets
./cake Elasticsearch.indexer fill Ticket

# Try a ticket search from commandline
./cake Elasticsearch.indexer search Ticket Hello
From your browser
http://www.example.com/tickets/searcher/q:*kevin*
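The `q:*kevin*` segment in that URL is a CakePHP-style named parameter, and its name matches the `searcher_param` setting (`q`) from the full model example. To make the URL shape concrete, here is a small standalone sketch that pulls named parameters out of a path. `namedParams()` is a hypothetical helper written for illustration; inside a real CakePHP application the framework does this parsing for you.

```php
<?php
// Hypothetical helper: collects "key:value" path segments, the shape used
// for named parameters in the URL above.
function namedParams($path)
{
    $named = array();
    foreach (explode('/', trim($path, '/')) as $segment) {
        $pos = strpos($segment, ':');
        if ($pos === false || $pos === 0) {
            // Not a key:value segment (e.g. the controller or action name).
            continue;
        }
        $key = substr($segment, 0, $pos);
        $named[$key] = rawurldecode((string) substr($segment, $pos + 1));
    }
    return $named;
}

$named = namedParams('/tickets/searcher/q:*kevin*');
```

Here `$named['q']` holds the wildcard search term `*kevin*`.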
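jQuery integration
Let's look at an integration example that uses jQuery UI's autocomplete.
Assuming you have included that library and have an input field with the attributes id="main-search" and target="/tickets/searcher/q:*{query}*".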
// Main-search
$(document).ready(function () {
    $("#main-search").autocomplete({
        source: function(request, response) {
            $.getJSON($("#main-search").attr('target').replace('{query}', request.term), null, response);
        },
        delay: 100,
        select: function(event, ui) {
            var id = 0;
            if ((id = ui.item.id)) {
                location.href = ui.item.url;
                alert('Selected: #' + id + ': ' + ui.item.url);
            }
            return false;
        }
    }).data( "autocomplete" )._renderItem = function( ul, item ) {
        return $("<li></li>")
            .data("item.autocomplete", item)
            .append("<a href='" + item.url + "'>" + item.html + "<br>" + item.descr + "</a>")
            .appendTo(ul);
    };
});
Notes
- There is also an unmaintained legacy CakePHP 1.3 branch
Useful commands
# Get Status
curl -XGET 'http://127.0.0.1:9200/_all/_status?pretty=true'

# Dangerous: Delete an entire index
curl -XDELETE 'http://127.0.0.1:9200/main'

# Dangerous: Delete an entire type
curl -XDELETE 'http://127.0.0.1:9200/main/ticket'

# Get all tickets
curl -XGET 'http://127.0.0.1:9200/main/ticket/_search' -d '{
    "query" : {
        "field" : {
            "_all" : "**"
        }
    }
}'

# Get everything
curl -XGET 'http://127.0.0.1:9200/main/_search?pretty=true' -d '{
    "query" : {
        "field" : {
            "_all" : "**"
        }
    },
    "size" : 1000000
}'

# Refresh index
curl -XPOST 'http://127.0.0.1:9200/main/_refresh'