kvz / elasticsearch

CakePHP plugin for Elasticsearch

Type: cakephp-plugin

v1.0.2 2014-08-11 09:43 UTC

This package is not auto-updated.

Last update: 2024-09-24 07:12:01 UTC


README

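Elasticsearch was recently used to index the Firefox 4 Twitter stream and make it searchable ([link](http://pedroalves-bi.blogspot.com/2011/03/firefox-4-twitter-and-nosql.html)). It is built on Lucene and has a simple JSON-based interface that you can use to store objects and search through them (even with plain curl, for instance).

That also makes it easy to update your search index in real time whenever your CakePHP models change data. All we really have to do is issue a curl PUT, DELETE, etc. after every afterSave and afterDelete, so the change reaches Elasticsearch as well.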
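
To make the idea concrete, here is a minimal sketch of how a save or a delete might translate into an HTTP request against Elasticsearch. It is not the plugin's actual code: the function name `buildIndexRequest`, the index name `main`, and the type name `ticket` are assumptions for illustration, and the sketch only describes the request rather than sending it.

```php
<?php
// Hypothetical helper (not part of the plugin): describes the HTTP request
// that would mirror a model change in Elasticsearch.
function buildIndexRequest($host, $port, $index, $type, $id, $data = null) {
	// Documents are addressed as /{index}/{type}/{id}
	$url = sprintf('http://%s:%s/%s/%s/%s', $host, $port, $index, $type, rawurlencode((string)$id));

	if ($data === null) {
		// No data: the record was deleted (afterDelete), so remove the document
		return array('method' => 'DELETE', 'url' => $url, 'body' => null);
	}

	// Data present: the record was saved (afterSave), so store it as JSON
	return array('method' => 'PUT', 'url' => $url, 'body' => json_encode($data));
}

// What an afterSave and an afterDelete for ticket 42 could produce
$save = buildIndexRequest('127.0.0.1', '9200', 'main', 'ticket', 42, array('subject' => 'Hello'));
$delete = buildIndexRequest('127.0.0.1', '9200', 'main', 'ticket', 42);

echo $save['method'], ' ', $save['url'], "\n";     // prints: PUT http://127.0.0.1:9200/main/ticket/42
echo $delete['method'], ' ', $delete['url'], "\n"; // prints: DELETE http://127.0.0.1:9200/main/ticket/42
```

Keeping the request description separate from the transport makes the mapping easy to inspect; a real behavior would hand the result to curl inside its callbacks.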

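This plugin provides:

  • A behavior that automatically updates your indexes
  • A shell task for doing full index fills
  • A generic search component that you can attach to your AppController. It intercepts search actions on all models that have search enabled and returns results as JSON for easy AJAX integration.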

Installation

You need to have the PHP curl library installed.

Server

Debian/Ubuntu

CakePHP plugin

As a fake submodule

cd ${YOURAPP}/Plugin
git clone https://github.com/kvz/cakephp-elasticsearch-plugin.git Elasticsearch

As a real submodule

cd ${REPO_ROOT}
git submodule add https://github.com/kvz/cakephp-elasticsearch-plugin.git ${YOURAPP}/Plugin/Elasticsearch

Using Composer

"require": {
	"kvz/elasticsearch": "dev-master"
}

Integration

Database

Config/database.php

<?php
class DATABASE_CONFIG {
	public $elastic = array(
		'host' => '127.0.0.1',
		'port' => '9200',
	);
	// ... etc
}
?>

Model

Models/Ticket.php (minimal example)

<?php
public $actsAs = array(
	'Elasticsearch.Searchable' => array(

	),
	// ... etc
);
?>

Models/Ticket.php (with raw SQL for handling large datasets)

<?php
public $actsAs = array(
	'Elasticsearch.Searchable' => array(
		'index_chunksize' => 1000, // per row, not per parent object anymore..
		'index_find_params' => '
			SELECT
				`tickets`.`cc` AS \'Ticket/cc\',
				`tickets`.`id` AS \'Ticket/id\',
				`tickets`.`subject` AS \'Ticket/subject\',
				`tickets`.`from` AS \'Ticket/from\',
				`tickets`.`created` AS \'Ticket/created\',
				`customers`.`customer_id` AS \'Customer/customer_id\',
				`customers`.`name` AS \'Customer/name\',
				`ticket_responses`.`id` AS \'TicketResponse/{n}/id\',
				`ticket_responses`.`from` AS \'TicketResponse/{n}/from\',
				`ticket_responses`.`created` AS \'TicketResponse/{n}/created\'
			FROM `tickets`
			LEFT JOIN `ticket_responses` ON `ticket_responses`.`ticket_id` = `tickets`.`id`
			LEFT JOIN `customers` ON `customers`.`customer_id` = `tickets`.`customer_id`
			WHERE 1=1
				{single_placeholder}
			{offset_limit_placeholder}
		',
	),
	// ... etc
);
?>
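
The aliases in the query above read like paths (`Ticket/id`, `TicketResponse/{n}/id`). One plausible interpretation, and it is an assumption rather than confirmed plugin behavior, is that the flat rows belonging to one ticket are folded into a single nested document, with `{n}` standing for the position of the joined row. The function name `foldRows` and the sample rows below are hypothetical.

```php
<?php
// Hypothetical sketch: fold flat rows with path-style aliases into one nested document.
function foldRows(array $rows) {
	$doc = array();
	foreach ($rows as $n => $row) {
		foreach ($row as $alias => $value) {
			// '{n}' is replaced by the index of the joined row
			$path = explode('/', str_replace('{n}', (string)$n, $alias));
			$ref = &$doc;
			foreach ($path as $key) {
				// Create (or reset) intermediate levels as arrays
				if (!isset($ref[$key]) || !is_array($ref[$key])) {
					$ref[$key] = array();
				}
				$ref = &$ref[$key];
			}
			$ref = $value; // the last path segment receives the column value
			unset($ref);
		}
	}
	return $doc;
}

// Two joined rows for the same ticket, one per response
$rows = array(
	array('Ticket/id' => 7, 'Ticket/subject' => 'Hello', 'TicketResponse/{n}/id' => 70),
	array('Ticket/id' => 7, 'Ticket/subject' => 'Hello', 'TicketResponse/{n}/id' => 71),
);
$doc = foldRows($rows);
print_r($doc);
```

Under this reading, the repeated `Ticket/*` columns collapse into one `Ticket` entry while each row contributes its own `TicketResponse` element, which would fit the "per row, not per parent object" comment on `index_chunksize`.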

Models/Ticket.php (full example)

<?php
public $actsAs = array(
	'Elasticsearch.Searchable' => array(
		'debug_traces' => false,
		'searcher_enabled' => false,
		'searcher_action' => 'searcher',
		'searcher_param' => 'q',
		'searcher_serializer' => 'json_encode',
		'fake_fields' => array(
			'_label' => array('Product/description', 'BasketItem/description'),
		),
		'index_name' => 'main',
		'index_chunksize' => 10000,
		'index_find_params' => array(
			'limit' => 1,
			'fields' => array(
				// It's important you name your fields.
				'subject',
				'from',
			),
			'contain' => array(
				'Customer' => array(
					// It's important you name your fields.
					'fields' => array(
						'id',
						'name',
					),
				),
				'TicketResponse' => array(
					// It's important you name your fields.
					'fields' => array(
						'id',
						'content',
						'created',
					),
				),
				'TicketObjectLink' => array(
					// It's important you name your fields.
					'fields' => array(
						'foreign_model',
						'foreign_id',
					),
				),
				'TicketPriority' => array(
					// It's important you name your fields.
					'fields' => array(
						'code',
						'from',
					),
				),
				'TicketQueue' => array(
					// It's important you name your fields.
					'fields' => array(
						'name',
					),
				),
			),
			'order' => array(
				'Ticket.id' => 'DESC',
			),
		),
		'highlight' => array(
			'pre_tags' => array('<em class="highlight">'),
			'post_tags' => array('</em>'),
			'fields' => array(
				'_all' => array(
					'fragment_size' => 200,
					'number_of_fragments' => 1,
				),
			),
		),
		'realtime_update' => false,
		'error_handler' => 'php',
		'static_url_generator' => array('{model}', 'url'),
		'enforce' => array(
			'Customer/id' => 123,
			// or a callback: '#Customer/id' => array('LiveUser', 'id'),
		),
		'highlight_excludes' => array(
			// if you're always restricting results by customer, that
			// query should probably not be part of your highlight
			// instead of dumping _all and going over all fields except Customer/id,
			// you can also exclude it:
			'Customer/id',
		),
	),
);
?>
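
The `highlight` settings above use the same keys as the `highlight` section of an Elasticsearch search request (`pre_tags`, `post_tags`, `fields`). As a rough sketch of where they could end up, the snippet below combines a search term with those settings into a request body. The function name `buildSearchBody` is hypothetical, the choice of a `query_string` query is an assumption, and the body the plugin actually sends may differ.

```php
<?php
// Hypothetical sketch: combine a search term with the behavior's 'highlight' settings.
function buildSearchBody($term, array $highlight) {
	return array(
		'query' => array(
			// Assumption: a query_string query; the plugin may use another query type
			'query_string' => array('query' => $term),
		),
		// Passed through unchanged, since the keys already match the request format
		'highlight' => $highlight,
	);
}

$highlight = array(
	'pre_tags' => array('<em class="highlight">'),
	'post_tags' => array('</em>'),
	'fields' => array(
		'_all' => array('fragment_size' => 200, 'number_of_fragments' => 1),
	),
);

$body = buildSearchBody('*kevin*', $highlight);
echo json_encode($body), "\n";
```

The encoded string is what would be sent as the `-d` payload of a `_search` request.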

Controller

To automatically enable a /<controller>/searcher URL on all models that have Elasticsearch enabled, use:

Controller/AppController.php

<?php
public $components = array(
	'Elasticsearch.Searcher',
	// ... etc
);
?>

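This component only actually fires when the Controller->modelClass has the Searchable behavior attached.

I chose this approach (rather than a dedicated SearchesController) because it makes ACL setup easier. For example, you may already have ACLs in place for /tickets/*, so /tickets/searcher is automatically restricted in the same way.

Generic search

If you want to search across all your models, you can create a dedicated search controller and instruct it to search everything, like so: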

<?php
class SearchersController extends AppController {
	public $components = array(
		'Elasticsearch.Searcher' => array(
			'model' => '_all',
			'leading_model' => 'Ticket',
		),
		// ... etc
	);

	public function searcher () {
		$this->Searcher->searchAction($this->RequestHandler->isAjax());
	}
}
?>

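One known limitation is that the Elasticsearch plugin only looks at the first configured model for configuration parameters such as searcher_param and searcher_action.

Try it

From your shell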

# Fill all indexes
./cake Elasticsearch.indexer fill _all

# Fill index with tickets
./cake Elasticsearch.indexer fill Ticket

# Try a ticket search from commandline
./cake Elasticsearch.indexer search Ticket Hello

From your browser

http://www.example.com/tickets/searcher/q:*kevin*

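jQuery integration

Let's look at an integration example that uses jQuery UI's autocomplete.

Assuming you have included that library and have an input field with the attributes id="main-search" and target="/tickets/searcher/q:*{query}*":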

// Main-search
$(document).ready(function () {
	$("#main-search").autocomplete({
		source: function(request, response) {
			// Encode the term so spaces and special characters survive in the URL
			$.getJSON($("#main-search").attr('target').replace('{query}', encodeURIComponent(request.term)), null, response);
		},
		delay: 100,
		select: function(event, ui) {
			var id = 0;
			if ((id = ui.item.id)) {
				location.href = ui.item.url;
				alert('Selected: #' +  id + ': ' + ui.item.url);
			}
			return false;
		}
	}).data( "autocomplete" )._renderItem = function( ul, item ) {
		return $("<li></li>")
			.data("item.autocomplete", item)
			.append("<a href='" + item.url + "'>" + item.html + "<br>" + item.descr + "</a>")
			.appendTo(ul);
	};
});

Notes

Useful commands

# Get Status
curl -XGET 'http://127.0.0.1:9200/_all/_status?pretty=true'

# Dangerous: Delete an entire index
curl -XDELETE 'http://127.0.0.1:9200/main'

# Dangerous: Delete an entire type
curl -XDELETE 'http://127.0.0.1:9200/main/ticket'

# Get all tickets
curl -XGET http://127.0.0.1:9200/main/ticket/_search -d '{
	"query" : {
		"field" : {
			"_all" : "**"
		}
	}
}'

# Get everything
curl -XGET 'http://127.0.0.1:9200/main/_search?pretty=true' -d '{
	"query" : {
		"field" : {
			"_all" : "**"
		}
	},
	"size" : 1000000
}'

# Refresh index
curl -XPOST 'http://127.0.0.1:9200/main/_refresh'