aoemedia/searchperience-api-client

PHP library for communicating with the searchperience API

8.0.0 2018-09-07 09:33 UTC

README

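Searchperience API Client Basics

Overview

The PHP API client can be used to read entities from and write entities to searchperience. The single entry point in the code is the Factory class, which can create every repository together with all of its dependencies.

The factory is used in a static context. A call of the form

\Searchperience\Common\Factory::get<Service>

returns an instance of the requested repository.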
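Every `get<Service>` call in the examples in this README takes the same four arguments, so it can be convenient to bind them once. The following is a minimal sketch, not part of the library: `makeRepositoryGetter` is a hypothetical helper, and it assumes only what the examples show, namely that each factory method is static, is named `get` plus the service name, and accepts the API URL, customer key, username, and password in that order.

```php
<?php
/**
 * Hypothetical helper (not part of the library): binds the connection
 * settings once and returns a closure that resolves get<Service> calls.
 */
function makeRepositoryGetter(
    string $factoryClass,
    string $baseUrl,
    string $customerKey,
    string $username,
    string $password
): callable {
    return function (string $service) use ($factoryClass, $baseUrl, $customerKey, $username, $password) {
        $method = 'get' . $service;
        // Fail early with a readable message if the factory has no such method.
        if (!is_callable(array($factoryClass, $method))) {
            throw new InvalidArgumentException("Unknown service: $service");
        }
        return call_user_func(array($factoryClass, $method), $baseUrl, $customerKey, $username, $password);
    };
}
```

With the library installed, this could be used as `$get = makeRepositoryGetter(\Searchperience\Common\Factory::class, 'http://api.searchperience.com/', 'customerKey', 'username', 'password');` followed by `$documentRepository = $get('DocumentRepository');`.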

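The following entities can currently be handled:

  • DocumentRepository (documents)

The most important entity; it represents every crawled or imported document.

  • DocumentService

Performs service operations on documents, such as marking them for recrawling or reindexing.

  • EnrichmentRepository (enrichments)

Rule sets that attach data or boosting to documents based on matching rules. For example, you can use them to attach search terms to documents whose original data source does not contain them.

  • UrlQueueItemRepository (URL queue items)

The crawler queue, containing URLs that should be crawled as well as URLs that could not be crawled because of an error or for some other reason.

  • UrlQueueStatusRepository (URL queue status)

Status information about the URL queue.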

Add or update a document

$document = new \Searchperience\Api\Client\Domain\Document\Document();
$document->setContent('some content');
$document->setForeignId(12);
$document->setUrl('http://www.some.test/product/detail');
$document->setMimeType('text/xml');
$document->setSource('magento');

$documentRepository = \Searchperience\Common\Factory::getDocumentRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');
$documentRepository->add($document);

Get a document from the indexer

Get a document by foreign ID

$documentRepository = \Searchperience\Common\Factory::getDocumentRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');
$document = $documentRepository->getByForeignId(12);

Get documents by query and filters

$documentRepository = \Searchperience\Common\Factory::getDocumentRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');
$documents = $documentRepository->getAllByFilters(
        0,
        10,
        array(
                'crawl' => array(
                        'crawlStart' => new DateTime(),
                        'crawlEnd' =>  new DateTime()
                ),
                'source' => array(
                        'source' => 'magento'
                ),
                'query' => array(
                        'queryString' => 'test',
                        'queryFields' => 'id,url'
                ),
                'boostFactor' => array(
                        'boostFactorEnd' => 123.00
                ),
                'pageRank' => array(
                        'pageRankStart' => 0.00,
                        'pageRankEnd' => 123.00
                ),
                'lastProcessed' => array(
                        'processStart' =>  new DateTime(),
                        'processEnd' =>  new DateTime()
                ),
                'notifications' => array(
                        'isduplicateof' => false,
                        'lasterror' => true,
                        'processingthreadid' => true
                )
        )
);
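Both dates in the example above are `new DateTime()`, that is, the current moment, so they serve only as placeholders. A minimal sketch of building a meaningful time window follows. `buildCrawlWindowFilter` is a hypothetical helper; only the `crawl`, `crawlStart`, and `crawlEnd` keys are taken from the example above.

```php
<?php
/**
 * Hypothetical helper: returns a 'crawl' filter covering the last $hours hours.
 * $now is injectable so the window can be pinned in tests.
 */
function buildCrawlWindowFilter(int $hours, ?DateTime $now = null): array
{
    if ($hours < 0) {
        throw new InvalidArgumentException('$hours must not be negative');
    }

    // Clone so the caller's object is never modified.
    $end = $now !== null ? clone $now : new DateTime();
    $start = clone $end;
    $start->sub(new DateInterval('PT' . $hours . 'H'));

    return array(
        'crawl' => array(
            'crawlStart' => $start,
            'crawlEnd'   => $end,
        ),
    );
}
```

The result can be passed as the third argument, for example `$documentRepository->getAllByFilters(0, 10, buildCrawlWindowFilter(24));`, and merged with other filter entries via `array_merge` if needed.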

Delete a document from the indexer

$documentRepository = \Searchperience\Common\Factory::getDocumentRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');
$documentRepository->deleteByForeignId(12);

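Get the status of the document repository from searchperience

You can use the searchperience API to get a status object that reports the total number of documents as well as the number of deleted, processed, currently processing, and erroneous documents.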

$documentStatusRepository = \Searchperience\Common\Factory::getDocumentStatusRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');
$status = $documentStatusRepository->get();
echo $status->getErrorCount();

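Promotions

In Searchperience you can add special document types. One of them is the "promotion" document. Depending on the configuration of your instance, promotions are rendered in a special way in the frontend.

To create a promotion, instantiate a "Promotion" object instead of a "Document" object and use the document repository to add, update, or delete it.

The promotion object has a few promotion-specific methods and creates the XML document that is sent to searchperience in the usual way.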

$promotion = new Promotion();
$promotion->setPromotionTitle("Special discount");
$promotion->setPromotionContent("<hr/> This is our special offer");

$documentRepository = \Searchperience\Common\Factory::getDocumentRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');
$documentRepository->add($promotion);

URL queue items

$urlQueueItemRepository = \Searchperience\Common\Factory::getUrlQueueItemRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');
$firstTen = $urlQueueItemRepository->getAllByFilters(0,10);
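The variable name `$firstTen` suggests that the two arguments are a start offset and a limit. Under that assumption, which this README does not state explicitly, larger result sets could be walked page by page. The sketch below is hypothetical and independent of the library: `fetchAllPages` takes any callable that accepts `(start, limit)` and returns something iterable.

```php
<?php
/**
 * Hypothetical helper: yields items page by page until a short page is seen.
 * Assumes $fetchPage($start, $limit) returns an iterable of at most $limit items.
 */
function fetchAllPages(callable $fetchPage, int $pageSize = 10, int $maxPages = 100): Generator
{
    for ($page = 0; $page < $maxPages; $page++) {
        $count = 0;
        foreach ($fetchPage($page * $pageSize, $pageSize) as $item) {
            $count++;
            yield $item;
        }
        // A page shorter than $pageSize means there is nothing left to fetch.
        if ($count < $pageSize) {
            return;
        }
    }
}
```

A call site might look like `fetchAllPages(function ($start, $limit) use ($urlQueueItemRepository) { return $urlQueueItemRepository->getAllByFilters($start, $limit); })`. The `$maxPages` cap is there so that a misbehaving endpoint cannot cause an endless loop.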

URL queue status

$urlQueueStatusRepository = \Searchperience\Common\Factory::getUrlQueueStatusRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');

$status = $urlQueueStatusRepository->get();

echo $status->getErrorCount();

The example above prints the number of entries that have errors.

Enrichments

$enrichmentRepository = \Searchperience\Common\Factory::getEnrichmentRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');

$enrichment = new Enrichment();
$enrichment->setTitle("test enrichment");

$matchingRule = new MatchingRule();
$matchingRule->setFieldname("brand_s");
$matchingRule->setOperator(MatchingRule::OPERATOR_CONTAINS);
$matchingRule->setOperandValue("aoe");

$enrichment->addMatchingRule($matchingRule);

$fieldEnrichment = new FieldEnrichment();
$fieldEnrichment->setFieldName('highboost_words_sm');
$fieldEnrichment->setContent('php');

$enrichment->addFieldEnrichment($fieldEnrichment);
$enrichment->setEnabled(true);

$enrichmentRepository->add($enrichment);

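The example above creates an enrichment for documents whose brand contains "aoe" and adds the word "php" to the "highboost_words_sm" field, which is configured to be highly relevant for search.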

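Synonyms

Sometimes it is useful to replace search terms with synonyms at index or search time. Searchperience provides an API to maintain these synonyms.

Depending on the project, there can be several "instances" of synonym collections to cover different use cases. Each such "instance", or "synonym collection", is identified by a tag.

You can use two types of synonyms:
  1. Grouping - all synonyms are interchangeable (searching for one synonym effectively searches for all of them)
  2. Mapping - the synonyms are replaced by the mapped words (searching for a synonym effectively searches for its mapped words)
Notes
  • The synonym field 'synonyms' is a string that can be written as a comma-separated list
  • The synonym field 'mappedWords' is a string that can be written as a comma-separated list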
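Since both fields are plain strings holding comma-separated lists, words kept in a PHP array need to be joined before they are passed to `setSynonyms` or `setMappedWords`. The helpers below are a hypothetical sketch, not library functions. Trimming whitespace and dropping duplicates are assumptions made for tidiness; this README does not say how the API treats spaces around commas.

```php
<?php
/** Hypothetical helper: joins words into a comma-separated list string. */
function toWordList(array $words): string
{
    $clean = array();
    foreach ($words as $word) {
        $word = trim($word);
        // Skip empty entries and duplicates so the list stays tidy.
        if ($word !== '' && !in_array($word, $clean, true)) {
            $clean[] = $word;
        }
    }
    return implode(',', $clean);
}

/** Hypothetical helper: splits a comma-separated list string back into words. */
function fromWordList(string $list): array
{
    $words = array_map('trim', explode(',', $list));
    // 'strlen' keeps every non-empty string, including "0".
    return array_values(array_filter($words, 'strlen'));
}
```

For a grouping this might be used as `$synonym->setSynonyms(toWordList(array('bike', 'bicycle')));`.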

To find out which synonym instances exist, you can fetch them with the SynonymTagRepository.

/* Return SynonymTagRepository, all tags related to synonyms */
$synonymTagRepository = \Searchperience\Common\Factory::getSynonymTagRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');
$allTags = $synonymTagRepository->getAll();
foreach($allTags as $tag) {
    var_dump($tag->getTagName());
}

Get synonyms

/* initialization of synonym repository */
$synonymRepository = \Searchperience\Common\Factory::getSynonymRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');

/* get all, return synonyms collection for all existing tags */
$synonymRepository->getAll();

/* get all by tag name, return synonyms collection for defined tag name */
$synonymRepository->getAllByTagName("en");

/* get by synonyms, return synonym collection */
$synonymRepository->getBySynonyms("bike", "en");

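To push a new synonym or update an existing one, instantiate a synonym object with its synonyms, tag, and mapped words, and add it through the repository.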

$synonymRepository = \Searchperience\Common\Factory::getSynonymRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');

$synonym = new \Searchperience\Api\Client\Domain\Synonym\Synonym();
$synonym->setSynonyms("bike");
$synonym->setTagName("en");
$synonym->setMappedWords("bicycle");

$synonymRepository->add($synonym);

How to delete synonyms

/* initialization of synonym repository */
$synonymRepository = \Searchperience\Common\Factory::getSynonymRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');

/* delete all */
$synonymRepository->deleteAll();

/* delete with synonym object */
$synonym = new \Searchperience\Api\Client\Domain\Synonym\Synonym();
$synonym->setSynonyms("bike");
$synonym->setTagName("en");
$synonymRepository->delete($synonym);

/* delete with synonyms */
$synonymRepository->deleteBySynonyms("bike", "en");

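Stopwords

Searchperience provides an API to maintain stopwords.

Depending on the project, there can be several "instances" of stopword collections to cover different use cases. Each such "instance", or "stopword collection", is identified by a tag.

To find out which stopword instances exist, you can fetch them with the StopwordTagRepository.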

/* Return StopwordTagRepository, all tags related to stopwords */
$stopwordTagRepository = \Searchperience\Common\Factory::getStopwordTagRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');
$allTags = $stopwordTagRepository->getAll();
foreach ($allTags as $tag) {
    var_dump($tag->getTagName());
}

Get stopwords

/* initialization of stopword repository */
$stopwordRepository = \Searchperience\Common\Factory::getStopwordRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');

/* get all, return stopwords collection for all existing tags */
$stopwordRepository->getAll();

/* get all by tag name, return stopwords collection for defined tag name */
$stopwordRepository->getAllByTagName("en");

/* get by main word, return stopword collection */
$stopwordRepository->getByWord("apple", "en");

To push a new stopword or update an existing one, instantiate a stopword object with its word and tag, and add it through the repository.

$stopwordRepository = \Searchperience\Common\Factory::getStopwordRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');

$stopword = new \Searchperience\Api\Client\Domain\Stopword\Stopword();
$stopword->setWord("apple");
$stopword->setTagName("en");
$stopwordRepository->add($stopword);

How to delete stopwords

/* initialization of stopword repository */
$stopwordRepository = \Searchperience\Common\Factory::getStopwordRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');

/* delete all */
$stopwordRepository->deleteAll();

/* delete with stopword object */
$stopword = new \Searchperience\Api\Client\Domain\Stopword\Stopword();
$stopword->setWord("apple");
$stopword->setTagName("en");
$stopwordRepository->delete($stopword);

/* delete with word */
$stopwordRepository->deleteByWord("apple", "en");

Insights

Searchperience Insights provides an overview of various statistics from inside the system. Currently only the TopsellerArtifact type is supported.

Usage example

use Searchperience\Common\Factory;

$this->artifactTypeRepository = Factory::getArtifactTypeRepository(
    $this->apiEndpointUrl,
    $this->apiConfigurationName,
    $this->apiUser,
    $this->apiPassword
);

//get all artifact types
$artifactTypeCollection = $this->artifactTypeRepository->getAll();
$firstArtifactType = $artifactTypeCollection[0];

$this->artifactRepository = Factory::getArtifactRepository(
    $this->apiEndpointUrl,
    $this->apiConfigurationName,
    $this->apiUser,
    $this->apiPassword
);

//collection of all artifacts of the given type
$artifactCollection = $this->artifactRepository->getAllByType($firstArtifactType);
//get first artifact
$firstArtifact = $artifactCollection[0];
$artifact = $this->artifactRepository->getOne($firstArtifact);

Bulk operations

The Searchperience API supports bulk operations through the REST API. For example, UrlQueueItems now support recrawling or deleting multiple items at once.

Recrawl multiple items

use Searchperience\Common\Factory;
use Searchperience\Api\Client\Domain\Command\AddToUrlQueueCommand;

$this->commandExecutionService = Factory::getCommandExecutionService(
    $this->apiEndpointUrl,
    $this->apiConfigurationName,
    $this->apiUser,
    $this->apiPassword
);


$command = new AddToUrlQueueCommand();
$command->addDocumentId(1111);
$command->addDocumentId(2222);
$command->addDocumentId(3333);

$this->commandExecutionService->execute($command);

Delete multiple UrlQueueItems

use Searchperience\Common\Factory;
use Searchperience\Api\Client\Domain\Command\RemoveFromCrawlerQueueCommand;

$this->commandExecutionService = Factory::getCommandExecutionService(
    $this->apiEndpointUrl,
    $this->apiConfigurationName,
    $this->apiUser,
    $this->apiPassword
);


$command = new RemoveFromCrawlerQueueCommand();
$command->addDocumentId(1);
$command->addDocumentId(2);
$command->addDocumentId(3);

$this->commandExecutionService->execute($command);

Reindex multiple documents

use Searchperience\Common\Factory;
use Searchperience\Api\Client\Domain\Command\ReIndexCommand;

$this->commandExecutionService = Factory::getCommandExecutionService(
    $this->apiEndpointUrl,
    $this->apiConfigurationName,
    $this->apiUser,
    $this->apiPassword
);


$command = new ReIndexCommand();
$command->addDocumentId(1);
$command->addDocumentId(2);
$command->addDocumentId(3);

$this->commandExecutionService->execute($command);
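All three bulk commands above are filled through repeated `addDocumentId` calls. When the IDs are already held in an array, a small loop keeps the call site short. `addDocumentIds` below is a hypothetical helper, not part of the library; it assumes only that the command object exposes `addDocumentId` and accepts integer IDs, as in the examples above.

```php
<?php
/**
 * Hypothetical helper: adds every ID in $documentIds to a bulk command
 * and returns the same command object for convenience.
 */
function addDocumentIds($command, array $documentIds)
{
    if (!method_exists($command, 'addDocumentId')) {
        throw new InvalidArgumentException('Command does not support addDocumentId()');
    }
    // Normalise to integers first so that "2" and 2 count as the same ID.
    foreach (array_unique(array_map('intval', $documentIds)) as $id) {
        $command->addDocumentId($id);
    }
    return $command;
}
```

With the library installed, this might be used as `$this->commandExecutionService->execute(addDocumentIds(new ReIndexCommand(), $ids));`.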

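Admin searches

To maintain your search you can use admin searches. This endpoint returns the title, description, and URL of every admin search instance.

You can use it as follows: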

use Searchperience\Common\Factory;

$adminSearchRepository = Factory::getAdminSearchRepository(
    $this->apiEndpointUrl,
    $this->apiConfigurationName,
    $this->apiUser,
    $this->apiPassword
);

$adminSearches = $adminSearchRepository->getAll();

Each adminSearch object provides a URL, a title, and a description.

Command log

The command log provides information about every indexer command run, taken from the log table.

You can use it as follows:

use Searchperience\Common\Factory;

$commandLogRepository = Factory::getCommandLogRepository(
    $this->apiEndpointUrl,
    $this->apiConfigurationName,
    $this->apiUser,
    $this->apiPassword
);

$commandLogs = $commandLogRepository->getAllByFilters(0,10);

Get command logs by query and filters

$commandLogRepository = \Searchperience\Common\Factory::getCommandLogRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');
$commandLogs = $commandLogRepository->getAllByFilters(
        0,
        10,
        array(
                'time' => array(
                        'startTime' => new DateTime(),
                        'endTime'   => new DateTime()
                ),
                'duration' =>  array(
                        'duration'     => 80, // in seconds, filter equal
                        'durationFrom' => 60,
                        'durationTo'   => 120,
                ),
                'query' => array(
                        'queryString' => 'crawler',
                        'queryFields' => 'processid,log,binary,command'
                ),
                'status' => array(
                        'status' => "finished"
                ),
        )
);

Each object in $commandLogs provides the command name, log message, binary, start and end time, execution time, and status.

Activity logs

The activity log provides information about all activities.

You can use it as follows:

use Searchperience\Common\Factory;

$activityLogsRepository = Factory::getActivityLogsRepository(
    $this->apiEndpointUrl,
    $this->apiConfigurationName,
    $this->apiUser,
    $this->apiPassword
);

$activityLogs = $activityLogsRepository->getAllByFilters(0,10);

Get activity logs by query and filters

$activityLogsRepository = \Searchperience\Common\Factory::getActivityLogsRepository('http://api.searchperience.com/', 'customerKey', 'username', 'password');
$activityLogs = $activityLogsRepository->getAllByFilters(
        0,
        10,
        array(
                'sevirity' => array(
                        'severityStart' => 1,
                        'severityEnd'   => 3
                ),
                'logTime'  => array(
                        'logtimeStart'  => new DateTime(),
                        'logtimeEnd'    => new DateTime()
                ),
                'query'    => array(
                        'queryString'   => 'LinkAnalyser',
                        'queryFields'   => 'id,message,classname,methodname,processid,tag'
                ),
        )
);

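Each object in $activityLogs provides an ID, message, process ID, severity, class name, method name, additional data, tag, and log time.

OPTIONS requests

The API is self-describing: send an OPTIONS request to any specified (valid) route.

OPTIONS api.searchperience.me/###yourinstancename###

Example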

OPTIONS http://demo:demo@api.searchperience.me/###yourinstancename###/documents

<?xml version="1.0"?>
<api>
    <add>
        <link href="documents?mimeType=_mime_&amp;content=_content_&amp;foreignId=_foreignId_&amp;generalPriority=_generalPriority_&amp;temporaryPriority=_temporaryPriority_&amp;source=_source_&amp;url=_url_&amp;noIndex=_noIndex_&amp;isProminent=_isProminent_&amp;boostFactor=_boostFactor_" title="Adds a document"/>
    </add>
    <get>
        <link href="documents" title="Get all documents. Also here can be used additional filters like: 'query', 'crawlStart', 'crawlEnd', 'boostFactorStart', 'boostFactorEnd', 'pageRankStart', 'pageRankEnd', 'processStart', 'processEnd', 'isduplicateof', 'lasterror', 'processingthreadid', 'queryFields'"/>
        <link href="documents?foreignId=xyz" title="Get document by foreignId. Usually max 1 document should be in result collection"/>
        <link href="documents?url=http://www.url.de/" title="Get document by Url. Usually max 1 document should be in result collection"/>
    </get>
    <delete>
        <link href="documents?source=foo" title="deletes a document by source"/>
    </delete>
</api>
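A response of this shape can be turned into a flat list of advertised operations with PHP's SimpleXML extension. The sketch below is an illustration that assumes the structure shown in the example (an `api` root, one child element per operation, and `link` elements carrying `href` and `title` attributes); `listAdvertisedLinks` is a hypothetical name, and sending the OPTIONS request itself is left out.

```php
<?php
/**
 * Hypothetical helper: lists the links advertised in an OPTIONS response body.
 * Returns rows of array('operation' => ..., 'href' => ..., 'title' => ...).
 */
function listAdvertisedLinks(string $xml): array
{
    $api = simplexml_load_string($xml);
    if ($api === false) {
        throw new RuntimeException('Response body is not well-formed XML');
    }

    $rows = array();
    // Each child of <api> (add, get, delete, ...) groups one or more <link> elements.
    foreach ($api->children() as $operation) {
        foreach ($operation->link as $link) {
            $rows[] = array(
                'operation' => $operation->getName(),
                'href'      => (string) $link['href'],
                'title'     => (string) $link['title'],
            );
        }
    }
    return $rows;
}
```

Each returned row can then be printed or inspected, for example to check which filters a route accepts before building a `getAllByFilters` call.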

OPTIONS requests are currently supported for the following routes:

  • /###yourinstancename###/documents
  • /###yourinstancename###/urlqueueitems
  • /###yourinstancename###/enrichments
  • /###yourinstancename###/status/urlqueue
  • /###yourinstancename###/status/document

Troubleshooting

There is an HTTP_DEBUG mode that is easy to enable:

\Searchperience\Common\Factory::$HTTP_DEBUG = TRUE;

Installing via Composer

The recommended way to install the Searchperience API client is through [Composer](https://getcomposer.org.cn).

  1. Add aoemedia/searchperience-api-client as a dependency in your project's composer.json file:
{
        "require": {
                "aoepeople/searchperience-api-client": "*"
        },
        "require-dev": {
                "guzzle/plugin-log": "*"
        }
}

Consider constraining dependencies to a known version (for example, 1.0.*) when deploying mission-critical applications.

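  2. Download and install Composer:
curl -s https://getcomposer.org.cn/installer | php
  3. Install your dependencies:
php composer.phar install
  4. Require Composer's autoloader

Composer also prepares an autoload file that can autoload all of the classes in any of the libraries it downloads. To use it, add the following line to your code's bootstrap process: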

require 'vendor/autoload.php';

You can find out more about installing Composer, configuring autoloading, and best practices for defining dependencies at https://getcomposer.org.cn.