EXALEAD HTTP/REST
PUSH API GUIDE
Legal Notice
Information in this document is subject to change without notice and does not represent a
commitment on the part of Exalead S.A.
This document is Copyright © by Exalead S.A. No part of this document may be reproduced or
transmitted in any form or by any means, electronic or mechanical, including photocopying or
recording, for any purpose without the express written permission of Exalead S.A.
Exalead, ExaScript, CloudView and its associated logos are Registered Trademarks of Exalead
S.A.
This document makes reference to other names and products that are Trademarks of their
respective owners.
The software described in this document is Copyright © by Exalead S.A. and is supplied under a
licence agreement, a nondisclosure agreement, or both. It is against the law to copy or transmit
this software by any medium except as specifically sanctioned by these agreements.
Table of Contents
Preface
API Conventions
POST add_document
POST add_document_list
POST delete_document
POST delete_document_list
POST delete_document_collection
GET get_document_status
GET get_document_status_list
POST set_checkpoint
GET get_checkpoint
POST clear_all_checkpoints
POST open_document_status_collection_iterator
POST next_document_status_collection_iterator
POST next_batch_document_status_collection_iterator
POST close_document_status_collection_iterator
POST open_checkpoint_iterator
POST next_checkpoint_iterator
POST next_batch_checkpoint_iterator
POST close_checkpoint_iterator
POST create_partial_flush_task
POST get_task_status
PREFACE
Revision history
Ver.   Date            Author     Document revision history
1.0    May 31, 2007    A. Derbel  Creation of document.
4.6.0  March 4, 2008   A. Derbel  New versioning and format for new exalead one:enterprise v4.6 release.
4.6.1  March 31, 2008  A. Derbel  Updates for version 2 of the Push API for exalead one:enterprise v4.6.
Documentation set
The exalead one:enterprise documentation set is as follows. The technical and user reference guides are:
Document number Title
EN.120.001 Exalead one:enterprise Administration
EN.120.002 Exalead one:enterprise Connectors
EN.120.003 Exalead one:enterprise Security
EN.120.004 Exalead one:enterprise HTML Front-end
EN.120.007 Exalead one:enterprise User
EN.120.008 Exalead one:enterprise Installation
Document number Title
EN.120.002.1 Exalead one:enterprise HTTP/REST Push API
EN.120.002.2 Exalead one:enterprise Push API
EN.120.018 Exalead one:enterprise Search Front-end API: XML v10
EN.120.020 Exalead one:enterprise Search Front-end API: XML v3.1
EN.120.021 Exalead one:enterprise Query Language
Document number Title
EN.120.010 Exalead one:enterprise Categorization
EN.120.011 Exalead one:enterprise Filtering
Document audience
This document explains how to use the exalead one:enterprise Push API. It is intended for
readers who have experience with:
• HTTP API
Purpose and application scope
The Exalead PUSH API is the API available to Exalead partners and contractors for indexing a new
data source with the exalead one:enterprise product.
The PUSH API is natively a simple HTTP/REST API. API users may either call this REST API
directly from the language of their choice, or use the provided client-side wrappers for the API (C#, Java,
Ruby, Perl, ExaScript).
This document describes version 2 of the API at the HTTP/REST level, including:
– HTTP URL syntax,
– parameter conventions,
– type serialization/de-serialization,
– error handling and
– security aspects.
For more details about general concepts, refer to the Push API documentation.
API users should refer to the general Push API documentation or to the documentation of their
client-side wrapper for more detailed explanations and examples.
Acronym Means
HTTP Hypertext Transfer Protocol
URI Uniform Resource Identifier
PAPI Push API
IP Internet Protocol
URL Uniform Resource Locator
XML Extensible Markup Language
Terminology
Some important terms used in this document are defined below.
Term Definition
Command A "Command" is the function requested on the server
(add_document, delete_document, ...).
URL The "URL" is the address to which the client must send its POST or
GET request. It includes the HTTP Push Server host and port number, the
conventional prefix for connector commands, the source name and,
lastly, the Command.
For example:
http://myexaserver:10011/private/connectors/mysource/add_document
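As a minimal sketch of how such a URL is assembled from its parts, the helper below joins the host, port, conventional prefix, source name and Command. The name build_command_url is an illustrative assumption and is not part of the Exalead client wrappers; the host, port and source values are those of the example above.

```python
from urllib.parse import quote

# Conventional prefix for connector commands, taken from the example URL above.
CONNECTORS_PREFIX = "/private/connectors"


def build_command_url(host: str, port: int, source: str, command: str) -> str:
    """Assemble the URL for a Push API Command (hypothetical helper)."""
    # Percent-encode the source name so that it stays a single path segment.
    return f"http://{host}:{port}{CONNECTORS_PREFIX}/{quote(source, safe='')}/{command}"


url = build_command_url("myexaserver", 10011, "mysource", "add_document")
print(url)
```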
SECTION 1
API CONVENTIONS AND METHODS
API Conventions
The recommended way to send each parameter to the server is specified in the
Command description: [URL] or [FORM].
Method Description
GET The GET method.
POST The POST method. The body can be encoded as MIME multipart/form-data
(RFC 2388) or as application/x-www-form-urlencoded.
HTTP Header Parameters
To detect hazardous mixes of the two API versions between the client side and the server
side, an extra parameter is added to every request.
Parameter Description
Papi-Version The possible values for the PAPI version (at this moment)
are the following:
• PAPI_v1.0
• PAPI_v2.0
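The following sketch shows one way to attach the Papi-Version header to every request, using only the Python standard library. The helper name versioned_request is an assumption made for this example; the request is built but not sent.

```python
from typing import Optional
from urllib.request import Request

# This guide describes version 2 of the API.
PAPI_VERSION = "PAPI_v2.0"


def versioned_request(url: str, data: Optional[bytes] = None, method: str = "GET") -> Request:
    """Build (but do not send) a request carrying the Papi-Version header."""
    return Request(url, data=data, method=method, headers={"Papi-Version": PAPI_VERSION})


req = versioned_request("http://myexaserver:10011/private/connectors/mysource/get_checkpoint")
# urllib stores the header name as "Papi-version"; HTTP header names are case-insensitive.
print(req.header_items())
```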
Command response
Processing of Exalead PUSH API operations may be asynchronous. This means that
requested add or delete operations are accepted, but there is no guarantee about when they
will be performed. Subsequent errors can be retrieved asynchronously by retrieving the
status of the Document.
Some errors, however, can occur at a lower level. This section presents the general
default HTTP responses; special cases are described below for the Commands
concerned.
In the Push API, only the specified methods are authorized for each Command.
<error>
<type> ... </type>
<short_message> ... </short_message>
<message> ... </message>
</error>
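A client can turn an error body of this shape into a simple structure. The sketch below uses the standard xml.etree.ElementTree module; the function name parse_error and the sample values are assumptions made for illustration (in particular, this guide does not confirm which strings appear in the type element).

```python
import xml.etree.ElementTree as ET


def parse_error(body: str) -> dict:
    """Extract type, short_message and message from an <error> body."""
    root = ET.fromstring(body)
    if root.tag != "error":
        raise ValueError(f"not an error document: <{root.tag}>")
    return {
        name: (root.findtext(name) or "").strip()
        for name in ("type", "short_message", "message")
    }


# Illustrative body; the element values are made up for this example.
sample = """<error>
  <type> InvalidFilterError </type>
  <short_message> bad filter </short_message>
  <message> The filter syntax is incorrect. </message>
</error>"""
print(parse_error(sample))
```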
POST add_document
Parameter Description
PAPI_uri [URL] The URI parameter is directly the string of the document's URI.
PAPI_stamp [FORM] (optional) The stamp parameter is the string representing the document's Stamp.
PAPI_meta_<metaname> [FORM] The meta_* parameter is a string containing the value of the metadata referenced by
metaname.
A metadata does not have to be unique, but check the metadata description page (refer to the
Exalead Push API document) to see for which metadata multiple values make sense.
PAPI_part_bytes:<name> [FORM] The part_bytes parameter is the content of the document's part that is identified by
'name'.
PAPI_part_ext:<name> [FORM] (optional) The part_ext parameter is the extension hint of the document's part that is identified
by 'name'.
PAPI_part_mime:<name> [FORM] (optional) The part_mime parameter is the mime hint of the document's part that is identified by
'name'.
Response (default)
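As a hedged illustration of how the [URL] and [FORM] parameters of add_document could be split, the sketch below builds (without sending) a POST request with an application/x-www-form-urlencoded body. The helper name build_add_document, the metadata name "author" and the part name "master" are assumptions made for this example.

```python
from urllib.parse import parse_qsl, urlencode
from urllib.request import Request


def build_add_document(base_url, uri, stamp=None, metas=(), parts=None):
    """Build (but do not send) a POST add_document request.

    metas is a sequence of (metaname, value) pairs, so a metaname may repeat.
    parts maps a part name to its content as bytes.
    """
    # PAPI_uri is marked [URL], so it goes in the query string.
    url = f"{base_url}/add_document?{urlencode({'PAPI_uri': uri})}"
    # All other parameters are marked [FORM], so they go in the request body.
    form = []
    if stamp is not None:
        form.append(("PAPI_stamp", stamp))
    form.extend((f"PAPI_meta_{name}", value) for name, value in metas)
    for name, content in (parts or {}).items():
        form.append((f"PAPI_part_bytes:{name}", content))
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Papi-Version": "PAPI_v2.0",
    }
    return Request(url, data=urlencode(form).encode("ascii"), headers=headers, method="POST")


req = build_add_document(
    "http://myexaserver:10011/private/connectors/mysource",
    uri="file:///docs/report.txt",
    stamp="2008-03-31",
    metas=[("author", "A"), ("author", "B")],  # hypothetical metadata name
    parts={"master": b"hello world"},          # hypothetical part name
)
print(req.full_url)
print(parse_qsl(req.data.decode("ascii")))
```

For large or binary parts, the multipart/form-data encoding mentioned in the API conventions is likely a better fit than URL-encoding; the sketch uses URL-encoding only to stay short.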
POST add_document_list
Parameter Description
PAPI_<id>:uri [FORM] The URI parameter is directly the string of the document's URI.
PAPI_<id>:stamp [FORM] (optional) The stamp parameter is the string representing the document's Stamp.
PAPI_<id>:meta_<metaname> [FORM] The meta_* parameter is a string containing the value of the metadata referenced by
metaname.
PAPI_<id>:part_bytes:<name> [FORM] The part_bytes parameter is the content of the document's part that is identified by
'name'.
PAPI_<id>:part_ext:<name> [FORM] (optional) The part_ext parameter is the extension hint of the document's part that is identified
by 'name'.
PAPI_<id>:part_mime:<name> [FORM] (optional) The part_mime parameter is the mime hint of the document's part that is identified by
'name'.
Response (default)
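The naming scheme above can be sketched as a function that flattens several documents into form fields. This guide does not say what form <id> takes; using the position of the document in the list is an assumption of this example, as are the helper name document_list_fields and the sample metadata and part names.

```python
def document_list_fields(documents):
    """Flatten documents into PAPI_<id>:... form fields (hypothetical helper).

    Each document is a dict with "uri" and optional "stamp", "metas", "parts".
    ASSUMPTION: <id> is the index of the document in the list.
    """
    fields = []
    for doc_id, doc in enumerate(documents):
        prefix = f"PAPI_{doc_id}:"
        fields.append((prefix + "uri", doc["uri"]))
        if "stamp" in doc:
            fields.append((prefix + "stamp", doc["stamp"]))
        for name, value in doc.get("metas", []):
            fields.append((f"{prefix}meta_{name}", value))
        for name, content in doc.get("parts", {}).items():
            fields.append((f"{prefix}part_bytes:{name}", content))
    return fields


fields = document_list_fields([
    {"uri": "file:///docs/a.txt", "stamp": "1", "metas": [("author", "A")]},
    {"uri": "file:///docs/b.txt", "parts": {"master": b"content"}},
])
for name, value in fields:
    print(name, value)
```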
POST delete_document
Parameter Description
PAPI_uri [URL] The URI parameter is directly the string of the document's uri.
Response (default)
POST delete_document_list
Parameter Description
PAPI_uriList [FORM] The uriList parameter is the list of document URIs, serialized as a string in the following
XML format:
<uriList>
<uri>file://....</uri>
<uri>file://....</uri>
</uriList>
Response (default)
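The uriList value can be produced with a standard XML library, which also takes care of escaping characters such as "&" in URIs. The sketch below is one possible way to do it; the function name serialize_uri_list and the sample URIs are assumptions.

```python
import xml.etree.ElementTree as ET


def serialize_uri_list(uris):
    """Serialize URIs into the <uriList> format expected by PAPI_uriList."""
    root = ET.Element("uriList")
    for uri in uris:
        ET.SubElement(root, "uri").text = uri
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


xml_body = serialize_uri_list(["file:///docs/a.txt", "file:///docs/b&c.txt"])
print(xml_body)
```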
POST delete_document_collection
Parameter Description
PAPI_filter [URL] The filter parameter is directly the string representation of the filter.
For more details, see the filter chapter in the Exalead one:enterprise Push API Guide.
Response An error status indicating an InvalidFilterError occurred can be returned in case the
filter syntax is incorrect.
GET get_document_status
Parameter Description
PAPI_uri [URL] The URI parameter is directly the string of the document's URI.
Description Retrieve from the Indexing System the status of the document specified by the URI
parameter. The structure is serialized and returned in the response body.
Response If successful (status = OK), then the body contains the serialized form of the
DocumentStatus in XML format:
<DocumentStatus>
<uri> ... </uri>
<stamp> ... </stamp>
<status> ... </status>
<message> ... </message>
</DocumentStatus>
Where status can take the following values: EXISTS, MISSING, ERROR
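A response of this shape can be mapped onto a small typed record. The following is a minimal sketch, assuming the body contains exactly the four elements shown above; the class and function names are illustrative and do not come from the Exalead wrappers.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Status values listed in this guide.
VALID_STATUSES = {"EXISTS", "MISSING", "ERROR"}


@dataclass
class DocumentStatus:
    uri: str
    stamp: str
    status: str
    message: str


def parse_document_status(body: str) -> DocumentStatus:
    """Parse a <DocumentStatus> body and validate its status value."""
    root = ET.fromstring(body)
    fields = {
        name: (root.findtext(name) or "").strip()
        for name in ("uri", "stamp", "status", "message")
    }
    if fields["status"] not in VALID_STATUSES:
        raise ValueError(f"unexpected status: {fields['status']!r}")
    return DocumentStatus(**fields)


# Illustrative body; the values are made up for this example.
sample = (
    "<DocumentStatus><uri> file:///docs/a.txt </uri><stamp> 1 </stamp>"
    "<status> EXISTS </status><message></message></DocumentStatus>"
)
print(parse_document_status(sample))
```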
GET get_document_status_list
Parameter Description
PAPI_uriList [FORM] The uriList parameter is the list of document URIs, serialized as a string in the
following XML format:
<uriList>
<uri>file://....</uri>
<uri>file://....</uri>
</uriList>
Description Retrieve from the Indexing System the status of all documents specified by the uriList
parameter. The structure is serialized and returned in the response body.
Response If successful (status = OK), then the body contains the serialized form of the
DocumentStatus[] in XML format:
<DocumentStatusList>
<DocumentStatus>
<uri> ... </uri>
<stamp> ... </stamp>
<status> ... </status>
<message> ... </message>
</DocumentStatus>
...
</DocumentStatusList>
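For a list response, a client often only needs the status of each URI. The sketch below reduces a DocumentStatusList body to a dictionary; the function name statuses_by_uri and the sample values are assumptions.

```python
import xml.etree.ElementTree as ET


def statuses_by_uri(body: str) -> dict:
    """Map each document URI in a <DocumentStatusList> body to its status."""
    root = ET.fromstring(body)
    return {
        (item.findtext("uri") or "").strip(): (item.findtext("status") or "").strip()
        for item in root.findall("DocumentStatus")
    }


# Illustrative body; the values are made up for this example.
sample = """<DocumentStatusList>
  <DocumentStatus>
    <uri>file:///docs/a.txt</uri><stamp>1</stamp>
    <status>EXISTS</status><message></message>
  </DocumentStatus>
  <DocumentStatus>
    <uri>file:///docs/b.txt</uri><stamp></stamp>
    <status>MISSING</status><message></message>
  </DocumentStatus>
</DocumentStatusList>"""
print(statuses_by_uri(sample))
```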
POST set_checkpoint
Parameter Description
PAPI_checkpoint [URL] The checkpoint parameter is directly the string of the checkpoint's value.
PAPI_name [URL] (optional) This optional parameter can be used if you need to manage several checkpoints for a
connector.
Description Set the given checkpoint. If the optional name is given, then the checkpoint
with that name is changed.
Response (default)
GET get_checkpoint
Parameter Description
PAPI_name [URL] (optional) This optional parameter can be used if you need to manage several checkpoints for a
connector.
Description Get the last saved checkpoint. If the optional name is given, then the
checkpoint with that name is returned.
Response If successful (status = OK), then the body contains the serialized form of the checkpoint,
which is directly the string value of the checkpoint.
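Since set_checkpoint and get_checkpoint take their parameters in the URL, a client mostly needs to build the right query string. The sketch below shows one way to do so; the helper names, the base URL and the checkpoint name "crawl" are assumptions made for this example.

```python
from urllib.parse import urlencode

# Base URL reused from the terminology example; adjust to your server.
BASE = "http://myexaserver:10011/private/connectors/mysource"


def set_checkpoint_url(value, name=None):
    """URL for POST set_checkpoint, with the optional checkpoint name."""
    params = {"PAPI_checkpoint": value}
    if name is not None:
        params["PAPI_name"] = name
    return f"{BASE}/set_checkpoint?{urlencode(params)}"


def get_checkpoint_url(name=None):
    """URL for GET get_checkpoint, with the optional checkpoint name."""
    query = f"?{urlencode({'PAPI_name': name})}" if name is not None else ""
    return f"{BASE}/get_checkpoint{query}"


print(set_checkpoint_url("2008-03-31T12:00:00", name="crawl"))
print(get_checkpoint_url("crawl"))
```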
POST clear_all_checkpoints
Response (default)
POST open_document_status_collection_iterator
Parameter Description
PAPI_filter [URL] The filter parameter is directly the string representation of the filter.
Description Open an iterator on a document collection matching the filter given as parameter.
Response If successful (status = OK), then the body contains the serialized form of the IteratorID,
which is directly the string value of the IteratorID (int).
The IteratorID is used to address the correct iterator stored in the server context (in case of
multiple concurrent iterators).
An error status indicating that an InvalidFilterError occurred can be returned if the
filter syntax is incorrect.
POST next_document_status_collection_iterator
Parameter Description
PAPI_iteratorID [URL] The iteratorID parameter is directly the string representation of the integer value.
Description Return the next DocumentStatus available in the enumeration associated to the
iteratorID.
Response If successful (status = OK), then the body contains the serialized form of the
DocumentStatus in XML format:
<DocumentStatus>
<uri> ... </uri>
<stamp> ... </stamp>
<status> ... </status>
<message> ... </message>
</DocumentStatus>
Where status can take the following values: EXISTS, MISSING, ERROR
If the end of the iterator has been reached, then <null/> is returned.
An error status indicating an InvalidIteratorError occurred can be returned in case the
iteratorID is invalid.
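The open, next and close Commands are meant to be used together. The sketch below shows the overall loop as a Python generator that stops on <null/> and always closes the iterator. The transport is injected as a function call(command, params) supplied by the caller; that function, the in-memory fake_call used for the demonstration and the filter string "some-filter" are assumptions, since this guide does not describe the filter syntax.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, Iterator


def iterate_statuses(call: Callable[[str, Dict[str, str]], str],
                     filter_expr: str) -> Iterator[Dict[str, str]]:
    """Yield every DocumentStatus matching filter_expr as a dict.

    call(command, params) is a caller-supplied function that POSTs the
    Command with the given [URL] parameters and returns the response body.
    """
    iterator_id = call("open_document_status_collection_iterator",
                       {"PAPI_filter": filter_expr}).strip()
    try:
        while True:
            body = call("next_document_status_collection_iterator",
                        {"PAPI_iteratorID": iterator_id})
            root = ET.fromstring(body)
            if root.tag == "null":  # end of the iterator
                return
            yield {child.tag: (child.text or "").strip() for child in root}
    finally:
        # Always release the server-side resources, even on error.
        call("close_document_status_collection_iterator",
             {"PAPI_iteratorID": iterator_id})


# In-memory stand-in for the server, for demonstration only.
calls = []
pending = ["<DocumentStatus><uri>file:///a</uri><stamp>1</stamp>"
           "<status>EXISTS</status><message></message></DocumentStatus>"]


def fake_call(command, params):
    calls.append(command)
    if command.startswith("open_"):
        return "7"
    if command.startswith("next_"):
        return pending.pop(0) if pending else "<null/>"
    return ""


results = list(iterate_statuses(fake_call, "some-filter"))
print(results)
```

Injecting the transport keeps the iteration logic independent of the HTTP library and makes it testable without a server.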
POST next_batch_document_status_collection_iterator
Parameter Description
PAPI_iteratorID [URL] The iteratorID parameter is directly the string representation of the integer value.
PAPI_count [URL] The count parameter is directly the string representation of the integer value.
Description Return the next DocumentStatus[] available in the enumeration associated to the
iteratorID.
Response If successful (status = OK), then the body contains the serialized form of the
DocumentStatus[] in XML format:
<DocumentStatusList>
<DocumentStatus>
<uri> ... </uri>
<stamp> ... </stamp>
<status> ... </status>
<message> ... </message>
</DocumentStatus>
...
</DocumentStatusList>
POST close_document_status_collection_iterator
Parameter Description
PAPI_iteratorID [URL] The iteratorID parameter is directly the string representation of the integer value.
Description Release resources allocated on the server side associated to the iteratorID.
Response (default)
POST open_checkpoint_iterator
Parameter Description
N/A
Description Open an iterator on the list of optional names used to create checkpoints.
Response In case of success (status = OK), the body contains the serialized form of the
IteratorID, which is directly the string value of the IteratorID (int).
The IteratorID is used to address the correct iterator stored in the server context (in case of
multiple concurrent iterators).
POST next_checkpoint_iterator
Parameter Description
PAPI_iteratorID [URL] The iteratorID parameter is directly the string representation of the integer value.
Description Return the next checkpoint name available in the enumeration associated to the
iteratorID.
Response In case of success (status = OK), then the body contains the serialized form of the
Checkpoint name in XML format:
If the end of the iterator has been reached, then <null/> is returned.
An error status indicating an InvalidIteratorError occurred can be returned in case the
iteratorID is invalid.
POST next_batch_checkpoint_iterator
Parameter Description
PAPI_iteratorID [URL] The iteratorID parameter is directly the string representation of the integer value.
PAPI_count [URL] The count parameter is directly the string representation of the integer value.
Description Return the next String[] available in the enumeration associated to the iteratorID.
Response In case of success (status = OK), then the body contains the serialized form of the
String[] in XML format:
<CheckpointList>
<Checkpoint> ... </Checkpoint>
...
</CheckpointList>
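A batch of checkpoint names in this format can be read back into a plain list. This is a minimal sketch; the function name parse_checkpoint_list and the sample names are assumptions.

```python
import xml.etree.ElementTree as ET


def parse_checkpoint_list(body: str) -> list:
    """Return the checkpoint names contained in a <CheckpointList> body."""
    root = ET.fromstring(body)
    return [(item.text or "").strip() for item in root.findall("Checkpoint")]


# Illustrative body; the checkpoint names are made up for this example.
sample = """<CheckpointList>
  <Checkpoint> crawl </Checkpoint>
  <Checkpoint> security </Checkpoint>
</CheckpointList>"""
print(parse_checkpoint_list(sample))
```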
POST close_checkpoint_iterator
Parameter Description
PAPI_iteratorID [URL] The iteratorID parameter is directly the string representation of the integer value.
Description Release resources allocated on the server side associated to the iteratorID.
Response (default)
POST create_partial_flush_task
Description Asynchronously create a partial flush (commit) operation on the server and return a
TaskID that can be used to monitor the status of the task.
Response In case of success, returns an integer, which is the TaskID of the created task.
POST get_task_status
Parameter Description
PAPI_taskID [URL] The taskID parameter is directly the string representation of the integer value of the
Task to monitor.
<TaskStatus>
<status>1, 2 or 3</status>
<message> ... </message>
</TaskStatus>
Status table:
1 : RUNNING
2 : FINISHED
3 : ERROR
In case of ERROR, the message field contains the description of the error.
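Monitoring a partial flush can be sketched as parsing the TaskStatus body and polling until the task is no longer RUNNING. In the example below, fetch_status is a caller-supplied function (an assumption of this sketch) that calls get_task_status for a TaskID and returns the XML body; the task ID 12 and the polling interval are arbitrary.

```python
import time
import xml.etree.ElementTree as ET

# Status table from this guide.
TASK_STATES = {1: "RUNNING", 2: "FINISHED", 3: "ERROR"}


def parse_task_status(body: str):
    """Return (state name, message) from a <TaskStatus> body."""
    root = ET.fromstring(body)
    code = int((root.findtext("status") or "").strip())
    return TASK_STATES[code], (root.findtext("message") or "").strip()


def wait_for_task(fetch_status, task_id, interval=1.0):
    """Poll until the task leaves RUNNING; returns (state, message)."""
    while True:
        state, message = parse_task_status(fetch_status(task_id))
        if state != "RUNNING":
            return state, message
        time.sleep(interval)


# Demonstration with canned responses instead of a real server.
bodies = iter([
    "<TaskStatus><status>1</status><message></message></TaskStatus>",
    "<TaskStatus><status>3</status><message>disk full</message></TaskStatus>",
])
outcome = wait_for_task(lambda task_id: next(bodies), 12, interval=0)
print(outcome)
```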
About Exalead
Founded in 2000 by search-engine pioneers, Exalead (www.exalead.com) is a global provider of software that is
designed to simplify all aspects of information search and retrieval for organizations of all sizes. Based on the first and
only unified technology platform for desktop, intranet or Web search, Exalead offers easier deployment,
administration and use than any other enterprise-type search software. This is true whether for one or thousands of
desktops, a small business or global enterprise, and conforms to any technology environment. It also adapts to user
habits for a uniquely satisfying search experience.
Exalead software is used by leading banking and financial services, media, consumer packaged goods, research,
retailing sports entertainment and telecommunications companies around the world, including Air Liquide, BNP
Paribas and Carlson Wagonlit. Exalead is an operating unit of Qualis, an international holding company.
Exalead - France
10 place de la Madeleine
75008 Paris, France
Tel: +33 (0)1 55 35 26 26
Fax: +33 (0)1 55 35 26 27

Exalead - USA
576 Folsom Street - 2nd floor
San Francisco, CA 94105
Tel: +1 (415) 230-3800
Fax: +1 (415) 568-3375

Exalead - Italy
Corso Giuseppe Garibaldi, 86
20121 - MILANO, Italy
Tel: +39 02 62 71 10 10
Fax: +39 02 62 71 10 11

Exalead - United Kingdom
International House
Stanley Bvd, Hamilton
Glasgow G72 0BN
Tel: +44 (0)1698 404630
Fax: +44 (0)1698 404639

Exalead - Germany
Niederlassung Deutschland
Robert-Bosch-Strasse 7
64293 Darmstadt
Tel: +49 6151 35 99 690-0
Fax: +49 6151 35 99 690-35

contact@exalead.com www.exalead.com