camel-issues mailing list archives

From "Derek Abdine (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CAMEL-8149) Support application-generated document identifiers in bulk index requests
Date Fri, 12 Dec 2014 02:05:13 GMT
Derek Abdine created CAMEL-8149:
-----------------------------------

             Summary: Support application-generated document identifiers in bulk index requests
                 Key: CAMEL-8149
                 URL: https://issues.apache.org/jira/browse/CAMEL-8149
             Project: Camel
          Issue Type: Improvement
          Components: camel-elasticsearch
    Affects Versions: 2.14.0
            Reporter: Derek Abdine
             Fix For: 2.15.0


Elasticsearch (via the elasticsearch-java transport client) provides two categories of APIs
to write and read data: individual requests (index, get, delete) and bulk requests.

When performing bulk updates, one creates individual index requests and adds them to the bulk
request. When creating an index request, one can set the source document, id, etc.

The current design of the camel-elasticsearch component controls the transformation and assembly
of an input body (JSON string, byte[], xcontentfactory, map) into an index request, so the caller
never has access to that request. As a result, it is impossible to set the id on an index request
that goes into a bulk action, and the id falls back to the default behavior of the underlying
elasticsearch-java client, which generates a random identifier. This is problematic in situations
where control is needed over the id, e.g. for de-duplication purposes.
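
To illustrate why the random identifier matters for de-duplication, here is a minimal sketch.
It does not use the Elasticsearch client: InMemoryIndex, indexTwice and the "10.0.0.1:22" key
are hypothetical stand-ins, and the only behavior assumed is that indexing with an existing id
replaces the stored document while a missing id gets a random one.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

public class DedupSketch {

    // Hypothetical stand-in for an index; NOT the Elasticsearch client API.
    static final class InMemoryIndex {
        private final Map<String, String> docs = new LinkedHashMap<>();

        // A null id models the default behavior: a random id is generated.
        // A non-null id that already exists overwrites the stored document.
        void index(String id, String source) {
            String effectiveId = (id != null) ? id : UUID.randomUUID().toString();
            docs.put(effectiveId, source);
        }

        int size() {
            return docs.size();
        }
    }

    // Indexes the same document twice and returns how many documents are stored.
    static int indexTwice(boolean useApplicationId) {
        InMemoryIndex index = new InMemoryIndex();
        String source = "{\"host\":\"10.0.0.1\",\"port\":22}";
        String applicationId = "10.0.0.1:22"; // assumed natural key for this document
        for (int i = 0; i < 2; i++) {
            index.index(useApplicationId ? applicationId : null, source);
        }
        return index.size();
    }

    public static void main(String[] args) {
        System.out.println("random ids:      " + indexTwice(false));
        System.out.println("application ids: " + indexTwice(true));
    }
}
```

With random ids the same document is stored twice; with an application-generated id the second
write replaces the first.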

I propose changing the producer to accept elasticsearch-java ActionRequest subclasses in the
message body, so that upstream message processors can control how those requests are
created.
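
The idea can be sketched roughly as follows. This is not the attached patch and not the real
producer code: IndexRequestStub, toRequest and toBulk are hypothetical names standing in for the
elasticsearch-java request classes and the producer's conversion logic, and only String bodies
are handled to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

public class ProducerSketch {

    // Hypothetical stand-in for an index request; a null id means "let the client pick one".
    static final class IndexRequestStub {
        final String id;
        final String source;

        IndexRequestStub(String id, String source) {
            this.id = id;
            this.source = source;
        }
    }

    // Converts one message body into a request.
    static IndexRequestStub toRequest(Object body) {
        if (body instanceof IndexRequestStub) {
            // Proposed behavior: a pre-built request passes through untouched,
            // so the upstream processor keeps control of the id.
            return (IndexRequestStub) body;
        }
        if (body instanceof String) {
            // Existing behavior: the producer builds the request itself, id left unset.
            return new IndexRequestStub(null, (String) body);
        }
        throw new IllegalArgumentException("Unsupported body type: " + body);
    }

    // Assembles a bulk action from a mixed list of bodies, preserving order.
    static List<IndexRequestStub> toBulk(List<?> bodies) {
        List<IndexRequestStub> bulk = new ArrayList<>();
        for (Object body : bodies) {
            bulk.add(toRequest(body));
        }
        return bulk;
    }

    public static void main(String[] args) {
        List<Object> bodies = new ArrayList<>();
        bodies.add(new IndexRequestStub("asset-1", "{\"name\":\"a\"}"));
        bodies.add("{\"name\":\"b\"}");
        for (IndexRequestStub request : toBulk(bodies)) {
            System.out.println(request.id + " -> " + request.source);
        }
    }
}
```

Raw bodies keep working as before, while a request built upstream carries its own id into the
bulk action.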

I've attached a patch and sent a pull request on GitHub.

Thank you!
Derek Abdine



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
