crunch-dev mailing list archives

From Christian Tzolov <christian.tzo...@gmail.com>
Subject Crunch integration with ElasticSearch
Date Mon, 08 Apr 2013 03:32:15 GMT
I've been working on a Crunch-ElasticSearch (http://www.elasticsearch.org/)
integration over the weekend :)

Here is my first prototype:
https://github.com/tzolov/elasticsearch-hadoop#crunch and a sample
application: http://bit.ly/Y7lasW.

It implements an ES Source and Target on top of ES-Hadoop's (
https://github.com/elasticsearch/elasticsearch-hadoop) ESInputFormat and
ESOutputFormat.

I'm not sure, though, what the best/right way is to build Sources/Targets
for new Input/Output Formats. Any suggestions or references?

Writing to ES is tricky and at the moment looks more like a hack (see the
doc).

Cheers
Chris

P.S. The prototype doesn't support AvroTypeFamily yet, but I've been looking
at a jackson-dataformat-avro kind of solution (ES-Hadoop relies on Jackson
for the JSON serialisation).
