flume-user mailing list archives

From Sanjay Ramanathan <sanjay.ramanat...@lucidworks.com>
Subject Query regarding readMultiLine in Morphlines config
Date Tue, 15 Jul 2014 23:36:37 GMT

I have a log file with multiple records (1 line = 1 record).

I want to send N lines (say 20) at a time to Morphlines, and then send the result to Solr as
a single Solr document.

(This is an experiment to see whether performance is better than the regular approach of using
readLine and parsing each log line as a separate Solr document.)

The number of documents is going to be in the billions.

I had a look at the readMultiLine documentation here: http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#/readMultiLine

I would like to know how to use readMultiLine effectively (if it is possible at all) to pick
up 20 lines/records in one go and create 20 fields, each holding the text of one line (for
example by using a counter within the regex, or something similar).
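
To make the desired output concrete, here is a minimal sketch in plain Python of the grouping
I have in mind. It is not Morphlines configuration and does not use readMultiLine; the function
name batch_lines and the field names line_1 ... line_20 are hypothetical and only illustrate
the shape of the record that would go to Solr.

```python
from itertools import islice


def batch_lines(lines, batch_size=20):
    """Group an iterable of log lines into one record per batch.

    Each record is a dict with hypothetical field names line_1..line_N,
    where N is at most batch_size (the final batch may be shorter).
    """
    it = iter(lines)
    while True:
        # Take up to batch_size lines from the input.
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        # One field per line, numbered from 1, with the trailing newline removed.
        yield {
            f"line_{i}": text.rstrip("\n")
            for i, text in enumerate(chunk, start=1)
        }
```

With 45 input lines and a batch size of 20, this produces three records: two with 20 fields
each and a last one with the remaining 5.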

Kindly let me know if you have worked on something similar, or point me to some informative
pages covering a similar problem.


Sanjay Ramanathan
