incubator-chukwa-user mailing list archives

From Guillermo Pérez <>
Subject Re: Directly create chukwa records?
Date Fri, 26 Feb 2010 12:43:02 GMT
I see. I'm changing the approach, then, to use backfill and the
regular demux for converting the log to records. What I want to skip
is the agent/collector part, since I will be using syslog-ng for
gathering data. I thought that if the data is more or less in order, I
could skip the map/reduce part and insert ChukwaRecords directly into
HDFS, but if that would be slower it makes no sense.

The backfill works pretty well; it renames files, so I can easily tell
when I can remove them from the local disk.

One related thing is that I want to change the "cluster" where we put
the files, because we will receive syslog data with several types of
events that we want to store in different clusters so we can analyze,
back up, and archive them separately. I have seen that you can modify
Record.tagsField and that a regexp is used to extract the destination
cluster. This is a bit awkward, isn't it? I don't want to keep a
tagsField just for that. I'm using a field "event_type" and I have
modified the extraction/engine/, so if that field
exists, "event_" + <event_type> will be used as the cluster. Is this
the proper way to go, or is there a better solution?
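
A minimal sketch of the lookup described above, not the actual Chukwa
code: the class ClusterResolver, the method resolve, the plain Map used
in place of a record's fields, the cluster="..." tag format and the
"undefined" default are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; a plain Map stands in for a record's fields.
class ClusterResolver {
    // Assumed tag format: cluster="name" somewhere in the tags string.
    private static final Pattern CLUSTER_TAG =
            Pattern.compile("cluster=\"(.*?)\"");

    // Assumed fallback when neither source yields a cluster name.
    static final String DEFAULT_CLUSTER = "undefined";

    static String resolve(Map<String, String> fields, String tags) {
        // Preferred path: derive the cluster from the event_type field.
        String eventType = fields.get("event_type");
        if (eventType != null && !eventType.isEmpty()) {
            return "event_" + eventType;
        }
        // Fallback: extract the cluster from the tags with a regexp.
        if (tags != null) {
            Matcher m = CLUSTER_TAG.matcher(tags);
            if (m.find()) {
                return m.group(1);
            }
        }
        return DEFAULT_CLUSTER;
    }

    public static void main(String[] args) {
        String tags = "cluster=\"demo\"";
        // event_type present: it wins over the tags.
        System.out.println(resolve(Map.of("event_type", "auth"), tags));
        // event_type absent: fall back to the tags regexp.
        System.out.println(resolve(Map.of(), tags));
    }
}
```

With this ordering, records without an event_type still land in the
cluster named by their tags, so existing data sources would keep their
current behaviour.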

Another question: where should I start looking to learn how to build
reports and aggregated results from the custom ChukwaRecords I'm

Guille -ℬḭṩḩø- <>
