incubator-chukwa-user mailing list archives

From AD <straightfl...@gmail.com>
Subject Re: piping data into Cassandra
Date Sat, 29 Oct 2011 15:39:39 GMT
With the imminent new trunk (0.5) getting wired into HBase, does it make
sense for me to keep the Demux parser as the place to put the logic for
writing to Cassandra?  Or does it make sense to implement a version
of src/java/org/apache/hadoop/chukwa/datacollection/writer/hbase/HbaseWriter.java
for Cassandra so that the collector pushes the data straight to it?

If I want to use both HDFS and Cassandra, it seems the current pipeline
config would support this with something like

<property>
  <name>chukwaCollector.pipeline</name>
  <value>org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter,org.apache.hadoop.chukwa.datacollection.writer.cassandra.CassandraWriter</value>
</property>

Thoughts?
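
To make the second option concrete, here is a minimal sketch of a pass-through pipeline stage. It assumes nothing about Chukwa's real writer API: the Stage interface, the StoringStage class, and the in-memory list are hypothetical stand-ins. A real CassandraWriter would presumably follow whatever contract HbaseWriter implements and issue Cassandra inserts where this sketch appends to a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PipelineSketch {

    /** Hypothetical stage interface; a stand-in, not Chukwa's real writer API. */
    public interface Stage {
        void add(List<String> chunks);
    }

    /**
     * A stage that stores each chunk and then forwards the batch to the next
     * stage, if there is one. The in-memory list is a placeholder for the
     * backing store (Cassandra, HDFS, ...).
     */
    public static class StoringStage implements Stage {
        public final String name;
        public final List<String> stored = new ArrayList<>();
        private final Stage next; // null when this is the last stage

        public StoringStage(String name, Stage next) {
            this.name = name;
            this.next = next;
        }

        @Override
        public void add(List<String> chunks) {
            stored.addAll(chunks);      // placeholder for the write to the store
            if (next != null) {
                next.add(chunks);       // hand the same batch to the next stage
            }
        }
    }

    public static void main(String[] args) {
        // Build the chain back to front: the "cassandra" stage feeds the "hdfs" stage.
        StoringStage hdfs = new StoringStage("hdfs", null);
        StoringStage cassandra = new StoringStage("cassandra", hdfs);

        cassandra.add(Arrays.asList("GET /index.html", "GET /about.html"));

        System.out.println(cassandra.name + " stored " + cassandra.stored.size());
        System.out.println(hdfs.name + " stored " + hdfs.stored.size());
    }
}
```

Each stage does its own write and then hands the same batch on, so a stage that writes to Cassandra need not prevent a later stage from writing somewhere else.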


On Wed, Oct 26, 2011 at 10:16 PM, AD <straightflush@gmail.com> wrote:

> Yep, that did it.  I just updated my initial_adaptors to have dataType
> TsProcessor and saw demux kick in.
>
> Thanks for the help.
>
>
>
> On Wed, Oct 26, 2011 at 9:22 PM, Eric Yang <eric818@gmail.com> wrote:
>
>> See: http://incubator.apache.org/chukwa/docs/r0.4.0/agent.html and
>> http://incubator.apache.org/chukwa/docs/r0.4.0/programming.html
>>
>> The configuration is the same for collector-based demux.  Hope this
>> helps.
>>
>> regards,
>> Eric
>>
>> On Oct 26, 2011, at 4:20 PM, AD wrote:
>>
>> > Thanks.  Sorry for being dense here, but where does the data type get
>> > mapped from the agent to the collector when passing data so that demux will
>> > match?
>> >
>> > On Wed, Oct 26, 2011 at 12:34 PM, Eric Yang <eric818@gmail.com> wrote:
>> > "dp" serves two functions: first, it loads data into MySQL; second, it
>> > runs SQL for aggregated views.  demuxOutputDir_* is created if the demux
>> > MapReduce job produces data.  Hence, make sure that there is a demux processor
>> > mapped to your data type for the extraction process in
>> > chukwa-demux-conf.xml.
>> >
>> > regards,
>> > Eric
>> >
>> > On Oct 26, 2011, at 5:15 AM, AD wrote:
>> >
>> > > Hmm, I am running bin/chukwa demux and I don't have anything past
>> > > dataSinkArchives; there is no directory named demuxOutputDir_*.
>> > >
>> > > Also, isn't dp an aggregate view?  I need to parse the Apache logs to do
>> > > custom reports on things like remote_host, query strings, etc., so I was
>> > > hoping to parse the raw record, load it into Cassandra, and run M/R there
>> > > to do the aggregate views.  I thought a new version of TSProcessor was the
>> > > right place for this, but I could be wrong.
>> > >
>> > > Thoughts?
>> > >
>> > >
>> > >
>> > > If not, how do you write a custom postProcessor?
>> > >
>> > > On Wed, Oct 26, 2011 at 12:57 AM, Eric Yang <eric818@gmail.com> wrote:
>> > > Hi AD,
>> > >
>> > > Data is stored in demuxOutputDir_* by demux, and there is a
>> > > PostProcessorManager (bin/chukwa dp) which monitors the postProcess
>> > > directory and loads data into MySQL.  For your use case, you will need to
>> > > modify PostProcessorManager.java to adapt it to your needs.  Hope this
>> > > helps.
>> > >
>> > > regards,
>> > > Eric
>> > >
>> > > On Tue, Oct 25, 2011 at 6:34 PM, AD <straightflush@gmail.com> wrote:
>> > > > hello,
>> > > >  I currently push Apache logs into Chukwa.  I am trying to figure out how to
>> > > > get all those logs into Cassandra and run MapReduce there.  Is the best
>> > > > place to do this in Demux (write my own version of TSProcessor)?
>> > > >  Also, the data flow seems to miss a step.  The
>> > > > page http://incubator.apache.org/chukwa/docs/r0.4.0/dataflow.html says in
>> > > > 3.3 that
>> > > >  - demux moves complete files to: dataSinkArchives/[yyyyMMdd]/*/*.done
>> > > >  - the next step is to move files from
>> > > >    postProcess/demuxOutputDir_*/[clusterName]/[dataType]/[dataType]_[yyyyMMdd]_[HH].R.evt
>> > > >  How do they get from dataSinkArchives to postProcess?  Does this run
>> > > > inside of DemuxManager or in a separate process (bin/chukwa demux)?
>> > > >  Thanks
>> > > >  AD
>> > >
>> >
>> >
>>
>>
>
