chukwa-dev mailing list archives

From Eric Yang <ey...@yahoo-inc.com>
Subject Re: PipelineStageWriter doesn't work as expected
Date Fri, 18 Dec 2009 17:16:29 GMT
Correction: the data has been written to HDFS correctly.  Data were stuck at
the post data processing stage because the postProcess program crashed.  I
still need to determine the cause of the postProcess crash.  I think the
modified SeqFileWriter does what I wanted, and I will implement next.add() to
ensure the ordering can be interchanged.
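For illustration, the forwarding idea above can be sketched roughly as follows.  This is a minimal, self-contained mock, not the actual Chukwa API: the interfaces are simplified and chunk payloads are reduced to strings, so class and method names here (beyond setNextStage/add/next) are hypothetical stand-ins.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for Chukwa's PipelineableWriter contract.
interface PipelineableWriter {
    void setNextStage(PipelineableWriter next);
    void add(String chunk); // real Chukwa writers take a List<Chunk>
}

// Stand-in for a SeqFileWriter-like stage: persists the chunk, then
// forwards it to the next stage so ordering becomes interchangeable.
class FileStageWriter implements PipelineableWriter {
    final List<String> written = new ArrayList<>(); // stands in for HDFS output
    private PipelineableWriter next;

    public void setNextStage(PipelineableWriter next) { this.next = next; }

    public void add(String chunk) {
        written.add(chunk);                // 1. write the chunk locally first
        if (next != null) next.add(chunk); // 2. then forward downstream
    }
}

// Stand-in for a SocketTeeWriter-like stage: mirrors the chunk to
// subscribers and likewise forwards it.
class TeeStageWriter implements PipelineableWriter {
    final List<String> seen = new ArrayList<>(); // stands in for socket subscribers
    private PipelineableWriter next;

    public void setNextStage(PipelineableWriter next) { this.next = next; }

    public void add(String chunk) {
        seen.add(chunk);                   // mirror (not filter) the data
        if (next != null) next.add(chunk); // pass it along unchanged
    }
}
```

With both stages forwarding unconditionally, chaining file-then-tee or tee-then-file delivers every chunk to both writers, which is the interchangeable ordering described above.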

Regards,
Eric

On 12/18/09 8:59 AM, "Eric Yang" <eyang@yahoo-inc.com> wrote:

> I'd like to tee the incoming data.  One writer goes into HDFS, and another
> writer enables real-time pub/sub monitoring of the data.  In my case, the
> data are mirrored, not filtered.  However, I am not getting the right
> result: it seems the data isn't getting written into HDFS regardless of
> the ordering of the writers.
> 
> Regards,
> Eric
> 
> On 12/17/09 9:53 PM, "Ariel Rabkin" <asrabkin@gmail.com> wrote:
> 
>> What's the use case for this?
>> 
>> The original motivation for pipelined writers was so that we could do
>> things like filtering before data got written.  Then it occurred to me
>> that SocketTeeWriter fit fairly naturally into a pipeline.
>> 
>> Putting it "after" seq file writer wouldn't be too bad --
>> SeqFileWriter.add() would need to call next.add().  But I would be
>> hesitant to commit that change, without a really clear use case.
>> 
>> --Ari
>> 
>> On Thu, Dec 17, 2009 at 8:39 PM, Eric Yang <eyang@yahoo-inc.com> wrote:
>>> It works fine after I put SocketTeeWriter first.  What needs to be
>>> implemented in SeqFileWriter to be able to pipe correctly?
>>> 
>>> Regards,
>>> Eric
>>> 
>>> On 12/17/09 5:26 PM, "asrabkin@gmail.com" <asrabkin@gmail.com> wrote:
>>> 
>>>> Put the SocketTeeWriter first.
>>>> 
>>>> sent from my iPhone; please excuse typos and brevity.
>>>> 
>>>> On Dec 17, 2009, at 8:12 PM, Eric Yang <eyang@yahoo-inc.com> wrote:
>>>> 
>>>>> Hi all,
>>>>> 
>>>>> I'd set up SocketTeeWriter by itself, with data streaming to the next
>>>>> socket reader program.  Then I tried to configure two writers, i.e.,
>>>>> SeqFileWriter followed by SocketTeeWriter.  It doesn't work because
>>>>> SeqFileWriter doesn't extend PipelineableWriter.  I went ahead and
>>>>> extended SeqFileWriter as PipelineableWriter, implemented the
>>>>> setNextStage method, and configured the collector with:
>>>>> 
>>>>>  <property>
>>>>>    <name>chukwaCollector.writerClass</name>
>>>>>    <value>org.apache.hadoop.chukwa.datacollection.writer.PipelineStageWriter</value>
>>>>>  </property>
>>>>> 
>>>>>  <property>
>>>>>    <name>chukwaCollector.pipeline</name>
>>>>>    <value>org.apache.hadoop.chukwa.datacollection.writer.SeqFileWriter,org.apache.hadoop.chukwa.datacollection.writer.SocketTeeWriter</value>
>>>>>  </property>
>>>>> 
>>>>> SeqFileWriter writes the data correctly, but when connecting to
>>>>> SocketTeeWriter, there was no data visible in SocketTeeWriter.
>>>>> Commands work fine, but data streaming doesn't happen.  How do I
>>>>> configure the collector and PipelineStageWriter to write data to
>>>>> multiple writers?  Is there something in SeqFileWriter that could
>>>>> prevent this from working?
>>>>> 
>>>>> Regards,
>>>>> Eric
>>>>> 
>>> 
>>> 
>> 
>> 
> 

