hadoop-common-user mailing list archives

From cliff palmer <palmercl...@gmail.com>
Subject Re: Reopen and append to SequenceFile
Date Fri, 20 Aug 2010 10:29:46 GMT
You may want to consider using something like the *nix `tee` command to save a
copy of each message in a "log" directory.  A periodic job (or a collection
tool like Flume) would then load the logged messages into sequence files.
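As a rough sketch of the tee-based logging step (the log directory, file naming, and the sample message are all hypothetical stand-ins, and the downstream consumer is omitted):

```shell
# Hypothetical: the consumer emits each AMQP message on stdout.
# `tee -a` appends a copy to a per-day log file while passing the
# message downstream unchanged; a periodic job can later bundle
# the accumulated log files into SequenceFiles.
LOGDIR=/tmp/msglog            # stand-in for the "log" directory
mkdir -p "$LOGDIR"
echo "sample message" | tee -a "$LOGDIR/messages-$(date +%Y%m%d).log"
```

Because each writer only appends to a plain log file, many distributed consumers can log independently without holding a SequenceFile writer open.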

On Fri, Aug 20, 2010 at 3:32 AM, skantsoni <Shashikant_Soni@mindtree.com> wrote:

> Hi, I am fairly new to Hadoop and HDFS and am trying to do the following:
> 1. Consume messages published by a system over AMQP.
> 2. Write these to a SequenceFile as <Text, Text> key/value pairs.
> Periodically these files would be consumed by another system to generate
> reports.
> The problem is that our consuming system is distributed and runs across
> multiple machines, and I cannot keep a writer on a SequenceFile open for a
> long time to keep appending. I want to open the file and then close it for
> each message I receive (I don't know if this is the correct approach for
> HDFS). But once I close the writer, I cannot reopen it to append. I saw a
> few threads talking about merging these files, but I felt that may be an
> overhead.
> I feel I am missing something about the fundamental usage of sequence
> files, or perhaps there is another way to do this. Can someone please point
> me in the right direction?
> Thanks in advance
> --
> View this message in context:
> http://old.nabble.com/Reopen-and-append-to-SequenceFile-tp29489425p29489425.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
