Date: Mon, 6 Feb 2012 09:06:00 -0500
From: David Sinclair
To: common-user@hadoop.apache.org
Cc: bejoy.hadoop@gmail.com
Subject: Re: Can I write to a compressed file which is located in HDFS?

Hi,

You may want to have a look at the Flume project from Cloudera. I use it
for writing data into HDFS.

https://ccp.cloudera.com/display/SUPPORT/Downloads
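In case it helps, here is a rough sketch of a Flume agent that tails an
application log and writes gzip-compressed files into HDFS. This assumes
the Flume NG (1.x) properties format, which differs from the older 0.9.x
releases, and the agent/source/sink names and paths are made up:

# hypothetical agent named agent1
agent1.sources = tail1
agent1.channels = mem1
agent1.sinks = hdfs1

# exec source tailing an application log (path is an example)
agent1.sources.tail1.type = exec
agent1.sources.tail1.command = tail -F /var/log/myapp/app.log
agent1.sources.tail1.channels = mem1

# simple in-memory channel between source and sink
agent1.channels.mem1.type = memory

# HDFS sink writing a gzip-compressed stream, rolled hourly
agent1.sinks.hdfs1.type = hdfs
agent1.sinks.hdfs1.channel = mem1
agent1.sinks.hdfs1.hdfs.path = /logs/myapp
agent1.sinks.hdfs1.hdfs.fileType = CompressedStream
agent1.sinks.hdfs1.hdfs.codeC = gzip
agent1.sinks.hdfs1.hdfs.rollInterval = 3600

Exact property names and supported codecs vary by release, so check the
HDFS sink documentation for the version you download.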
dave

2012/2/6 Xiaobin She

> hi Bejoy,
>
> thank you for your reply.
>
> actually I have set up a test cluster which has one namenode/jobtracker
> and two datanodes/tasktrackers, and I have run a test on this cluster.
>
> I fetch the log file of one of our modules from the log collector
> machines by rsync, and then I use the hive command line tool to load
> this log file into the hive warehouse, which simply copies the file
> from the local filesystem to hdfs.
>
> And I have run some analysis on this data with hive; all of this works
> well.
>
> But now I want to avoid the fetch step which uses rsync, and write the
> logs into hdfs files directly from the servers which generate these
> logs.
>
> And it seems easy to do this job if the file located in hdfs is not
> compressed.
>
> But how do I write or append logs to a file that is compressed and
> located in hdfs?
>
> Is this possible?
>
> Or is this bad practice?
>
> Thanks!
>
>
>
> 2012/2/6
>
> > Hi
> >
> > If you have enough log files to fill at least one block size in an
> > hour, you can go ahead as follows:
> > - run a scheduled job every hour that compresses the log files for
> > that hour and stores them in hdfs (you can use LZO or even Snappy to
> > compress)
> > - if your hive jobs do more frequent analysis on this data, store it
> > as PARTITIONED BY (date, hour). While loading into hdfs, also follow
> > a matching directory/sub-directory structure. Once the data is in
> > hdfs, issue an ALTER TABLE ... ADD PARTITION statement on the
> > corresponding hive table.
> > - in the Hive DDL use the appropriate input format (Hive already
> > ships with some Apache log input formats)
> >
> >
> > Regards
> > Bejoy K S
> >
> > From handheld, please excuse typos.
> >
> > -----Original Message-----
> > From: Xiaobin She
> > Date: Mon, 6 Feb 2012 16:41:50
> > To: ; Xiaobin She
> > Reply-To: common-user@hadoop.apache.org
> > Subject: Re: Can I write to a compressed file which is located in
> > hdfs?
> >
> > sorry, this sentence is wrong,
> >
> > I can't compress these logs every hour and then put them into hdfs.
> >
> > it should be
> >
> > I can compress these logs every hour and then put them into hdfs.
> >
> >
> >
> >
> > 2012/2/6 Xiaobin She
> >
> > >
> > > hi all,
> > >
> > > I'm testing hadoop and hive, and I want to use them in log
> > > analysis.
> > >
> > > Here I have a question: can I write/append logs to a compressed
> > > file which is located in hdfs?
> > >
> > > Our system generates lots of log files every day; I can't compress
> > > these logs every hour and then put them into hdfs.
> > >
> > > But what if I want to write logs into files that are already in
> > > the hdfs and are compressed?
> > >
> > > If these files were not compressed, then this job would seem easy,
> > > but how do I write or append logs into a compressed log?
> > >
> > > Can I do that?
> > >
> > > Can anyone give me some advice or give me some examples?
> > >
> > > Thank you very much!
> > >
> > > xiaobin
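To make the write path concrete: with the plain Hadoop Java API,
producing a compressed file in HDFS only requires wrapping the output
stream in a codec. Below is a minimal sketch (the class name and paths
are invented); note that HDFS of this vintage has no dependable append,
so the usual pattern is to roll a brand-new compressed file per hour, as
Bejoy describes, rather than appending to an existing one:

import java.io.OutputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.util.ReflectionUtils;

public class CompressedHdfsWriteSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // GzipCodec ships with Hadoop; LZO or Snappy codecs plug in
        // through the same CompressionCodec interface.
        CompressionCodec codec =
                ReflectionUtils.newInstance(GzipCodec.class, conf);

        // One file per (date, hour), matching the partition layout
        // suggested above. The path here is hypothetical.
        Path path = new Path("/logs/mymodule/2012-02-06/09/part-0000"
                + codec.getDefaultExtension()); // ".gz" for gzip

        // create() always starts a fresh file; everything written
        // through the codec stream lands in HDFS compressed.
        OutputStream out = codec.createOutputStream(fs.create(path));
        out.write("one log line\n".getBytes("UTF-8"));
        out.close(); // finishes the gzip stream and closes the file
    }
}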