hbase-user mailing list archives

From Marc Limotte <mslimo...@gmail.com>
Subject Re: HBase Bulk Load script
Date Tue, 28 Dec 2010 01:07:01 GMT
Lars, Todd,

Thanks for the info.  If I understand correctly, the importtsv command line
tool does not compress its output by default and there is no command line
switch to enable it, but I can modify the source at
hbase-0.89.20100924+28/src/main/java/org/apache/hadoop/hbase/mapreduce/ImportTsv.java
to call FileOutputFormat.setCompressOutput()/setOutputCompressorClass() on the
Job in order to turn on compression.
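Something along these lines, as an untested sketch (the class and method
names below are just placeholders, and GzipCodec is only an example codec):

import org.apache.hadoop.io.compress.GzipCodec;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Untested sketch of the change described above, assuming access to the
// org.apache.hadoop.mapreduce.Job built during ImportTsv's job setup.
public class CompressedImportTsvPatch {
  static void enableOutputCompression(Job job) {
    // Turn on compression for the job's file output...
    FileOutputFormat.setCompressOutput(job, true);
    // ...and pick a codec; gzip here is only an illustrative choice.
    FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
  }
}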

Does that sound right?

Marc


On Thu, Dec 23, 2010 at 2:34 PM, Todd Lipcon <todd@cloudera.com> wrote:

> You beat me to it, Lars! Was writing a response when some family arrived
> for
> the holidays, and when I came back, you had written just what I had started
> :)
>
> On Thu, Dec 23, 2010 at 1:51 PM, Lars George <lars.george@gmail.com>
> wrote:
>
> > live ones and then moved into place from their temp location. Not sure
> > what happens if the local cluster has no /hbase etc.
> >
> > Todd, could you help here?
> >
>
> Yep, there is a code path where if the HFiles are on a different
> filesystem,
> it will copy them to the HBase filesystem first. It's not very efficient,
> though, so it's probably better to distcp them to the local cluster first.
>
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
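Thanks -- so before running the bulk load I'd distcp the HFiles over to the
local cluster first, something like the following (the namenode hosts and
paths here are just placeholders):

hadoop distcp hdfs://source-nn:8020/user/marc/hfile-output hdfs://hbase-nn:8020/user/marc/hfile-output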
