hbase-user mailing list archives

From Marc Limotte <mslimo...@gmail.com>
Subject HBase Bulk Load script
Date Thu, 23 Dec 2010 21:12:00 GMT
Hi,

I'm using the HBase Bulk Loader
<http://archive.cloudera.com/cdh/3/hbase/bulk-loads.html> with 0.89.  Very
easy to use.  I have a few questions:

1) It seems importtsv will only accept one family at a time. It shows some
sort of security access error if I give it a column list with columns from
different families.  Is this a limitation of the bulk loader, or is it a
consequence of some security configuration somewhere?

2) Does the bulk load process respect the HBase family's compression
setting?  If not, is there a way to trigger the compression after the fact
(major compaction, for example)?

3) Am I correct in thinking that the importtsv step can run on a separate
cluster from the HBase cluster (assuming you have an HBase client config and
libraries)?  And if so, for the completebulkload step, will I need to
manually copy the output of importtsv to the HBase cluster's HDFS?  Or can I
provide a remote HDFS path, or even an S3 path, to the completebulkload
program?

Thanks for providing this tool.

Marc
