hbase-user mailing list archives

From Vidhyashankar Venkataraman <vidhy...@yahoo-inc.com>
Subject Re: Ruby Bulk Load tool in 0.90
Date Thu, 13 Jan 2011 19:00:17 GMT
>> Nicolas actually did the multi-column-family patch for trunk a few weeks
>> ago, so no need to upload that patch.
That's great!

>> If you want to have a go that would be great!
Yes, I can take a shot at it. By "create these file boundaries", did you mean creating ZK
state and a meta table entry for these boundaries? Because after that, the incremental load
and the bulk load become the same.

Thank you

On 1/13/11 10:30 AM, "Todd Lipcon" <todd@cloudera.com> wrote:

Hey Vidhya,

Nicolas actually did the multi-column-family patch for trunk a few weeks
ago, so no need to upload that patch. Hopefully his will work for you.

I'd love to see the Ruby script removed and the
LoadIncrementalHFiles/completebulkload tool improved to also support loading into new
tables. It should be pretty simple - if the table doesn't exist, scan the
hfiles to find their boundaries, create a table with those boundaries, and
then treat it like the incremental case. If you want to have a go that would
be great!
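
To illustrate the "scan the hfiles to find their boundaries" step, here is a minimal sketch in Python. It is not the LoadIncrementalHFiles code and uses no HBase API. It assumes the first and last row key of each HFile have already been read, and the names `KeyRange` and `infer_split_keys` are hypothetical. Files whose key ranges overlap are grouped into one region, and each later group's first key becomes a split key:

```python
from typing import List, Optional, Tuple

# Hypothetical type: (first_row_key, last_row_key) of one HFile, both inclusive.
KeyRange = Tuple[bytes, bytes]


def infer_split_keys(hfile_ranges: List[KeyRange]) -> List[bytes]:
    """Return sorted split keys so that no HFile straddles a region boundary.

    The first region implicitly starts at the empty key, so only the start
    keys of the second and later groups of files are returned.
    """
    split_keys: List[bytes] = []
    current_end: Optional[bytes] = None  # largest last key in the current group

    # Sorting by first key lets us sweep the ranges from left to right.
    for first, last in sorted(hfile_ranges):
        if first > last:
            raise ValueError(f"invalid key range: {first!r} > {last!r}")
        if current_end is None:
            # First file: it belongs to the first region.
            current_end = last
        elif first > current_end:
            # Gap between groups: a new region can start at this file's first key.
            split_keys.append(first)
            current_end = last
        else:
            # Overlaps the current group: keep it in the same region.
            current_end = max(current_end, last)
    return split_keys


if __name__ == "__main__":
    ranges = [(b"m", b"r"), (b"a", b"f"), (b"e", b"k"), (b"s", b"z")]
    print(infer_split_keys(ranges))
```

The resulting keys would then be passed as split points when creating the table, after which the load can proceed like the incremental case.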


On Thu, Jan 13, 2011 at 8:53 AM, Vidhyashankar Venkataraman <
vidhyash@yahoo-inc.com> wrote:

> So I was using 0.89 till last week and the tables created/bulk loaded just
> fine with the script (one can only use the ruby script while creating the
> table) and yesterday was when I ran our system creating a table from
> scratch.
> I have to fix this thing anyways very soon so do let me know if you want me
> to take a stab at it. And thanks for creating the jira.
> Oh, and I have my custom patch that I use for our current system for
> supporting multiple column families in bulk loads which is a little too
> non-generic to be submitted as a patch for the open source. Let me clean
> that up as well and upload it soon.
> Cheers,
> V
> On 1/12/11 10:03 PM, "Stack" <stack@duboce.net> wrote:
> I think you are right Vidhya.  The new master has a different
> mechanism for assigning regions so the load_table.rb won't work with the new
> master (Let me clean out the load_table.rb in 0.90 -- I filed
> HBASE-3440 to fix).  It looks like the completebulkload from
> http://hbase.apache.org/docs/r0.89.20100924/bulk-loads.html should
> work.  It loads into pre-existing regions (I wonder why you've not
> been using this script anyways?)
> Good on you Vidhya,
> St.Ack
> On Wed, Jan 12, 2011 at 5:25 PM, Vidhyashankar Venkataraman
> <vidhyash@yahoo-inc.com> wrote:
> > I guess the master doesn't scan META periodically hence skips doing
> > anything with the updated META table.
> >
> > The ruby bulk load tool then needs some repair (the tool should write ZK
> > state for the new regions?).
> >
> >
> > On 1/12/11 4:40 PM, "Vidhyashankar Venkataraman" <vidhyash@yahoo-inc.com>
> > wrote:
> >
> > Is load_table.rb deprecated in 0.90?
> >
> > I was trying to use load_table.rb to create a new table and bulk load
> > files into it. It worked partly in the sense that the META table got
> > populated, the files were moved to the appropriate location, but the server
> > assignment did not happen until I restarted HBase. Is this a consequence of
> > the master rewrite?
> >
> > V
> >
> >

Todd Lipcon
Software Engineer, Cloudera
