hbase-user mailing list archives

From Vidhyashankar Venkataraman <vidhy...@yahoo-inc.com>
Subject Re: Ruby Bulk Load tool in 0.90
Date Thu, 13 Jan 2011 19:10:20 GMT
>> I was figuring you can use the existing HTable create API that takes a list
>> of boundaries.
Oh great! Then this is pretty straightforward. I will send a patch soon on this for a review.
I need this for my system asap anyways.

V

On 1/13/11 11:05 AM, "Todd Lipcon" <todd@cloudera.com> wrote:

On Thu, Jan 13, 2011 at 11:00 AM, Vidhyashankar Venkataraman <
vidhyash@yahoo-inc.com> wrote:

> >> Nicolas actually did the multi-column-family patch for trunk a few weeks
> >> ago, so no need to upload that patch.
> That's great!
>
> >> If you want to have a go that would be great!
> Yes, I can take a shot at it. By "create these file boundaries", did you
> mean creating ZK state and a META table entry for these boundaries? After
> that, incremental load and bulk load become the same.
>

I was figuring you can use the existing HTable create API that takes a list
of boundaries. Then you don't need to deal with ZK or META manually in any
way, and if any of that stuff changes you'll be using a supported public
API.

-Todd
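
The boundary step discussed above can be sketched independently of HBase. This is a minimal illustration, not the code from any patch: it assumes each HFile's first and last row keys have already been read, and the function name `split_keys_from_hfiles` is hypothetical. The resulting list is what would be handed to a create-table call that accepts split keys (in the Java client I believe that call is on HBaseAdmin rather than HTable, but treat that as an assumption).

```python
from typing import List, Tuple


def split_keys_from_hfiles(ranges: List[Tuple[bytes, bytes]]) -> List[bytes]:
    """Derive region split keys from per-HFile (first_key, last_key) pairs.

    Hypothetical helper: one region per HFile, so every file except the
    first (in key order) contributes its first key as a split point.
    """
    # Sort by first key so adjacent files are neighbours in key space.
    ordered = sorted(ranges)

    # Files that overlap cannot each map onto their own region.
    for (_, prev_last), (next_first, _) in zip(ordered, ordered[1:]):
        if next_first <= prev_last:
            raise ValueError("HFile key ranges overlap")

    # The first region starts at the empty key, so skip the first file.
    return [first for first, _ in ordered[1:]]


if __name__ == "__main__":
    files = [(b"m", b"r"), (b"a", b"f"), (b"s", b"z")]
    print(split_keys_from_hfiles(files))
```

Once a table exists with these boundaries, every HFile falls inside exactly one region, which is the precondition the incremental load path relies on.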


> On 1/13/11 10:30 AM, "Todd Lipcon" <todd@cloudera.com> wrote:
>
> Hey Vidhya,
>
> Nicolas actually did the multi-column-family patch for trunk a few weeks
> ago, so no need to upload that patch. Hopefully his will work for you.
>
> I'd love to see the ruby script removed and the
> LoadIncrementalHFiles/completebulkload tool improved to also support
> loading into new tables. It should be pretty simple - if the table doesn't
> exist, scan the hfiles to find their boundaries, create a table with those
> boundaries, and then treat it like the incremental case. If you want to
> have a go that would be great!
>
> -Todd
>
> On Thu, Jan 13, 2011 at 8:53 AM, Vidhyashankar Venkataraman <
> vidhyash@yahoo-inc.com> wrote:
>
> > So I was using 0.89 till last week, and the tables were created and bulk
> > loaded just fine with the script (one can only use the ruby script while
> > creating the table). Yesterday was when I ran our system creating a table
> > from scratch.
> >
> > I have to fix this thing anyways very soon, so do let me know if you want
> > me to take a stab at it. And thanks for creating the jira.
> >
> > Oh, and I have my custom patch that I use for our current system for
> > supporting multiple column families in bulk loads which is a little too
> > non-generic to be submitted as a patch for the open source. Let me clean
> > that up as well and upload it soon.
> >
> > Cheers,
> > V
> >
> > On 1/12/11 10:03 PM, "Stack" <stack@duboce.net> wrote:
> >
> > I think you are right Vidhya.  The new master has a different
> > mechanism for assigning regions, so load_table.rb won't work with the
> > new master (Let me clean out the load_table.rb in 0.90 -- I filed
> > HBASE-3440 to fix).  It looks like the completebulkload from
> > http://hbase.apache.org/docs/r0.89.20100924/bulk-loads.html should
> > work.  It loads into pre-existing regions (I wonder why you've not
> > been using this script anyways?)
> >
> > Good on you Vidhya,
> > St.Ack
> >
> > On Wed, Jan 12, 2011 at 5:25 PM, Vidhyashankar Venkataraman
> > <vidhyash@yahoo-inc.com> wrote:
> > > I guess the master doesn't scan META periodically hence skips doing
> > > anything with the updated META table.
> > >
> > > The ruby bulk load tool then needs some repair (the tool should write
> > > ZK state for the new regions?).
> > >
> > >
> > > On 1/12/11 4:40 PM, "Vidhyashankar Venkataraman" <vidhyash@yahoo-inc.com> wrote:
> > >
> > > Is load_table.rb deprecated in 0.90?
> > >
> > > I was trying to use load_table.rb to create a new table and bulk load
> > > files into it. It worked partly in the sense that the META table got
> > > populated, the files were moved to the appropriate location, but the
> > > server assignment did not happen until I restarted HBase. Is this a
> > > consequence of the master rewrite?
> > >
> > > V
> > >
> > >
> >
> >
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>
>


--
Todd Lipcon
Software Engineer, Cloudera

