Date: Thu, 15 Oct 2009 16:35:29 +0800
Subject: Re: about HBASE-48
From: Anty
To: hbase-user@hadoop.apache.org
References: <7c962aed0910102354o33e637a7k49c53851395861bd@mail.gmail.com>

Hi stack,

Last time I ran the test assuming the table xyz was a new table. Does the
script also work if the table xyz already exists?

./bin/hbase org.jruby.Main bin/loadtable.rb xyz /tmp/testWritingPEData/

On Sun, Oct 11, 2009 at 4:51 PM, Anty wrote:

> Hi stack,
> Thanks for your reply.
> I just used the default hash partitioner. I am an HBase newbie, but I will
> do my best to work on this issue following HBASE-1901.
>
> On Sun, Oct 11, 2009 at 2:54 PM, stack wrote:
>
>> On Sat, Oct 10, 2009 at 10:54 PM, Anty wrote:
>>
>> > Hi stack,
>> > I did some tests on the bulk load tools of HBASE-48.
>> >
>>
>> Thanks for trying it out.
>>
>> > I took the files made by the TestHFileOutputFormat test and passed them
>> > to the script you wrote. It did work, but something seems unusual. For
>> > each region, the STARTKEY and ENDKEY are nearly the same; the ENDKEY is
>> > bigger than the STARTKEY by nearly 1, e.g.
>> > STARTKEY=>'0000009447',ENDKEY=>'0000009448';
>> > STARTKEY=>'0000020476',ENDKEY=>'0000020477';
>> > ...
>> >
>>
>> Did you do your own partitioner or just use default hash partitioner?
>>
>> > I also have some doubts about TestHFileOutputFormat. The default
>> > partitioner is the hash partitioner; however, the hash partitioner can't
>> > meet the requirements of TestHFileOutputFormat. Just as you said, we need
>> > to ensure a total ordering of all keys, and we need to supply a
>> > partitioner that does total ordering (but you didn't add a new
>> > partitioner in TestHFileOutputFormat).
>> >
>>
>> This is broken then, as you point out. We should make something like what
>> is described in https://issues.apache.org/jira/browse/HBASE-1901 for
>> TestHFileOutputFormat?
>>
>> > So I think TestHFileOutputFormat uses the hash partitioner, which does
>> > not do total ordering; different regions would have overlapping rows,
>> > which is not correct for HBase. And I found that the firstKey and lastKey
>> > of the files made by TestHFileOutputFormat do indeed overlap.
>> > Are the bulk tools just a beginning that needs further improvement? I
>> > think the bulk tools are very useful.
>> >
>>
>> Can you help us improve it? What do you think we need to do next
>> (hbase-901?)
>>
>> Thanks for writing Anty Rao.
>> St.Ack
>>
>
> --
> Anty Rao

--
Anty Rao
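
For readers following the partitioner discussion in this thread, here is a minimal, self-contained sketch of why a hash partitioner produces output files whose first/last keys overlap, while a range (total-order) partitioner does not. It is an illustration only and is not HBase or Hadoop code: the function names, the CRC32 stand-in for the hash, the sample keys, and the split points are all assumptions made up for this example.

```python
import bisect
import zlib

def hash_partition(key, num_partitions):
    # Hypothetical stand-in for a default hash partitioner:
    # a deterministic hash of the key, modulo the partition count.
    return zlib.crc32(key.encode("ascii")) % num_partitions

def range_partition(key, split_points):
    # Hypothetical total-order partitioner: keys below the first split
    # point go to partition 0, the next range to partition 1, and so on.
    return bisect.bisect_right(split_points, key)

def key_ranges(keys, assign, num_partitions):
    # Return (first_key, last_key) for each non-empty partition,
    # sorted by first key, mimicking one output file per partition.
    buckets = [[] for _ in range(num_partitions)]
    for k in keys:
        buckets[assign(k)].append(k)
    return sorted((min(b), max(b)) for b in buckets if b)

def ranges_overlap(ranges):
    # True if any file's last key reaches into the next file's range.
    return any(prev_last >= next_first
               for (_, prev_last), (next_first, _) in zip(ranges, ranges[1:]))

# Zero-padded row keys, similar in shape to the keys quoted in the thread.
keys = [f"{i:010d}" for i in range(0, 30000, 7)]
num_partitions = 3
split_points = ["0000010000", "0000020000"]  # assumed split points

hash_ranges = key_ranges(keys, lambda k: hash_partition(k, num_partitions), num_partitions)
range_ranges = key_ranges(keys, lambda k: range_partition(k, split_points), num_partitions)

print("hash partitioner ranges: ", hash_ranges, "overlap:", ranges_overlap(hash_ranges))
print("range partitioner ranges:", range_ranges, "overlap:", ranges_overlap(range_ranges))
```

With the hash assignment, every partition receives keys from across the whole key space, so the per-file key ranges overlap. With the range assignment, each file covers a disjoint slice of the key space, which is the property the thread says the bulk-loaded regions need.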