hbase-user mailing list archives

From Marcos Ortiz <mlor...@uci.cu>
Subject Re: split table data into two or more tables
Date Fri, 08 Feb 2013 14:52:02 GMT

On 02/08/2013 01:59 PM, alxsss@aim.com wrote:
> Hi,
>
> The rationale is that I have a mapred job that constantly adds new records to an hbase table.
> The next mapred job selects these new records, but it must iterate over all records and
> check whether each one is a candidate for selection.
> Since there are too many old records, iterating through them on a cluster of 2 nodes + 1
> master takes about 2 days. So I thought splitting them into two tables would reduce this time,
> and as soon as I figure out that there are no more new records left in one of the new tables
> I will stop running the mapred job on it.
This use case is very common. A good practice here is to pre-split
the regions so that you control exactly where your data goes and how
large each region gets, which keeps the number of regions manageable.
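
To illustrate what pre-splitting involves, here is a minimal sketch that
computes evenly spaced split points. It assumes row keys are fixed-width
lowercase hex strings (for example a hashed prefix); that key layout, the
class name SplitKeys and the method hexSplits are hypothetical and not
taken from this thread, so adapt them to your real row keys.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class SplitKeys {

    /**
     * Returns numRegions - 1 split points, evenly spaced over the keyspace
     * of fixed-width lowercase hex strings (ASSUMED row key layout).
     */
    public static List<String> hexSplits(int numRegions, int width) {
        if (numRegions < 2 || width < 1) {
            throw new IllegalArgumentException("need numRegions >= 2 and width >= 1");
        }
        // Size of the keyspace: 16^width.
        BigInteger max = BigInteger.ONE.shiftLeft(4 * width);
        List<String> splits = new ArrayList<String>();
        for (int i = 1; i < numRegions; i++) {
            BigInteger point = max.multiply(BigInteger.valueOf(i))
                                  .divide(BigInteger.valueOf(numRegions));
            String hex = point.toString(16);
            // Left-pad with zeros so every split key has the same width.
            StringBuilder sb = new StringBuilder();
            for (int pad = hex.length(); pad < width; pad++) {
                sb.append('0');
            }
            splits.add(sb.append(hex).toString());
        }
        return splits;
    }

    public static void main(String[] args) {
        // Four regions over 8-character hex keys gives three split points.
        for (String split : hexSplits(4, 8)) {
            System.out.println(split);
        }
        // With the HBase client on the classpath, each split point would be
        // converted with Bytes.toBytes(...) and the resulting byte[][] passed
        // to HBaseAdmin.createTable(HTableDescriptor, byte[][]).
    }
}
```

The split points only help if new writes are spread across the keyspace;
with monotonically increasing keys all new rows still land in the last
region.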
>
> Currently, we have 7 regions including ROOT and META.
Can you share your conf/hbase-site.xml?

>
>
> Thanks.
> Alex.
>
> -----Original Message-----
> From: Ted Yu <yuzhihong@gmail.com>
> To: user <user@hbase.apache.org>
> Sent: Fri, Feb 8, 2013 10:40 am
> Subject: Re: split table data into two or more tables
>
>
> May I ask the rationale behind this?
> Were you aiming for higher write throughput?
>
> Please also tell us how many regions you have in the current table.
>
> Thanks
>
> BTW please consider upgrading to 0.94.4
>
> On Fri, Feb 8, 2013 at 10:36 AM, <alxsss@aim.com> wrote:
>
>> Hello,
>>
>> I wondered if there is a way of splitting data from one table into two or
>> more tables in hbase with identical schemas, i.e. if table A has 100M
>> records, put 50M into table B and 50M into table C, then delete table A.
>> Currently, I use hbase-0.92.1 and hadoop-1.4.0
>>
>> Thanks.
>> Alex.
>>
>   
>

-- 
Marcos Ortiz Valmaseda,
Product Manager && Data Scientist at UCI
Blog: http://marcosluis2186.posterous.com
Twitter: @marcosluis2186 <http://twitter.com/marcosluis2186>
