Subject: Re: Export / Import and table splits
From: Ted Yu <yuzhihong@gmail.com>
To: user@hbase.apache.org
Date: Tue, 7 May 2013 16:11:58 -0700

Currently the Import tool doesn't create the table on the target cluster.
If we choose approach #2, the Import tool should be enhanced with
table-creation capability.
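Something along these lines could cover the table-creation piece, whether it
ends up as a shell command or inside Import itself. This is only a rough,
untested sketch against the 0.94-era client API (the table names are reused
from JM's example below); it copies the column families and region
boundaries, but not every table-level setting:

import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class DuplicateTableLayout {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    HTable source = new HTable(conf, "source_table");
    try {
      // Copy the source table's column families into a new descriptor.
      HTableDescriptor srcDesc =
          admin.getTableDescriptor(Bytes.toBytes("source_table"));
      HTableDescriptor dstDesc = new HTableDescriptor("target_table");
      for (HColumnDescriptor family : srcDesc.getFamilies()) {
        dstDesc.addFamily(family);
      }

      // Reuse the source table's region boundaries as split points.
      // getStartKeys() returns one entry per region; the first is always empty.
      byte[][] startKeys = source.getStartKeys();
      byte[][] splits = (startKeys.length > 1)
          ? Arrays.copyOfRange(startKeys, 1, startKeys.length)
          : null;   // single-region source: no split points needed

      admin.createTable(dstDesc, splits);
    } finally {
      source.close();
      admin.close();
    }
  }
}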
Cheers

On Tue, May 7, 2013 at 4:02 PM, Jean-Marc Spaggiari wrote:

> @Mohammad: The end goal is really about the splits more than the model,
> so I don't think Lars' options are good for this use case.
> @Mike: I agree that things were not configured correctly. The user should
> have split the table before doing the import. I like the idea of looking
> at the files to get the region boundaries. That way you don't need to
> have the source_table still there...
>
> So we have 2 different things here:
> 1) a command in the shell to duplicate a table structure
> 2) an option on the Import command to split the table regions based on
>    the file names.
>
> If we agree on that, I will open one JIRA for each...
>
> JM
>
> 2013/5/7 Michael Segel:
> > Silly question...
> >
> > If you're doing a simple export, then you end up with all of your prior
> > regions as separate files in a directory, right?
> >
> > So in theory, you could find the first row and the last complete row of
> > each file and then do your pre-splits based on the start key and end key
> > that you find.
> >
> > That would be your tool, so to speak.
> >
> > But to the point that reading these files back in will cause you to
> > crash your RS and HBase?
> > That doesn't sound like it's well tuned or right.
> >
> > HTH
> > -Mike
> >
> > On May 7, 2013, at 5:29 PM, Ted Yu wrote:
> >
> >> I am not aware of a tool which can pre-split a table using another
> >> table's region boundaries as a template.
> >>
> >> Such a tool would be nice to have.
> >>
> >> Cheers
> >>
> >> On Tue, May 7, 2013 at 3:23 PM, Jean-Marc Spaggiari
> >> <jean-marc@spaggiari.org> wrote:
> >>
> >>> Hi,
> >>>
> >>> When we are doing an export, we are only exporting the data. Then,
> >>> when we are importing it back, we need to make sure the table is
> >>> pre-split correctly, or else we might hotspot some servers.
> >>>
> >>> If you simply export and then import without pre-splitting at all,
> >>> you will most probably bring some servers down because they will be
> >>> overwhelmed with splits and compactions.
> >>>
> >>> Do we have any tool to pre-split a table the same way another table
> >>> is already pre-split?
> >>>
> >>> Something like
> >>>> duplicate 'source_table', 'target_table'
> >>>
> >>> which would create a new table called 'target_table' with exactly
> >>> the same parameters as 'source_table' and the same region boundaries?
> >>>
> >>> If we don't have one, would it be useful to have one?
> >>>
> >>> Or even something like:
> >>>> create 'target_table', 'f1', {SPLITS_MODEL => 'source_table'}
> >>>
> >>> JM
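For the pre-split side, Mike's suggestion above can be scripted directly
against the export output: with the usual Export setup (one map per region,
no reducers) there is one SequenceFile per region, keyed by the row as an
ImmutableBytesWritable, so the first key of each part file approximates that
region's lower boundary. Another rough, untested sketch (the export path is
a placeholder):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.SequenceFile;

public class ExportSplitPoints {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    FileSystem fs = FileSystem.get(conf);
    Path exportDir = new Path("/backup/source_table");   // placeholder path

    List<byte[]> splitPoints = new ArrayList<byte[]>();
    for (FileStatus status : fs.listStatus(exportDir)) {
      Path part = status.getPath();
      if (!part.getName().startsWith("part-")) {
        continue;   // skip _SUCCESS, _logs, etc.
      }
      // Export output is a SequenceFile keyed by the row key; reading just
      // the first key of each part file gives that region's first row.
      SequenceFile.Reader reader = new SequenceFile.Reader(fs, part, conf);
      try {
        ImmutableBytesWritable firstKey = new ImmutableBytesWritable();
        if (reader.next(firstKey)) {
          byte[] row = Arrays.copyOfRange(firstKey.get(), firstKey.getOffset(),
              firstKey.getOffset() + firstKey.getLength());
          if (row.length > 0) {
            splitPoints.add(row);
          }
        }
      } finally {
        reader.close();
      }
    }

    // createTable() expects sorted, non-empty split keys.
    Collections.sort(splitPoints, Bytes.BYTES_COMPARATOR);
    byte[][] splits = splitPoints.toArray(new byte[splitPoints.size()][]);

    // 'splits' can then be handed to HBaseAdmin.createTable(desc, splits)
    // before running Import against the freshly created table.
    System.out.println("Derived " + splits.length + " split points");
  }
}

The resulting split points could then feed the table-creation sketch earlier
in this mail (or a future Import option), so the target table is pre-split
before any data is loaded.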