Reply-To: bing.li@asu.edu
Date: Wed, 29 Aug 2012 23:42:20 +0800
Subject: Re: HBase Is So Slow To Save Data?
From: Bing Li
To: N Keywal
Cc: user@hbase.apache.org, hbase-user@hadoop.apache.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=bcaec51d23607e2d9b04c8696784

--bcaec51d23607e2d9b04c8696784
Content-Type: text/plain; charset=ISO-8859-1

Dear N Keywal,

Thanks so much for your reply!

The total amount of data is about 110M. The available memory is enough, 2G.

In Java, I just set a collection to NULL to collect garbage. Do you think it is fine?

Best regards,
Bing

On Wed, Aug 29, 2012 at 11:22 PM, N Keywal wrote:

> Hi Bing,
>
> You should expect HBase to be slower in the generic case:
> 1) it writes much more data (see hbase data model), with extra columns
> qualifiers, timestamps & so on.
> 2) the data is written multiple times: once in the write-ahead-log, once
> per replica on datanode & so on again.
> 3) there are inter process calls & inter machine calls on the critical
> path.
>
> This is the cost of the atomicity, reliability and scalability features.
> With these features in mind, HBase is reasonably fast to save data on a
> cluster.
>
> On your specific case (without the points 2 & 3 above), the performance
> seems to be very bad.
>
> You should first look at:
> - how much is spent in the put vs. preparing the list
> - do you have garbage collection going on? even swap?
> - what's the size of your final Array vs. the available memory?
>
> Cheers,
>
> N.
>
> On Wed, Aug 29, 2012 at 4:08 PM, Bing Li wrote:
>
>> Dear all,
>>
>> By the way, my HBase is in the pseudo-distributed mode. Thanks!
>>
>> Best regards,
>> Bing
>>
>> On Wed, Aug 29, 2012 at 10:04 PM, Bing Li wrote:
>>
>> > Dear all,
>> >
>> > According to my experiences, it is very slow for HBase to save data? Am I
>> > right?
>> >
>> > For example, today I need to save data in a HashMap to HBase. It took
>> > about more than three hours. However when saving the same HashMap in a file
>> > in the text format with the redirected System.out, it took only 4.5 seconds!
>> >
>> > Why is HBase so slow? It is indexing?
>> >
>> > My code to save data in HBase is as follows. I think the code must be
>> > correct.
>> >
>> > ......
>> > // Generic type parameters restored; String keys are assumed.
>> > public synchronized void AddVirtualOutgoingHHNeighbors(
>> >         ConcurrentHashMap<String, ConcurrentHashMap<String, Set<String>>> hhOutNeighborMap,
>> >         int timingScale)
>> > {
>> >     List<Put> puts = new ArrayList<Put>();
>> >
>> >     String hhNeighborRowKey;
>> >     Put hubKeyPut;
>> >     Put groupKeyPut;
>> >     Put topGroupKeyPut;
>> >     Put timingScalePut;
>> >     Put nodeKeyPut;
>> >     Put hubNeighborTypePut;
>> >
>> >     for (Map.Entry<String, ConcurrentHashMap<String, Set<String>>> sourceHubGroupNeighborEntry : hhOutNeighborMap.entrySet())
>> >     {
>> >         for (Map.Entry<String, Set<String>> groupNeighborEntry : sourceHubGroupNeighborEntry.getValue().entrySet())
>> >         {
>> >             for (String neighborKey : groupNeighborEntry.getValue())
>> >             {
>> >                 hhNeighborRowKey = NeighborStructure.HUB_HUB_NEIGHBOR_ROW +
>> >                     Tools.GetAHash(sourceHubGroupNeighborEntry.getKey() +
>> >                         groupNeighborEntry.getKey() + timingScale + neighborKey);
>> >
>> >                 hubKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
>> >                 hubKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
>> >                     Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_HUB_KEY_COLUMN),
>> >                     Bytes.toBytes(sourceHubGroupNeighborEntry.getKey()));
>> >                 puts.add(hubKeyPut);
>> >
>> >                 groupKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
>> >                 groupKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
>> >                     Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_GROUP_KEY_COLUMN),
>> >                     Bytes.toBytes(groupNeighborEntry.getKey()));
>> >                 puts.add(groupKeyPut);
>> >
>> >                 topGroupKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
>> >                 topGroupKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
>> >                     Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_TOP_GROUP_KEY_COLUMN),
>> >                     Bytes.toBytes(GroupRegistry.WWW().GetParentGroupKey(groupNeighborEntry.getKey())));
>> >                 puts.add(topGroupKeyPut);
>> >
>> >                 timingScalePut = new Put(Bytes.toBytes(hhNeighborRowKey));
>> >                 timingScalePut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
>> >                     Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_TIMING_SCALE_COLUMN),
>> >                     Bytes.toBytes(timingScale));
>> >                 puts.add(timingScalePut);
>> >
>> >                 nodeKeyPut = new Put(Bytes.toBytes(hhNeighborRowKey));
>> >                 nodeKeyPut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
>> >                     Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_NODE_KEY_COLUMN),
>> >                     Bytes.toBytes(neighborKey));
>> >                 puts.add(nodeKeyPut);
>> >
>> >                 hubNeighborTypePut = new Put(Bytes.toBytes(hhNeighborRowKey));
>> >                 hubNeighborTypePut.add(Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_FAMILY),
>> >                     Bytes.toBytes(NeighborStructure.HUB_HUB_NEIGHBOR_TYPE_COLUMN),
>> >                     Bytes.toBytes(SocialRole.VIRTUAL_NEIGHBOR));
>> >                 puts.add(hubNeighborTypePut);
>> >             }
>> >         }
>> >     }
>> >
>> >     try
>> >     {
>> >         this.neighborTable.put(puts);
>> >     }
>> >     catch (IOException e)
>> >     {
>> >         e.printStackTrace();
>> >     }
>> > }
>> > ......
>> >
>> > Thanks so much!
>> >
>> > Best regards,
>> > Bing
>> >

--bcaec51d23607e2d9b04c8696784--
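
One way to act on the first suggestion in the quoted reply (how much time is spent in the put vs. preparing the list) is to time the two phases separately and send the list in smaller batches. The following is only a rough sketch under stated assumptions: the class PutTimingSketch and the helpers partition and timeNanos are hypothetical names made up for illustration, not part of HBase, and the no-op sink is a stand-in for the real call, which in the quoted code would be this.neighborTable.put(puts).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class PutTimingSketch {

    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    /** Runs the task and returns the elapsed wall-clock time in nanoseconds. */
    static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        // Phase 1: prepare the list (plain strings stand in for Put objects).
        List<String> rows = new ArrayList<>();
        long prepareNanos = timeNanos(() -> {
            for (int i = 0; i < 100_000; i++) {
                rows.add("row-" + i);
            }
        });

        // Hypothetical sink: replace the body with the real table write.
        Consumer<List<String>> sink = batch -> { };

        // Phase 2: write the list in batches and time it on its own.
        long writeNanos = timeNanos(() -> {
            for (List<String> batch : partition(rows, 10_000)) {
                sink.accept(batch);
            }
        });

        System.out.printf("prepare: %d ms, write: %d ms%n",
                prepareNanos / 1_000_000, writeNanos / 1_000_000);
    }
}
```

If almost all of the time lands in the write phase, the table write is the place to look; if it lands in the prepare phase, the loops building the list (or garbage collection while building it) are more likely the cause. The batch size of 10,000 is an arbitrary value chosen for the sketch.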