hbase-user mailing list archives

From AnandaVelMurugan Chandra Mohan <ananthu2...@gmail.com>
Subject Re: ways to improve performance of Scan with SingleColumnValueFilter..Please help!!!
Date Fri, 29 Jun 2012 12:50:09 GMT
Hi,

Thanks for the suggestions. I also observed that a scan in the HBase shell
takes almost the same time.

I will try to fix my HBase cluster setup.
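
As a rough illustration of why cross-continent latency can dominate a scan, the sketch below estimates scan time purely from the number of client-server round trips. The 300 ms round-trip time and the rows-per-round-trip values are assumptions chosen for illustration, not measurements from this cluster, and `estimateScanMillis` is a hypothetical helper, not part of the HBase API. Server-side work and disk reads are ignored.

```java
public class ScanLatencyEstimate {

    // Hypothetical helper (not an HBase API): estimated wall-clock time in
    // milliseconds for a scan that returns `rows` rows to the client,
    // fetching `rowsPerRoundTrip` rows per RPC, when each RPC costs
    // `roundTripMillis` of network latency.
    static long estimateScanMillis(long rows, long rowsPerRoundTrip, long roundTripMillis) {
        if (rows < 0 || rowsPerRoundTrip <= 0 || roundTripMillis < 0) {
            throw new IllegalArgumentException("arguments out of range");
        }
        // Ceiling division: a partially filled last batch still costs one round trip.
        long roundTrips = (rows + rowsPerRoundTrip - 1) / rowsPerRoundTrip;
        return roundTrips * roundTripMillis;
    }

    public static void main(String[] args) {
        long rows = 1200;           // table size mentioned in the original post
        long roundTripMillis = 300; // ASSUMED intercontinental round-trip time

        System.out.println("1 row per round trip:    "
                + estimateScanMillis(rows, 1, roundTripMillis) / 1000 + " s");
        System.out.println("100 rows per round trip: "
                + estimateScanMillis(rows, 100, roundTripMillis) / 1000 + " s");
    }
}
```

Under these assumed numbers, one row per round trip costs 360 s and 100 rows per round trip costs about 3 s. In the HBase client, the number of rows fetched per round trip is governed by scanner caching (`Scan.setCaching(int)` or the `hbase.client.scanner.caching` property); whether that setting, rather than inter-node traffic, is the bottleneck in this cluster is not established here.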

Meanwhile, I have two questions:


   - Will endpoint coprocessors help my cause? (If modifying the cluster
   is beyond my control, I would lean on this approach.)
   - I am logging into my HBase node (India) and creating the table. Does
   that imply that my table is being created on the region server on my node
   in India? Once the web application deployment is complete, I will move
   the web application to a US server farm. If there is a way to instruct
   HBase to create the table on a US region server, I hope it will solve the
   issue.

Please advise. Thanks!!!


On Fri, Jun 29, 2012 at 5:52 PM, Alex Baranau <alex.baranov.v@gmail.com> wrote:

> I'd agree that HBase is not designed to be run in such "inter-continental"
> single cluster setup. Latency in communication between nodes (slaves) is
> vital for the health of the cluster.
>
> So, the short answer: just don't do it that way.
>
> What is the reason to have nodes in these locations?
>
> Alex Baranau
> ------
> Sematext :: http://blog.sematext.com/ :: Solr - Lucene - Hadoop - HBase
>
> On Fri, Jun 29, 2012 at 7:06 AM, Jean-Marc Spaggiari <
> jean-marc@spaggiari.org> wrote:
>
> > Hi Anand,
> >
> > Having used HBase/Hadoop for some tests for weeks now, I have found that
> > it is very network-intensive. Using it from a computer on a wireless
> > connection was VERY slow. I moved to a 1000BASE-T network and it's now
> > WAY better. I'm not sure that having the nodes spread across the
> > internet that way will be efficient.
> >
> > Have you tried putting/retrieving some files from Hadoop with the command
> > line tool to see how it performs? Can you analyse your bandwidth
> > usage at the same time?
> >
> > --
> > JM
> >
> > 2012/6/29, AnandaVelMurugan Chandra Mohan <ananthu2050@gmail.com>:
> > > Hi,
> > >
> > > > I am using HBase client API to access HBase. My HBase version is
> > > > 0.92.1 and I have three nodes in my Hadoop cluster. Two nodes are in
> > > > US and one node in India. HBase master is in one of the node in US.
> > > >
> > > > In this HBase set up, I have a table with 1200+ rows. I am developing
> > > > a web application which uses HBase client java API to retrieve data
> > > > from this table. This is a GWT web application deployed in JBoss
> > > > (running in a server farm in India). When I retrieve data from Hbase
> > > > table based on a column value, it takes 6 mins. In code, I am doing a
> > > > scan on table with "SingleColumnValueFilter". Given the number of
> > > > rows, this performance is very bad (6 mins for 1200 records). Is
> > > > there any way to improve the performance?
> > >
> > > Any help would be greatly appreciated.
> > >
> > > --
> > > Regards,
> > > Anand
> > >
> >
>



-- 
Regards,
Anand
