Mailing-List: contact user-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Reply-To: user@hbase.apache.org
MIME-Version: 1.0
In-Reply-To: <1346364089.56926.YahooMailNeo@web121702.mail.ne1.yahoo.com>
References: 
 <7417F7E89C74DA42A9EAB7AD79C90C8B2A184AFD1D@princesa.mercador.local>
	<CAOcAC2GpG4xHsYOo-FUZGqCL5kOUB9wSim1MPrQu9CzAf_X-0A@mail.gmail.com>
	<CA+RK=_DCBsfDS=JbSzEU6ZjfqkbWPHL7OCiVCYS_dz2DeA9f2w@mail.gmail.com>
	<CA+RK=_ACyZVnidRYEBwdDGXbDZ+qR8aFb=gpFen8o+XcHG4NBw@mail.gmail.com>
	<1346345349.64679.YahooMailNeo@web121705.mail.ne1.yahoo.com>
	<CA+RK=_BJJXxHQmzzcXPZcirtnOn7t9+7RwcW2A0P2tcaDQ31jQ@mail.gmail.com>
	<1346364089.56926.YahooMailNeo@web121702.mail.ne1.yahoo.com>
Date: Fri, 31 Aug 2012 16:48:00 +0300
Message-ID: 
 <CA+RK=_DdFY+BYz3E0XTDyrT15iveH2ZTRdhOxSu35umB_pL8MA@mail.gmail.com>
Subject: Re: [maybe off-topic?] article: Solving Big Data Challenges for
 Enterprise Application Performance Management
From: Andrew Purtell <apurtell@apache.org>
To: lars hofhansl <lhofhansl@yahoo.com>
Cc: Andrew Purtell <apurtell@apache.org>,
 "user@hbase.apache.org" <user@hbase.apache.org>
Content-Type: multipart/alternative; boundary=bcaec501653149628e04c8900afd

--bcaec501653149628e04c8900afd
Content-Type: text/plain; charset=ISO-8859-1

Asynchbase redone with PB and attention to security would be a good place
to start. I can't commit resources in the immediate term, so I know that's
easy for me to say. Anyway, it seems we're on the same page wrt the client.
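
To illustrate the "async API with a thin synchronous veneer on top" idea
discussed in this thread, here is a minimal sketch. It is not HBase or
asynchbase code: the names AsyncKvClient, InMemoryAsyncClient and
SyncKvClient are hypothetical, and an in-memory map stands in for the RPC
layer. It only shows the layering, assuming a future-based async core.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical async core: every operation returns a future immediately.
interface AsyncKvClient {
    CompletableFuture<Void> put(String row, byte[] value);
    CompletableFuture<byte[]> get(String row);
}

// Stand-in for a real RPC layer (assumption): an in-memory map plays the server.
final class InMemoryAsyncClient implements AsyncKvClient {
    private final Map<String, byte[]> store = new ConcurrentHashMap<>();

    @Override
    public CompletableFuture<Void> put(String row, byte[] value) {
        // Runs on the common pool, so the caller is never blocked here.
        return CompletableFuture.runAsync(() -> store.put(row, value));
    }

    @Override
    public CompletableFuture<byte[]> get(String row) {
        // Completes with null when the row is absent.
        return CompletableFuture.supplyAsync(() -> store.get(row));
    }
}

// Thin synchronous veneer: each call simply blocks on the async core.
final class SyncKvClient {
    private final AsyncKvClient core;

    SyncKvClient(AsyncKvClient core) {
        this.core = core;
    }

    void put(String row, byte[] value) {
        core.put(row, value).join();
    }

    byte[] get(String row) {
        return core.get(row).join();
    }
}

public class AsyncCoreSketch {
    public static void main(String[] args) {
        SyncKvClient client = new SyncKvClient(new InMemoryAsyncClient());
        client.put("row1", "v1".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(client.get("row1"), StandardCharsets.UTF_8));
    }
}
```

With this layering the blocking client adds no logic of its own, so features
such as security would only need to be implemented once, in the async core.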

On Friday, August 31, 2012, lars hofhansl wrote:

> Many of us have been saying for a while that the client needs love (i.e.
> needs to be rewritten) and that a new client should follow an async API
> (maybe with a thin synchronous veneer on top of it).
>
> The client is a big piece of HBase. And implementing all the aspects
> including security is a major task and nobody has committed the necessary
> resources for it, yet.
> asynchbase is a start, but it does not support many of the HBase features
> (coprocessors, security, etc).
>
> -- Lars
>
>   ------------------------------
> *From:* Andrew Purtell <apurtell@apache.org>
> *To:* "user@hbase.apache.org" <user@hbase.apache.org>; lars hofhansl <lhofhansl@yahoo.com>
>
> *Sent:* Thursday, August 30, 2012 2:41 PM
> *Subject:* Re: [maybe off-topic?] article: Solving Big Data Challenges
> for Enterprise Application Performance Management
>
> I do want to take a closer look at it. Not with the intent to replace the
> PB RPC with it, but it's odd to have two RPC stacks. What refactoring and
> code simplification/removal opportunities are there? I don't know (yet).
> More generally, I'd like to experiment with simple native async clients.
>
> On Thursday, August 30, 2012, lars hofhansl wrote:
>
> 0.94+ has the option to run a thrift-server-thread inside the
> RegionServers. Maybe we should improve upon that?
>
>
>
> ________________________________
>  From: Andrew Purtell <apurtell@apache.org>
> To: Andrew Purtell <apurtell@apache.org>
> Cc: "user@hbase.apache.org" <user@hbase.apache.org>
> Sent: Thursday, August 30, 2012 9:41 AM
> Subject: Re: [maybe off-topic?] article: Solving Big Data Challenges for
> Enterprise Application Performance Management
>
> Just want to clarify: I mean experimenting with the approach of the Thrift
> client work, not the use of Thrift in particular.
>
> On Thursday, August 30, 2012, Andrew Purtell wrote:
>
> > This paper could very well have benchmarked the relative performance of
> > the YCSB drivers. Some takeaways for me here are:
> >
> >     - Cluster setup is still too difficult.
> >
> >     - There are opportunities for autotuning that would make it easier for
> > users to get it right the first time, and for academics and casual
> > benchmarkers alike to get a good result without becoming experts in HBase
> > configuration.
> >
> >     - The client library has been evolving toward fully async dispatch. We
> > should focus on this, perhaps even consider reimplementing the sync client
> > on a refactored async core. We should also look at putting the Thrift-based
> > stuff FB contributed front and center, because then native clients are
> > possible.
> >
> >     - Given the above client work, the YCSB HBase driver should get a
> > rewrite.
> >
> > On Thu, Aug 30, 2012 at 4:49 PM, Dave Wang <dsw@cloudera.com> wrote:
> >
> >> My reading of the paper is that they are actually not clear about whether
> >> or not HMasters were deployed on datanodes.
> >>
> >> I'm going to guess that they just used default configurations for HBase
> >> and YCSB, but the paper again is not specific enough.
> >>
> >> Why were they using 0.90.4 in 2012?  Would have been nice to see some of
> >> the more recent work done in the area of performance.
> >>
> >> One thing the paper does touch on is the relative difficulty of standing
> >> up the cluster, which has not changed since 0.90.4.  I think that's
> >> definitely something that could be improved upon.
> >>
> >> - Dave
> >>
> >> On Thu, Aug 30, 2012 at 6:27 AM, Cristofer Weber
> >> <cristofer.weber@neogrid.com> wrote:
> >>
> >> > Just read this article, "Solving Big Data Challenges for Enterprise
> >> > Application Performance Management", published this month in Volume 5,
> >> > No. 12 of the Proceedings of the VLDB Endowment, where they measured 6
> >> > different databases - Project Voldemort, Redis, HBase, Cassandra, MySQL
> >> > Cluster and VoltDB - with YCSB on two different kinds of clusters,
> >> > memory-bound and disk-bound, and I have doubts about the results for
> >> > HBase since:
> >> >
> >> >
> >> > *         HBase version was 0.90.4
> >> >
> >> > *         Master nodes were deployed together with data nodes
> >> >
> >> > *         They didn't report tuning parameters
> >> >
> >> > There's also a paragraph where they reported that HBase failed
> >> > frequently in non-deterministic ways while running YCSB.
> >> >
> >> > My intention with this e-mail is to look for opinions from you, who are
> >> > more experienced with HBase, on where this experiment's setup could be
> >> > changed to improve read operations, since in this setup HBase did not
> >> > perform as well as Cassandra and Project Voldemort.
> >> >
> >> > Here's the article:
> >> > http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf and V
>
>

-- 
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

--bcaec501653149628e04c8900afd--