Reply-To: user@hbase.apache.org
In-Reply-To: <1346364089.56926.YahooMailNeo@web121702.mail.ne1.yahoo.com>
References: <7417F7E89C74DA42A9EAB7AD79C90C8B2A184AFD1D@princesa.mercador.local>
 <1346345349.64679.YahooMailNeo@web121705.mail.ne1.yahoo.com>
 <1346364089.56926.YahooMailNeo@web121702.mail.ne1.yahoo.com>
Date: Fri, 31 Aug 2012 16:48:00 +0300
Subject: Re: [maybe off-topic?]
 article: Solving Big Data Challenges for Enterprise Application Performance Management
From: Andrew Purtell
To: lars hofhansl
Cc: Andrew Purtell, "user@hbase.apache.org"
Content-Type: multipart/alternative; boundary=bcaec501653149628e04c8900afd

--bcaec501653149628e04c8900afd
Content-Type: text/plain; charset=ISO-8859-1

Asynchbase redone with PB and attention to security would be a good place
to start. I can't commit resources in the immediate term, so that's easy
for me to say, I know. Anyway, seems we're on the same page wrt client.

On Friday, August 31, 2012, lars hofhansl wrote:

> Many of us have been saying for a while that the client needs love (i.e.
> needs to be rewritten) and that a new client should follow an async API
> (maybe with a thin synchronous veneer on top of it).
>
> The client is a big piece of HBase. And implementing all the aspects
> including security is a major task and nobody has committed the necessary
> resources for it, yet.
> asynchbase is a start, but it does not support many of the HBase features
> (coprocessors, security, etc).
>
> -- Lars
>
> ------------------------------
> *From:* Andrew Purtell <apurtell@apache.org>
> *To:* "user@hbase.apache.org" <user@hbase.apache.org>; lars hofhansl
> *Sent:* Thursday, August 30, 2012 2:41 PM
> *Subject:* Re: [maybe off-topic?] article: Solving Big Data Challenges
> for Enterprise Application Performance Management
>
> I do want to take a closer look at it. Not with the intent to replace the
> PB RPC with it but it's odd to have two RPC stacks. What refactoring and
> code simplification/removal opportunities are here? Don't know (yet). More
> generally, to experiment with simple native async clients.
>
> On Thursday, August 30, 2012, lars hofhansl wrote:
>
> 0.94+ has the option to run a thrift-server-thread inside the
> RegionServers. Maybe we should improve upon that?
>
>
> ________________________________
> From: Andrew Purtell
> To: Andrew Purtell
> Cc: "user@hbase.apache.org"
> Sent: Thursday, August 30, 2012 9:41 AM
> Subject: Re: [maybe off-topic?] article: Solving Big Data Challenges for
> Enterprise Application Performance Management
>
> Just want to clarify I mean experimenting with the approach of the Thrift
> client work not use of Thrift particularly.
>
> On Thursday, August 30, 2012, Andrew Purtell wrote:
>
> > This paper could very well have benchmarked the relative performance of
> > the YCSB drivers. Some take aways for me here are:
> >
> > - Cluster setup is too difficult still
> >
> > - There are opportunities for autotuning that would make it easier for
> > users to get it right the first time and for academics and casual
> > benchmarkers alike to get a good result without becoming experts with
> > HBase configuration
> >
> > - The client library has been evolving toward fully async dispatch, we
> > should focus on this, perhaps even consider reimplementing sync client
> > on a refactored async core. And look at making the Thrift based stuff
> > FB put in front and center, because then native clients are possible.
> >
> > - Given the above client work, the YCSB HBase driver should have a
> > rewrite.
> >
> > On Thu, Aug 30, 2012 at 4:49 PM, Dave Wang <dsw@cloudera.com> wrote:
> >
> >> My reading of the paper is that they are actually not clear about
> >> whether or not HMasters were deployed on datanodes.
> >>
> >> I'm going to guess that they just used default configurations for
> >> HBase and YCSB, but the paper again is not specific enough.
> >>
> >> Why were they using 0.90.4 in 2012? Would have been nice to see some of
> >> the more recent work done in the area of performance.
> >>
> >> One thing the paper does touch on is the relative difficulty of
> >> standing up the cluster, which has not changed since 0.90.4.
> >> I think that's definitely something that could be improved upon.
> >>
> >> - Dave
> >>
> >> On Thu, Aug 30, 2012 at 6:27 AM, Cristofer Weber
> >> <cristofer.weber@neogrid.com> wrote:
> >>
> >> > Just read this article, "Solving Big Data Challenges for Enterprise
> >> > Application Performance Management." published this month @ Volume 5,
> >> > No. 12 of Proceedings of the VLDB Endowment, where they measured 6
> >> > different databases - Project Voldemort, Redis, HBase, Cassandra,
> >> > MySQL Cluster and VoltDB - with YCSB on two different kinds of
> >> > clusters, Memory-bound and Disk-bound, and I'm in doubt about results
> >> > for HBase since:
> >> >
> >> > * HBase version was 0.90.4
> >> >
> >> > * Master nodes were deployed together with data nodes
> >> >
> >> > * They didn't report tuning parameters
> >> >
> >> > There's also a paragraph where they reported that HBase failed
> >> > frequently in non-deterministic ways while running YCSB.
> >> >
> >> > My intention with this e-mail is to look for opinions from you, who
> >> > are more experienced with HBase, on where this experiment's setup
> >> > could be changed to improve read operations, since in this setup
> >> > HBase did not perform as well as Cassandra and Project Voldemort.
> >> >
> >> > Here's the article:
> >> > http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf and V
> >

--
Best regards,

   - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

--bcaec501653149628e04c8900afd--