From: Andrew Purtell
Reply-To: Andrew Purtell
Date: Tue, 10 Apr 2012 16:53:28 -0700 (PDT)
Subject: Re: Add client complexity or use a coprocessor?
To: user@hbase.apache.org

> Even my implementation of an atomic increment (using a coprocessor) is
> two orders of magnitude slower than the provided implementation. Are
> there properties inherent to coprocessors or Incrementors that would
> force this kind of performance difference?

No.

You may be seeing a performance difference if you are packing multiple
Increments into one round trip but not doing a similar kind of batching
when calling a custom endpoint. Each Endpoint invocation is a round trip
unless you do something like:

    List<Row> actions = new ArrayList<Row>();
    actions.add(new Exec(conf, row, protocol, method, ...));
    actions.add(new Exec(conf, row, protocol, method, ...));
    actions.add(new Exec(conf, row, protocol, method, ...));
    Object[] results = table.batch(actions);
    ...

I've not personally tried that particular API combination but don't see
why it would not be possible.

Beyond that, I'd suggest running a regionserver with your coprocessor
installed under a profiler to see if you have monitor contention or a
hotspot or similar. It could be something unexpected.

> Can you think of an efficient way to implement an atomic bitfield
> (other than adding it as a separate feature like atomic increments)?

I think the idea of an atomic bitfield operation as part of the core API
is intriguing. It has applicability to your estimator use case and I can
think of a couple of things I could use it for. If there is more support
for this idea, this may be something to consider.

Best regards,

    - Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein
(via Tom White)

----- Original Message -----
> From: Tom Brown
> To: user@hbase.apache.org; Andrew Purtell
> Cc:
> Sent: Tuesday, April 10, 2012 3:53 PM
> Subject: Re: Add client complexity or use a coprocessor?
>
> Andy,
>
> I have attempted to use coprocessors to achieve passable performance
> but have failed so far. Even my implementation of an atomic increment
> (using a coprocessor) is two orders of magnitude slower than the
> provided implementation. Are there properties inherent to
> coprocessors or Incrementors that would force this kind of performance
> difference?
>
> Can you think of an efficient way to implement an atomic bitfield
> (other than adding it as a separate feature like atomic increments)?
>
> Thanks!
>
> --Tom
>
> On Tue, Apr 10, 2012 at 12:01 PM, Andrew Purtell wrote:
>> Tom,
>>
>>> I am a big fan of the Increment class. Unfortunately, I'm not doing
>>> simple increments for the viewer count. I will be receiving duplicate
>>> messages from a particular client for a specific cube cell, and don't
>>> want them to be counted twice
>>
>> Gotcha.
>>
>>> I created an RPC endpoint coprocessor to perform this function but
>>> performance suffered heavily under load (it appears that the endpoint
>>> performs all functions in serial).
>>
>> Did you serialize access to your data structure(s)?
>>
>>> When I tried implementing it as a region observer, I was unsure of how
>>> to correctly replace the provided "put" with my own. When I issued a
>>> put from within "prePut", the server blocked the new put (waiting for
>>> the "prePut" to finish). Should I be attempting to modify the WALEdit
>>> object?
>>
>> You can add KVs to the WALEdit.
>> Or, you can get a reference to the Put's familyMap:
>>
>>     Map<byte[], List<KeyValue>> familyMap = put.getFamilyMap();
>>
>> and if you modify the map, you'll change what gets committed.
>>
>>> Is there a way to extend the functionality of "Increment" to provide
>>> arbitrary bitwise operations on the contents of a field?
>>
>> As a matter of design, this should be a new operation. It does sound
>> interesting and useful, some sort of atomic bitfield.
>>
>> Best regards,
>>
>>     - Andy
>>
>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>> (via Tom White)
>>
>> ----- Original Message -----
>>> From: Tom Brown
>>> To: user@hbase.apache.org
>>> Cc:
>>> Sent: Monday, April 9, 2012 10:14 PM
>>> Subject: Re: Add client complexity or use a coprocessor?
>>>
>>> Andy,
>>>
>>> I am a big fan of the Increment class. Unfortunately, I'm not doing
>>> simple increments for the viewer count. I will be receiving duplicate
>>> messages from a particular client for a specific cube cell, and don't
>>> want them to be counted twice (my stats don't have to be 100%
>>> accurate, but the expected rate of duplicates will be higher than the
>>> allowable error rate).
>>>
>>> I created an RPC endpoint coprocessor to perform this function but
>>> performance suffered heavily under load (it appears that the endpoint
>>> performs all functions in serial).
>>>
>>> When I tried implementing it as a region observer, I was unsure of how
>>> to correctly replace the provided "put" with my own. When I issued a
>>> put from within "prePut", the server blocked the new put (waiting for
>>> the "prePut" to finish). Should I be attempting to modify the WALEdit
>>> object?
>>>
>>> Is there a way to extend the functionality of "Increment" to provide
>>> arbitrary bitwise operations on the contents of a field?
>>>
>>> Thanks again!
>>>
>>> --Tom
>>>
>>>> If it helps, yes this is possible:
>>>>
>>>>> Can I observe updates to a
>>>>> particular table and replace the provided data with my own? (The
>>>>> client calls "put" with the actual user ID, my co-processor replaces
>>>>> it with a computed value, so the actual user ID never gets stored in
>>>>> HBase).
>>>>
>>>> Since your option #2 requires atomic updates to the data structure,
>>>> have you considered native atomic increments? See
>>>>
>>>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#incrementColumnValue%28byte[],%20byte[],%20byte[],%20long,%20boolean%29
>>>>
>>>> or
>>>>
>>>> http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Increment.html
>>>>
>>>> The former is a round trip for each value update. The latter allows
>>>> you to pack multiple updates into a single round trip. This would
>>>> give you accurate counts even with concurrent writers.
>>>>
>>>> It should be possible for you to do partial aggregation on the client
>>>> side too whenever parallel requests colocate multiple updates to the
>>>> same cube within some small window of time.
>>>>
>>>> Best regards,
>>>>
>>>>     - Andy
>>>>
>>>> Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>>> (via Tom White)
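The atomic bitfield discussed in the thread would be, like Increment, a server-side read-modify-write performed under the row lock. Below is a minimal single-JVM sketch of the intended semantics, using a CAS loop in place of the row lock; the class and method names are hypothetical, and no such operation exists in the HBase core API:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of atomic bitfield semantics: atomically OR a mask into a
// 64-bit field and report whether any bit actually changed. Server-side
// in HBase this would be a read-modify-write under the row lock (the
// same pattern Increment uses); here a CAS loop stands in for that.
class AtomicBitfield {
    private final AtomicLong bits = new AtomicLong();

    // OR `mask` into the field; returns true iff at least one bit
    // flipped from 0 to 1 (i.e. this update was not a duplicate).
    boolean setBits(long mask) {
        while (true) {
            long old = bits.get();
            long updated = old | mask;
            if (updated == old) {
                return false;            // every bit already set: duplicate
            }
            if (bits.compareAndSet(old, updated)) {
                return true;             // we won the race and changed state
            }
            // lost a race with a concurrent writer; retry with fresh value
        }
    }

    long get() { return bits.get(); }
}
```

Reporting whether any bit changed is what makes duplicate messages harmless here: a repeated update ORs in bits that are already set and observes no change.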
>>>> ----- Original Message -----
>>>>> From: Tom Brown
>>>>> To: user@hbase.apache.org
>>>>> Cc:
>>>>> Sent: Monday, April 9, 2012 9:48 AM
>>>>> Subject: Add client complexity or use a coprocessor?
>>>>>
>>>>> To whom it may concern,
>>>>>
>>>>> Ignoring the complexities of gathering the data, assume that I will
>>>>> be tracking millions of unique viewers. Updates from each of our
>>>>> millions of clients are gathered in a centralized platform and
>>>>> spread among a group of machines for processing and inserting into
>>>>> HBase (assume that this group can be scaled horizontally). The data
>>>>> is stored in an OLAP cube format and one of the metrics I'm tracking
>>>>> across various attributes is viewership (how many people from Y are
>>>>> watching X).
>>>>>
>>>>> I'm writing this to ask for your thoughts as to the most appropriate
>>>>> way to structure my data so I can count unique TV viewers (assume a
>>>>> service like Netflix or Hulu).
>>>>>
>>>>> Here are the solutions I'm considering:
>>>>>
>>>>> 1. Store each unique user ID as the cell name within the cube(s) in
>>>>> which it occurs. This has the advantage of 100% accuracy, but the
>>>>> downside is the enormous space required to store each unique cell.
>>>>> Consuming this data is also problematic, as the only way to provide
>>>>> a viewership count is by counting each cell. To save the overhead of
>>>>> sending each cell over the network, the counting could be done by a
>>>>> coprocessor on the region server, but that still doesn't avoid the
>>>>> overhead of reading each cell from disk. I'm also not sure what
>>>>> happens if a single row is larger than an entire region (48 bytes
>>>>> per user ID * 10,000,000 users = 480MB).
>>>>>
>>>>> 2. Store a byte array that allows estimating unique viewers (with a
>>>>> small margin of error*). Add a coprocessor for updating this column
>>>>> so I can guarantee that updates to a specific OLAP cell will be
>>>>> atomic. The main benefit of this path is that the nodes that update
>>>>> HBase can be less complex. Another benefit I see is that I can just
>>>>> add more HBase regions as scale requires. However, I'm not sure if I
>>>>> can use a coprocessor the way I want; can I observe updates to a
>>>>> particular table and replace the provided data with my own? (The
>>>>> client calls "put" with the actual user ID, my co-processor replaces
>>>>> it with a computed value, so the actual user ID never gets stored in
>>>>> HBase.)
>>>>>
>>>>> 3. Store a byte array that allows estimating unique viewers (with a
>>>>> small margin of error*). Re-arrange my architecture so that each
>>>>> OLAP cell is only updated by a single node. The main benefit of this
>>>>> would be that I don't need to worry about atomic operations in
>>>>> HBase, since all updates for a single cell will happen atomically
>>>>> and in serial. The biggest downside is that I believe it would add
>>>>> significant complexity to my overall architecture.
>>>>>
>>>>> Thanks for your time, and I look forward to hearing your thoughts.
>>>>>
>>>>> Sincerely,
>>>>> Tom Brown
>>>>>
>>>>> *(For information about the byte array mentioned in #2 and #3, see:
>>>>> http://highscalability.com/blog/2012/4/5/big-data-counting-how-to-count-a-billion-distinct-objects-us.html)
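The "estimating byte array" behind options #2 and #3 can be as simple as linear counting, one of the techniques covered by the linked article. A minimal sketch under that assumption follows; all names are illustrative and this is pure Java, not an HBase API (a production system would more likely use HyperLogLog, as the article describes):

```java
import java.util.BitSet;

// Minimal linear-counting sketch of the "estimating byte array" idea:
// each viewer ID hashes to one bit of a fixed-size bitmap, so duplicate
// messages for the same viewer set the same bit and are never counted
// twice. The unique count is estimated from the fraction of zero bits.
class LinearCounter {
    private final BitSet bitmap;
    private final int m;                 // bitmap size in bits

    LinearCounter(int m) {
        this.m = m;
        this.bitmap = new BitSet(m);
    }

    void add(String viewerId) {
        // Mix the hash (murmur3 finalizer) so bucket choice is well spread.
        int h = viewerId.hashCode();
        h ^= h >>> 16; h *= 0x85ebca6b;
        h ^= h >>> 13; h *= 0xc2b2ae35;
        h ^= h >>> 16;
        bitmap.set(Math.floorMod(h, m));
    }

    // Linear counting estimator: n ~= -m * ln(V/m), V = zero bits left.
    long estimate() {
        int zeros = m - bitmap.cardinality();
        if (zeros == 0) {
            return m;                    // saturated; m is only a lower bound
        }
        return Math.round(-m * Math.log((double) zeros / m));
    }
}
```

A 65,536-bit bitmap costs 8KB per OLAP cell and stays within a few percent of the true count for a few thousand uniques. Merging or updating such bitmaps is a bitwise OR, which is exactly the kind of update the atomic bitfield operation discussed in this thread would make safe under concurrent writers.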