From: Ted
Subject: Re: Review request for HBASE-7692: Ordered byte[] serialization
Date: Fri, 22 Feb 2013 06:21:15 -0800
To: "dev@hbase.apache.org"
Cc: "dev@hbase.apache.org"

Elliot is working on making the hbase-client module concrete in HBASE-7012.

Cheers

On Feb 22, 2013, at 6:13 AM, Nick Dimiduk wrote:

> You're absolutely correct: this library introduces client-side conventions
> and is not needed from within the HMaster or RegionServer. Is
> the consensus that it should reside in its own module or be a sibling to
> the o.a.h.hbase.client source tree? I'm a little confused by the current
> state of the modules; hbase-client looks empty while o.a.h.hbase.client
> sits under hbase-server.
>
> Thanks,
> Nick
>
> On Thu, Feb 21, 2013 at 11:56 PM, Jonathan Hsieh wrote:
>
>> So I buy the argument about this being included in hbase, but several of
>> the questions still stand --
>>
>> Why is this part of hbase-common? Shouldn't this be just a dependency of
>> the hbase-client module? Does the hbase-server side need to depend on this?
>>
>> Since this is a large import of a currently isolated library, why not make
>> it a separate module instead of part of hbase-common? This would enforce a
>> boundary that will prevent pollution from circular dependencies.
>>
>> Jon.
>>
>> On Thu, Feb 21, 2013 at 7:23 PM, Enis Söztutar wrote:
>>
>>> I think this belongs in core HBase, as a replacement to Bytes, which should
>>> be deprecated eventually. We have a Bytes utility which is supposed to
>>> convert basic java types to byte[]'s, but it does not work for signed
>>> numbers.
>>>
>>> We already know that all of the clients, Hive, Pig, Phoenix, have to have
>>> at least java type -> byte[] conversion utilities, and I think it is
>>> HBase's job to supply one so that different clients can interoperate. Since
>>> internally we are also relying on serializing java types, we need that
>>> library in the core.
>>>
>>> BTW, I also think that we need to have a SQL-type to java type to byte[]
>>> layer, but that is another discussion.
>>>
>>> Enis
>>>
>>>
>>> On Thu, Feb 21, 2013 at 3:04 PM, Jonathan Hsieh wrote:
>>>
>>>> Nick,
>>>>
>>>> While I believe having an order-preserving canonical serialization is a
>>>> good idea, from doing a read of the mail and a skim of the jira it is not
>>>> clear to me why this is inside hbase as part of hbase-common.
>>>>
>>>> Why isn't this part of a library on top of hbase (a dependency for
>>>> Pig/Hive) instead of "inside" hbase?
>>>> Can't this functionality be done just from the client level?
>>>> What's the end goal here? Is the goal here to replace the Bytes.toBytes(*)
>>>> methods to enforce the ordering?
>>>> If HBase has two mutually incompatible encodings "built-in", how does a
>>>> dev know to use one or the other later on?
>>>> If this is essentially a mega import of a library (300k.. yikes), why not
>>>> make it a separate module instead of part of common?
>>>>
>>>> Jon.
>>>>
>>>> On Thu, Feb 21, 2013 at 10:35 AM, Nick Dimiduk wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> I'm of the opinion that HBase should provide a mechanism for serializing
>>>>> common java types such that the serialized format sorts according to
>>>>> the natural ordering of the type. I think many application efforts end up
>>>>> building a custom, partial implementation of this kind of functionality on
>>>>> their own. I think HBase should provide a canonical implementation of such
>>>>> a serialization format so that third-parties can reliably build on top of
>>>>> HBase. Not just user applications, but other tools like Pig and Hive are
>>>>> also enabled. Implementations for HIVE-3634, HIVE-2599, or HIVE-2903
>>>>> could be compatible with similar features in Pig.
>>>>>
>>>>> After implementing something similar on multiple occasions, I stumbled
>>>>> across the Orderly library. It also appears to have been adopted by other
>>>>> large projects, including Lily.
>>>>> I've engaged the library's author for some improvements only to find out
>>>>> he's now at Google and will no longer be maintaining it. Thus, I propose we
>>>>> take it into HBase.
>>>>>
>>>>> HBASE-7692 includes a
>>>>> patch that introduces Orderly into hbase-common under the orderly
>>>>> namespace. I have an associated branch on
>>>>> GitHub <https://github.com/ndimiduk/hbase/commits/7692-ordered-serialization>
>>>>> wherein I've broken the patch out into multiple commits to ease review.
>>>>> Please take a few minutes to give it a look.
>>>>>
>>>>> Thanks,
>>>>> Nick
>>>>
>>>>
>>>>
>>>> --
>>>> // Jonathan Hsieh (shay)
>>>> // Software Engineer, Cloudera
>>>> // jon@cloudera.com
>>
>>
>>
>> --
>> // Jonathan Hsieh (shay)
>> // Software Engineer, Cloudera
>> // jon@cloudera.com
>>
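
For readers following the thread, the remark above that a plain java type -> byte[] conversion "does not work for signed numbers" can be illustrated with a small sketch. It assumes a plain big-endian two's-complement layout for the unordered encoding and uses the common sign-bit-flip trick for the ordered one; the class and method names (OrderedIntDemo, plain, ordered, decodeOrdered, compareUnsigned) are hypothetical and are not taken from Orderly or from the HBASE-7692 patch, whose actual wire format may differ.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class; not part of HBase or Orderly.
final class OrderedIntDemo {

    // Plain big-endian two's-complement encoding of an int (assumed layout).
    static byte[] plain(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Order-preserving variant: flip the sign bit so that negative values
    // get a leading byte below 0x80 and non-negative values one at or above it.
    static byte[] ordered(int v) {
        return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
    }

    // Inverse of ordered(): flip the sign bit back.
    static int decodeOrdered(byte[] b) {
        return ByteBuffer.wrap(b).getInt() ^ Integer.MIN_VALUE;
    }

    // Unsigned lexicographic comparison, which is how byte[] row keys sort.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = a[i] & 0xff;
            int y = b[i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Plain encoding: -1 is FF FF FF FF, so it sorts after 1.
        System.out.println(compareUnsigned(plain(-1), plain(1)) > 0);
        // Sign-flipped encoding: byte order now matches numeric order.
        System.out.println(compareUnsigned(ordered(-1), ordered(1)) < 0);
    }
}
```

The sign-bit flip covers only fixed-width integers. Floating-point values, variable-length strings, and descending order each need their own treatment, which is part of why a shared library is being proposed in the thread instead of per-application helpers.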