From: Jonathan Hsieh
Date: Thu, 21 Feb 2013 23:56:18 -0800
Subject: Re: Review request for HBASE-7692: Ordered byte[] serialization
To: dev@hbase.apache.org

So I buy the argument about this being included in HBase, but several of
the questions still stand:

Why is this part of hbase-common? Shouldn't this be just a dependency of
the hbase-client module? Does the hbase-server side need to depend on
this?

Since this is a large import of a currently isolated library, why not
make it a separate module instead of part of hbase-common? This would
enforce a boundary that prevents pollution from circular dependencies.

Jon.

On Thu, Feb 21, 2013 at 7:23 PM, Enis Söztutar wrote:

> I think this belongs in core HBase, as a replacement for Bytes, which
> should be deprecated eventually. We have a Bytes utility which is
> supposed to convert basic Java types to byte[]s, but it does not work
> for signed numbers.
>
> We already know that all of the clients (Hive, Pig, Phoenix) have to
> have at least Java type -> byte[] conversion utilities, and I think it
> is HBase's job to supply one so that different clients can
> interoperate. Since internally we are also relying on serializing Java
> types, we need that library in the core.
>
> BTW, I also think that we need to have a SQL type to Java type to
> byte[] layer, but that is another discussion.
>
> Enis
>
>
> On Thu, Feb 21, 2013 at 3:04 PM, Jonathan Hsieh wrote:
>
> > Nick,
> >
> > While I believe having an order-preserving canonical serialization
> > is a good idea, from a read of the mail and a skim of the JIRA it is
> > not clear to me why this is inside HBase as part of hbase-common.
> >
> > Why isn't this part of a library on top of HBase (a dependency for
> > Pig/Hive) instead of "inside" HBase?
> > Can't this functionality be done just at the client level?
> > What's the end goal here? Is the goal to replace the
> > Bytes.toBytes(*) methods to enforce the ordering?
> > If HBase has two mutually incompatible encodings "built in", how
> > does a dev know to use one or the other later on?
> > If this is essentially a mega import of a library (300k... yikes),
> > why not make it a separate module instead of part of common?
> >
> > Jon.
> >
> > On Thu, Feb 21, 2013 at 10:35 AM, Nick Dimiduk wrote:
> >
> > > Hi everyone,
> > >
> > > I'm of the opinion that HBase should provide a mechanism for
> > > serializing common Java types such that the serialized format
> > > sorts according to the natural ordering of the type. I think many
> > > application efforts end up building a custom, partial
> > > implementation of this kind of functionality on their own. I think
> > > HBase should provide a canonical implementation of such a
> > > serialization format so that third parties can reliably build on
> > > top of HBase. This enables not just user applications but also
> > > other tools like Pig and Hive. Implementations for HIVE-3634,
> > > HIVE-2599, or HIVE-2903 could be compatible with similar features
> > > in Pig.
> > >
> > > After implementing something similar on multiple occasions, I
> > > stumbled across the Orderly library. It also appears to have been
> > > adopted by other large projects, including Lily. I've engaged the
> > > library's author for some improvements only to find out he's now
> > > at Google and will no longer be maintaining it. Thus, I propose we
> > > take it into HBase.
> > >
> > > HBASE-7692 includes a patch that introduces Orderly into
> > > hbase-common under the orderly namespace. I have an associated
> > > branch on GitHub
> > > <https://github.com/ndimiduk/hbase/commits/7692-ordered-serialization>
> > > wherein I've broken the patch out into multiple commits to ease
> > > review. Please take a few minutes to give it a look.
> > >
> > > Thanks,
> > > Nick
> >
> > --
> > // Jonathan Hsieh (shay)
> > // Software Engineer, Cloudera
> > // jon@cloudera.com

--
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com