Mailing-List: contact dev-help@crunch.apache.org; run by ezmlm
Reply-To: dev@crunch.apache.org
From: Josh Wills
Date: Wed, 16 Oct 2013 07:34:18 -0700
Subject: Re: Thoughts on supporting HBase 0.96
To: dev

On Wed, Oct 16, 2013 at 12:15 AM, Gabriel Reid wrote:

> On Wed, Oct 16, 2013 at 8:46 AM, Josh Wills wrote:
>
> > On Tue, Oct 15, 2013 at 11:42 PM, Chao Shi wrote:
> >
> > > I don't understand why needs another PTypeFamily here. I think we can
> > > simply provide some pre-defined PTypes.
> > >
> > > interface HFilePTypes {
> > >   static KEY_VALUE_PTYPE = xxx
> > >   static PUT_PTYPE = xxx
> > > }
> > >
> >
> > Technically, every PType has to provide an implementation of the
> > PTypeFamily getFamily() method-- even if it's just returning a dummy
> > object.
> >
>
> Wouldn't a derived PType (like in o.a.c.types.PTypes) be a better fit here?
>

That was my initial attempt, and in an ideal world, my preferred solution--
but I haven't figured out how to make it work. The question here is: what
do I derive a KeyValue object to? What I really want, for purposes of
reading it/writing it to one of our HBase IO formats, is to map it to
itself, and not some subclass of Writable. Another option might be an
extension of WritableType to handle these special case formats-- I'll take
a crack at getting that to work.

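A minimal sketch of the distinction discussed above, for illustration only.
It does not use the Crunch API: Family, Type, derived() and identity() are
hypothetical stand-ins, and String stands in for KeyValue. It contrasts a
derived type, which maps a value onto some base representation, with a type
that maps a value to itself, and shows that both can report the same dummy
family object.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

public class DerivedTypeSketch {

    // Hypothetical stand-in for a type family; a dummy implementation suffices.
    interface Family {
        String name();
    }

    // Hypothetical stand-in for a PType: converts a value T to/from stored form S.
    interface Type<S, T> {
        T read(S stored);
        S write(T value);
        Family family();
    }

    // Derived type: built from a pair of mapping functions over a base form S.
    static <S, T> Type<S, T> derived(Function<S, T> in, Function<T, S> out, Family family) {
        return new Type<S, T>() {
            @Override public T read(S stored) { return in.apply(stored); }
            @Override public S write(T value) { return out.apply(value); }
            @Override public Family family() { return family; }
        };
    }

    // Identity type: the value is its own stored form (no intermediate mapping).
    static <T> Type<T, T> identity(Family family) {
        return derived(Function.identity(), Function.identity(), family);
    }

    public static void main(String[] args) {
        Family dummy = () -> "dummy";

        // Derived over a byte[] base representation.
        Type<byte[], String> viaBytes = derived(
            b -> new String(b, StandardCharsets.UTF_8),
            s -> s.getBytes(StandardCharsets.UTF_8),
            dummy);

        // Mapped to itself, as wanted for values handed straight to an IO format.
        Type<String, String> asItself = identity(dummy);

        System.out.println(viaBytes.read(viaBytes.write("row1")));
        System.out.println(asItself.write("row1"));
        System.out.println(asItself.family().name());
    }
}
```

In this sketch the identity case is just a derived type whose two mapping
functions do nothing; whether the real PType machinery can express that
for KeyValue is exactly the open question in the reply above.
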
> A whole new PTypeFamily sounds like a lot of work (unless maybe if it was a
> subclass of one of the existing ones), and I think there's still a fair bit
> of code that assumes that Avro & Writable are the only two possible
> PTypeFamily implementations.
>

For any kind of intermediate processing, that is still true. The
HBaseTypeFamily would only ever really appear at the input or output for a
job.

>
> - Gabriel
>

--
Director of Data Science
Cloudera
Twitter: @josh_wills