Subject: Re: Nested data structures examples for HBase
From: Stephen Boesch
To: user
Date: Wed, 10 Sep 2014 12:40:38 -0700

Thanks Sean. We have some internal requirements that will most likely
require us to stick with the native HBase APIs. But the suggestion is still
appreciated - I was not aware of that project.

2014-09-10 12:09 GMT-07:00 Sean Busbey:

> Hi Stephen!
>
> Have you taken a look at Apache Gora? It uses Avro for its data model,
> which supports nested data structures, and can store in a variety of
> backing stores, including HBase.
>
> -Sean
>
> On Tue, Sep 9, 2014 at 4:20 PM, Stephen Boesch wrote:
>
> > Thanks Michael, yes cells are byte[]; therefore, storing JSON or other
> > document structures is always possible. Our use cases include querying
> > individual elements in the structure - so that would require
> > reconstituting the documents and then parsing them for every row. We
> > probably are not headed in the direction of HBase for those use cases,
> > but we are trying to make that determination after having carefully
> > considered the extent of the mismatch.
> >
> > 2014-09-09 13:37 GMT-07:00 Michael Segel:
> >
> > > You do realize that everything you store in HBase is a byte array,
> > > right? That is, each cell is a blob.
> > >
> > > So you have the ability to create nested structures like… JSON
> > > records? ;-)
> > >
> > > So to your point. You can have a column A which represents a set of
> > > values.
> > >
> > > This is one reason why you shouldn’t think of HBase in terms of
> > > being relational. In fact, for Hadoop, you really don’t want to
> > > think in terms of relational structures.
> > > Think more of hierarchical.
> > >
> > > So yes, you can do what you want to do…
> > >
> > > HTH
> > >
> > > -Mike
> > >
> > > On Sep 8, 2014, at 10:06 PM, Stephen Boesch wrote:
> > >
> > > > While I am aware that HBase does not have native support for
> > > > nested structures, surely some of you have thought through this
> > > > use case carefully.
> > > >
> > > > Our particular use case likely has a single-digit number of
> > > > nested layers, with tens to hundreds of items in the lists at
> > > > each level.
> > > >
> > > > An example would be:
> > > >
> > > >   top level: 300 items
> > > >   middle level: 1 to 100 items ("1 value" may indicate a single
> > > >     value as opposed to a list)
> > > >   third level: 1 to 50 items
> > > >   fourth level: 1 to 20 items
> > > >
> > > > The column names are likely known ahead of time, which may or may
> > > > not matter for HBase. We could model the above structure in a
> > > > Parquet file or in Hive (with nested structs), but we would like
> > > > to consider whether HBase might also be an option.
>
> --
> Sean
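
For illustration only, here is a minimal sketch of one way the hierarchy
described in this thread could be laid out over plain byte[] cells without
storing a whole document per cell: each leaf value gets its own column
qualifier built from its path. Nothing here comes from the thread itself.
The "." separator, the four-digit list index, the JSON encoding of leaf
values, and the names flatten and select are all assumptions, and the code
is plain Python with no HBase client involved. It only shows the key
layout; an element lookup then becomes a qualifier-prefix match rather than
a parse of the full document.

```python
import json
from typing import Any, Dict, Iterator, Tuple

SEP = "."  # hypothetical path separator; assumed never to occur in field names


def flatten(node: Any, prefix: str = "") -> Iterator[Tuple[bytes, bytes]]:
    """Yield one (qualifier, value) byte pair per leaf of a nested structure.

    Empty dicts and lists produce no cells in this sketch.
    """
    if isinstance(node, dict):
        for key, child in node.items():
            path = f"{prefix}{SEP}{key}" if prefix else str(key)
            yield from flatten(child, path)
    elif isinstance(node, list):
        for index, child in enumerate(node):
            # Zero-pad the index so byte-wise ordering matches list order.
            part = f"{index:04d}"
            path = f"{prefix}{SEP}{part}" if prefix else part
            yield from flatten(child, path)
    else:
        # Leaf: the value is just bytes; JSON is an arbitrary encoding choice.
        yield prefix.encode("utf-8"), json.dumps(node).encode("utf-8")


def select(cells: Dict[bytes, bytes], path: str) -> Dict[bytes, bytes]:
    """Return the cells at or under a path, as a qualifier-prefix match would."""
    exact = path.encode("utf-8")
    under = exact + SEP.encode("utf-8")
    return {q: v for q, v in cells.items() if q == exact or q.startswith(under)}


# A tiny row: "mid" is a list in one item and a single value in the other.
row = {"top": [{"name": "a", "mid": [1, 2]}, {"name": "b", "mid": 7}]}
cells = dict(flatten(row))
for qualifier in sorted(select(cells, "top.0000.mid")):
    print(qualifier.decode("utf-8"), cells[qualifier].decode("utf-8"))
```

The trade-off in this layout is more and longer qualifiers per row in
exchange for reading a single element or subtree without reconstituting
the document. Whether that fits the sizes given above (300 top-level items
with up to 100, 50, and 20 items below) would need to be measured.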