hbase-user mailing list archives

From Stephen Boesch <java...@gmail.com>
Subject Re: Nested data structures examples for HBase
Date Wed, 10 Sep 2014 07:15:09 GMT
Hi Wilm,

That is actually an interesting option: include the entire JSON path in
the column qualifier (cq).
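
A minimal sketch of the idea, assuming the nested record is available as a plain dictionary and that "," (the separator from the quoted message below) never occurs inside a key. The function name `flatten` is hypothetical and no HBase client calls are made; the resulting dictionary only stands in for the qualifier/value cells of one row in the "data" column family.

```python
def flatten(obj, sep=",", prefix=""):
    """Yield (qualifier, value) pairs, one per leaf of a nested dict."""
    for key, value in obj.items():
        # Build the full path to this key, e.g. "B" + "," + "B2" -> "B,B2".
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into the nested level, carrying the path so far.
            yield from flatten(value, sep, path)
        else:
            yield path, value


# The example structure from the quoted message.
doc = {"A": "A", "B": {"B1": "B1", "B2": "B2"}}

# Each entry would become one cell: qualifier -> value.
cells = dict(flatten(doc))
print(cells)
```

With this layout, reading "B=>B2" means asking for the single qualifier "B,B2", so nothing has to be parsed on the read path.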

2014-09-09 23:17 GMT-07:00 Wilm Schumacher <wilm.schumacher@cawoom.com>:

> As stated above, you can use JSON or something similar, which is always
> possible. However, if you have to do that very often (and I think you
> do, if you are using HBase ;) ), this could be a bad plan, because parsing
> JSON is expensive in terms of CPU.
>
> As I am relatively new to HBase (using it for perhaps a year and not
> using most of the fancy features), perhaps my suggestion is not clever
> ... but why not use HBase directly?
>
> If your structure is something like
>
> {
>         A : "A",
>         B : {
>                 B1 : "B1" ,
>                 B2 : "B2"
>         }
> }
>
> why not use qualifiers like "data:B,B1", where "data" is your column
> family?
>
> Your explanation of your problem seems to fit this idea perfectly, as
> you are not interested in JSON-like behaviour (requesting B => getting
> "{B1: "B1" , B2 : "B2"}"), but in having a defined structure (fixed
> number of layers etc.).
>
> So if you want to query "B=>B2", just add "B,B2" as a qualifier to the
> get request and fire?
>
> This is of course only possible if the queried names are known. If not,
> you have to query the whole column family, which could get very big
> given your requirements below ... but it would still be possible.
>
> However, by using a "," as separator, just as an example, the parsing of
> the object into whatever you need should be very simple. And since you
> stated that you just want to write stuff and query it directly, even
> this cheap parsing shouldn't be required.
>
> This sounds much easier and much cheaper regarding CPU usage to me
> than the JSON, XML, whatever plan.
>
> Did I misunderstand your problem completely? Or does the plan outlined
> above have flaws (as a question to the HBase experts)?
>
> Best wishes,
>
> Wilm
>
> On 08.09.2014 at 23:06, Stephen Boesch wrote:
> > While I am aware that HBase does not have native support for nested
> > structures, surely there are some of you that have thought through this
> > use case carefully.
> >
> > Our particular use case is likely to have single-digit nested layers with
> > tens to hundreds of items in the lists at each level.
> >
> > An example would be:
> >
> >  top level: 300 items
> >  middle level: 1 to 100 items ("1 value" may indicate a single value as
> >  opposed to a list)
> >  third level: 1 to 50 items
> >  fourth level: 1 to 20 items
> >
> > The column names are likely known ahead of time, which may or may not
> > matter for HBase. We could model the above structure in a Parquet file
> > or in Hive (with nested structs), but we would like to consider whether
> > HBase might also be an option.
> >
>
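
The two read patterns discussed in the quoted message (fetching everything under a known path versus reading the whole column family and rebuilding the object) can be sketched as follows. This is only an illustration under assumptions: `cells` is a plain dictionary standing in for the qualifier/value pairs returned for one row, `subtree` and `unflatten` are hypothetical helper names, and "," is assumed never to appear inside a key.

```python
def subtree(cells, path, sep=","):
    """Return the cells below `path`, with the path prefix stripped."""
    prefix = path + sep
    return {q[len(prefix):]: v for q, v in cells.items() if q.startswith(prefix)}


def unflatten(cells, sep=","):
    """Rebuild the nested dict from qualifier -> value pairs."""
    root = {}
    for qualifier, value in cells.items():
        # "B,B2" -> parents ["B"], leaf "B2"
        *parents, leaf = qualifier.split(sep)
        node = root
        for name in parents:
            # Create intermediate levels on demand.
            node = node.setdefault(name, {})
        node[leaf] = value
    return root


# Stand-in for the cells of one row in the "data" column family.
cells = {"A": "A", "B,B1": "B1", "B,B2": "B2"}

print(subtree(cells, "B"))
print(unflatten(cells))
```

Splitting on a single separator character is the only parsing involved, which is the cost being compared against decoding a JSON document on every read.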
