hbase-user mailing list archives

From Gary Helmling <ghelml...@gmail.com>
Subject Re: Strore list
Date Tue, 10 Nov 2009 15:53:42 GMT
Julien,

In the first case (with a dedicated column family), if you are just storing
a list and not a map, and if each value is less than 32k, you could even use
the format:

list:value1 => null
list:value2 => null
list:value3 => null

In this case you would lose the original ordering of the list (the entries
would come back ordered lexicographically by value), but you could easily
remove entries if you know what values are being removed.
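
To make the trade-off concrete, here is a minimal Python sketch. It is an in-memory model only, not the HBase client API: a plain dict stands in for the "list:" family of one row, an empty byte string stands in for the null cell value, and the helper names (put_value, delete_value, read_list) are made up for illustration.

```python
# In-memory model of one row's "list:" column family (NOT the HBase client API).
# Assumption: columns come back sorted by qualifier bytes, which sorted() on
# bytes objects mimics.

def put_value(family: dict, value: bytes) -> None:
    """Store the list entry as the qualifier, with an empty cell value."""
    family[value] = b""

def delete_value(family: dict, value: bytes) -> None:
    """Remove one entry by its value; the rest of the list is never read."""
    family.pop(value, None)

def read_list(family: dict) -> list:
    """Return entries lexicographically by qualifier bytes, not insertion order."""
    return sorted(family)

row = {}
for v in [b"pear", b"apple", b"fig"]:  # original list order
    put_value(row, v)

delete_value(row, b"fig")
entries = read_list(row)  # original ordering is lost
```

Note that numeric strings sort as bytes too, so b"10" comes back before b"9" unless the values are zero-padded.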

As J-D mentions, there are many options in the second case for serializing
your list to a byte[] to store.
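
As one possible sketch of that second case, the Python below serializes a list to bytes with JSON, one of the formats J-D listed. The function names are hypothetical, and the `cell` variable simply stands in for the byte[] you would store under something like default:some_list.

```python
import json

def encode_list(values: list) -> bytes:
    """Serialize the whole list into a single cell value."""
    return json.dumps(values).encode("utf-8")

def decode_list(cell: bytes) -> list:
    """Deserialize the cell value back into a list, preserving order."""
    return json.loads(cell.decode("utf-8"))

cell = encode_list(["value1", "value2", "value3"])

# Changing one entry means reading, modifying, and rewriting the whole cell.
values = decode_list(cell)
values.remove("value2")
cell = encode_list(values)
```

Unlike the qualifier-per-value layout, this keeps the original ordering, at the cost of rewriting the full list on every change.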

If you describe a bit more how you plan on using this (what kind of data is
in the list and how is it being used), it might help us to offer more
targeted advice.

--gh


On Tue, Nov 10, 2009 at 10:30 AM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:

> I'm not sure exactly what you described here.
>
> In the first case I meant that a column family named "list:" could
> hold values in this way:
>
> list:key1 => value1
> list:key2 => value2
> list:key3 => value3
> list:key4 => value4
>
> In the second case let's say you have a family called "default:" and
> you would store the list like this:
>
> default:some_list => { value1, value2, value3, value4 }
>
> Here I didn't use keys because you said you wanted to store a list and
> not a map.
>
> The first case lets you have atomic access to each value, so you can
> update any value by using its qualifier. The second case lets you grab
> the data all at once, but you have to save a new list every time you
> change something. The best solution for you will depend on your
> requirements.
>
> J-D
>
>
> On Tue, Nov 10, 2009 at 7:07 AM, Julien Ruchaud
> <julien.ruchaud@codelutin.com> wrote:
> > On Tue, 10 Nov 2009 06:50:46 -0800,
> > Jean-Daniel Cryans <jdcryans@apache.org> wrote:
> >
> >> Julien,
> >>
> >> Depends on your usage pattern. For example, a list with 100,000+ values
> >> where you only access one at a time (like to do joins) should be
> >> stored in a family where the qualifier is the key and the value is
> >> either the key again or something more useful.
> >>
> >> If you have small lists that you use as a whole, you can consider
> >> serializing them in a single cell with something like JSON, YAML,
> >> protobuf, etc.
> >
> > In fact, I may need both cases. My list can contain 3 or 1,000,000
> > values...
> >
> > What would be the best compromise?
> >
> > In the first case, do you mean storing values like this?
> >
> > a) Is something like this possible?
> > key = value 1
> > key = value 2
> >
> > b) Difficult to maintain (update, delete an element, ...)
> > key[1] = value 1
> > key[2] = value 2
> >
> > c) ??
> > key1 = value 1, key2
> > key2 = value 2
> >
> > d) other ...
> >
> > Julien
> >
> >>
> >> J-D
> >>
> >> On Tue, Nov 10, 2009 at 1:41 AM, Julien Ruchaud
> >> <julien.ruchaud@codelutin.com> wrote:
> >> > Hi all,
> >> >
> >> > How do I store a list in HBase? Do I have to serialize the list to
> >> > bytes before storing it, or should I put multiple values in a column?
> >> > What is the best way?
> >> >
> >> > Thanks,
> >> >
> >> > Julien
> >> >
> >
>
