Date: Tue, 10 Nov 2009 10:53:42 -0500
Subject: Re: Strore list
From: Gary Helmling
To: hbase-user@hadoop.apache.org

Julien,

In the first case (with a dedicated column family), if you are just
storing a list and not a map, and if each value is less than 32k, you
could even use the format:

list:value1 => null
list:value2 => null
list:value3 => null

In this case you would lose the original ordering of the list (the
entries would come back ordered lexicographically by value), but you
could easily remove entries if you know what values are being removed.

As J-D mentions, there are many options in the second case for
serializing your list to a byte[] to store. If you describe a bit more
how you plan on using this (what kind of data is in the list and how is
it being used), it might help us to offer more targeted advice.

--gh

On Tue, Nov 10, 2009 at 10:30 AM, Jean-Daniel Cryans wrote:
> I'm not sure exactly what you described here.
>
> In the first case I meant that a column family named "list:" could
> hold values in this way:
>
> list:key1 => value1
> list:key2 => value2
> list:key3 => value3
> list:key4 => value4
>
> In the second case let's say you have a family called "default:" and
> you would store the list like this:
>
> default:some_list => { value1, value2, value3, value4 }
>
> Here I didn't use keys because you said you wanted to store a list and
> not a map.
>
> The first case lets you have atomic access to each value so you can
> update any value by using the qualifier. The second case lets you grab
> the data all at once but you have to save a new list every time you
> change something. The best solution for you will depend on your
> requirements.
>
> J-D
>
>
> On Tue, Nov 10, 2009 at 7:07 AM, Julien Ruchaud
> wrote:
> > On Tue, 10 Nov 2009 06:50:46 -0800,
> > Jean-Daniel Cryans wrote:
> >
> >> Julien,
> >>
> >> Depends on your usage pattern. For example a list with 100,000+ values
> >> where you only access one at a time (like to do joins) should be
> >> stored in a family where the qualifier is the key and the value is
> >> either again the key or something more useful.
> >>
> >> If you have small lists that you use as a whole, you can consider
> >> serializing them in a single cell with something like JSON, YAML,
> >> protobuf, etc.
> >
> > In fact, I may need both cases. My list can contain 3 or 1,000,000
> > values...
> >
> > What would be the best compromise?
> >
> > In the first case, do you mean storing values like this?
> >
> > a) Is something like this possible?
> > key = value 1
> > key = value 2
> >
> > b) Difficult to maintain (update, delete an element, ...)
> > key[1] = value 1
> > key[2] = value 2
> >
> > c) ??
> > key1 = value 1, key2
> > key2 = value 2
> >
> > d) other ...
> >
> > Julien
> >
> >>
> >> J-D
> >>
> >> On Tue, Nov 10, 2009 at 1:41 AM, Julien Ruchaud
> >> wrote:
> >> > Hi all,
> >> >
> >> > How do I store a list with HBase? Do I have to serialize the list
> >> > to bytes before storing it, or should I put multiple values in a
> >> > column? What is the best way?
> >> >
> >> > Thanks,
> >> >
> >> > Julien
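
The two layouts discussed in this thread can be sketched roughly as follows. This is only an illustration under stated assumptions: the row is modeled as a plain Python dict keyed by "family:qualifier" bytes, and none of the function names below belong to the HBase client API. JSON is picked arbitrarily from the serializers J-D lists, and an empty byte string stands in for the "null" value.

```python
import json

# Hypothetical in-memory stand-in for a single HBase row:
# maps b"family:qualifier" -> value bytes. No HBase client is involved.

def store_as_columns(row, values, family=b"list"):
    """Layout 1: one column per element; the qualifier is the element itself."""
    for v in values:
        row[family + b":" + v] = b""  # empty value stands in for "null"

def remove_from_columns(row, value, family=b"list"):
    """Remove one element without rewriting the others."""
    row.pop(family + b":" + value, None)

def read_columns(row, family=b"list"):
    """Return the elements in byte-wise sorted qualifier order."""
    prefix = family + b":"
    return [k[len(prefix):] for k in sorted(row) if k.startswith(prefix)]

def store_as_cell(row, values, column=b"default:some_list"):
    """Layout 2: the whole list serialized into a single cell."""
    row[column] = json.dumps(values).encode("utf-8")

def read_cell(row, column=b"default:some_list"):
    """Read the whole list back in one go."""
    return json.loads(row[column].decode("utf-8"))

row = {}

# Layout 1: per-element removal is cheap, but insertion order is lost.
store_as_columns(row, [b"pear", b"apple", b"fig"])
remove_from_columns(row, b"pear")
columns_view = read_columns(row)

# Layout 2: order is preserved, but any change rewrites the whole cell.
store_as_cell(row, ["pear", "apple", "fig"])
cell_view = read_cell(row)
```

The trade-off is the one described above: the column-per-element layout allows removing a known value on its own but returns entries sorted by value, while the single-cell layout keeps the original order and requires saving a new list on every change.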