References: <4D877FD5.5010501@apache.org>
Date: Tue, 22 Mar 2011 11:31:40 -0500
Subject: Re: File formats in Hadoop
From: Weishung Chung
To: Vivek Krishna
Cc: user@hbase.apache.org, common-user@hadoop.apache.org, qwertymaniac@gmail.com, Doug Cutting
Content-Type: multipart/alternative; boundary=0016364ef9946fc205049f14c7d3

--0016364ef9946fc205049f14c7d3
Content-Type: text/plain; charset=ISO-8859-1

I also found this informative article:
http://cloudepr.blogspot.com/2009/09/hfile-block-indexed-file-format-to.html

Would the key/value pairs look like this? For example, for column family1
with one qualifier (qualifier1) and 2 versions:

key1:   rowkey1 + column family1:qualifier1 + timestamp1
value1: corresponding cell value 1
key2:   rowkey1 + column family1:qualifier1 + timestamp2
value2: corresponding cell value 2
key3:   rowkey2 + column family1:qualifier1 + timestamp1
value3: corresponding cell value 3

On Tue, Mar 22, 2011 at 10:58 AM, Vivek Krishna wrote:

> http://nosql.mypopescu.com/post/3220921756/hbase-internals-hfile-explained
> might help.
>
> Viv
>
> On Tue, Mar 22, 2011 at 11:43 AM, Weishung Chung wrote:
>
>> My fellow superb hbase experts,
>>
>> Looking at the HFile specs and have some questions.
>> How is a particular table cell in a HBase table being represented in the
>> HFile? Does the key of the key value pair represent the rowkey+column
>> family:qualifier+timestamp and the value represent the corresponding cell
>> value? If so, to read a row, multiple key/value pair reads have to be
>> done?
>>
>> Thank you :)
>>
>> On Tue, Mar 22, 2011 at 9:09 AM, Weishung Chung wrote:
>>
>> > Thank you, I will definitely take a look. Also, the TFile spec below
>> > helps me to understand more, what an exciting work!
>> >
>> > https://issues.apache.org/jira/secure/attachment/12396286/TFile+Specification+20081217.pdf
>> >
>> > On Mon, Mar 21, 2011 at 11:41 AM, Doug Cutting wrote:
>> >
>> >> On 03/19/2011 09:01 AM, Weishung Chung wrote:
>> >> > I am browsing through the hadoop.io package and was wondering what
>> >> > other file formats are available in hadoop other than SequenceFile
>> >> > and TFile? Is all data written through hadoop including those from
>> >> > hbase saved in the above formats? It seems like SequenceFile is in
>> >> > key value pair format.
>> >>
>> >> Avro includes a file format that works with Hadoop.
>> >>
>> >> http://avro.apache.org/docs/current/api/java/org/apache/avro/mapred/package-summary.html
>> >>
>> >> Doug

--0016364ef9946fc205049f14c7d3--
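
For anyone reading this thread later, the layout asked about in the message (one key/value entry per cell version, with the key built from rowkey + column family:qualifier + timestamp) can be sketched roughly as below. This is only an illustration in plain Python, not HBase code: CellKey, sort_key and read_row are hypothetical names, the byte values are made up, and the ordering (row, family, qualifier ascending, then newest timestamp first) is my understanding of how HBase sorts cells, simplified to leave out details such as the key type and length fields.

```python
from typing import NamedTuple


class CellKey(NamedTuple):
    """Hypothetical stand-in for the key part of one stored cell version."""
    row: bytes
    family: bytes
    qualifier: bytes
    timestamp: int


def sort_key(k: CellKey):
    # Assumed ordering: row, family, qualifier ascending;
    # timestamp descending so the newest version of a cell comes first.
    return (k.row, k.family, k.qualifier, -k.timestamp)


# The three entries from the example in the message (values are invented).
cells = {
    CellKey(b"rowkey1", b"family1", b"qualifier1", 1): b"value1",
    CellKey(b"rowkey1", b"family1", b"qualifier1", 2): b"value2",
    CellKey(b"rowkey2", b"family1", b"qualifier1", 1): b"value3",
}

# A file of sorted key/value entries, modelled here as a sorted list.
ordered = sorted(cells.items(), key=lambda kv: sort_key(kv[0]))


def read_row(row: bytes):
    """Collect every (key, value) entry that belongs to one row."""
    return [(k, v) for k, v in ordered if k.row == row]


if __name__ == "__main__":
    for key, value in read_row(b"rowkey1"):
        print(key, value)
```

Under these assumptions, reading rowkey1 does touch two separate entries, one per version, which matches the "multiple key/value pair reads" guess in the question. Because the entries for one row sort next to each other, they can be read with a single sequential pass rather than scattered lookups.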