hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: About HBase Files
Date Wed, 23 Sep 2009 05:48:49 GMT
On Tue, Sep 22, 2009 at 10:10 PM, stchu <stchu.cloud@gmail.com> wrote:

> Hi Stack and Erik,
>
> Thanks for your answers. I think the timestamp is also contained in the
> mapfiles (in binary format?), am I right?
>
Yes, it's a serialized long.
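A minimal sketch of what "a serialized long" looks like on disk, assuming the timestamp is written the way Java's DataOutput.writeLong writes a long (8 bytes, big-endian). That assumption is not confirmed in this thread. The timestamp value is the one from the shell output quoted below.

```python
import struct

# Timestamp taken from the shell output quoted in this thread.
timestamp = 2009091613240001

# Assumption: the long is laid out as Java's DataOutput.writeLong
# would write it: 8 bytes, big-endian, signed.
encoded = struct.pack(">q", timestamp)

print(len(encoded))   # 8
print(encoded.hex())  # 0007234217edaac1

# Decoding reverses the step.
(decoded,) = struct.unpack(">q", encoded)
print(decoded == timestamp)  # True
```

Bytes 0x23 and 0x42 are the ASCII characters "#" and "B", which would explain the "#B" that shows up right after the column name in the raw dump quoted below. The other six bytes are not printable text, so they appear as junk.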



> Hfile looks better. I will migrate my program to hadoop 0.20 and hbase 0.20
> after I finish my experiments in 0.19.
> But it will take some effort because of the incompatible APIs... :P
>
Well, the old APIs are still in place, just deprecated, so hopefully you
shouldn't have to migrate anything.

Go easy,
St.Ack




> stchu
>
>
> 2009/9/23 stack <stack@duboce.net>
>
> > Yes, what Erik said.  MapFile is a binary format.  What you are seeing is
> > some preamble up front listing the key and value class types plus some
> > miscellaneous metadata.  After that, each key and value is a serialized
> > Writable type.
> >
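A rough sketch of reading the preamble described above, assuming the file starts with the magic bytes "SEQ", then one version byte, then the key and value class names, each prefixed by a one-byte length. The one-byte length only holds for names shorter than 128 bytes. The helper names and the version value 6 in the sample are assumptions for illustration, and the sample header is built by hand, not read from a real file.

```python
import io


def read_short_string(stream):
    """Read a UTF-8 string prefixed by a one-byte length (lengths under 128 only)."""
    length = stream.read(1)[0]
    return stream.read(length).decode("utf-8")


def read_header(stream):
    """Return (version, key class name, value class name) from the preamble."""
    magic = stream.read(3)
    if magic != b"SEQ":
        raise ValueError("missing SEQ magic bytes")
    version = stream.read(1)[0]
    key_class = read_short_string(stream)
    value_class = read_short_string(stream)
    return version, key_class, value_class


def with_length(name):
    """Prefix a class name with its one-byte length (hypothetical helper)."""
    data = name.encode("utf-8")
    return bytes([len(data)]) + data


# Hand-built sample; the version byte (6) is an assumption.
sample = (
    b"SEQ"
    + bytes([6])
    + with_length("org.apache.hadoop.hbase.HStoreKey")
    + with_length("org.apache.hadoop.hbase.io.ImmutableBytesWritable")
)

version, key_class, value_class = read_header(io.BytesIO(sample))
print(key_class)    # org.apache.hadoop.hbase.HStoreKey
print(value_class)  # org.apache.hadoop.hbase.io.ImmutableBytesWritable
```

The key class name is 33 bytes long and the value class name is 49 bytes long. Those lengths are the ASCII characters "!" and "1", which would account for the "!" before org.apache.hadoop.hbase.HStoreKey and the "1" before org.apache.hadoop.hbase.io.ImmutableBytesWritable in the dump quoted below.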
> > Move to hbase 0.20.0. It uses hfile instead of mapfile.  There is a nice
> > little utility that does a toString on the hfile binary serializations
> > that prints prettier than the below.
> >
> > St.Ack
> >
> >
> > On Tue, Sep 22, 2009 at 3:10 AM, stchu <stchu.cloud@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I use Hadoop 0.19.1 and HBase 0.19.3.
> > > I created a simple table which has 2 column families (Level0:trail_id,
> > > Level1:trail_id).
> > > And I put the data (4 rows) into the hbase table:
> > > 120_25      column=Level0:trail_id, timestamp=2009091613240001, value=39999;21234
> > > 121.1_23.4  column=Level1:trail_id, timestamp=2009091613240001, value=50001;00048;111110
> > > 121.1_25.0  column=Level1:trail_id, timestamp=2009091613240001, value=39999;21234
> > > 121_25      column=Level0:trail_id, timestamp=2009091613240003, value=39999;21234;000001;000003
> > >
> > >
> > > I find the content of files in HDFS is:
> > >
> > > for the mapfile Level0:
> > > SEQ
> > >
> >
> !org.apache.hadoop.hbase.HStoreKey1org.apache.hadoop.hbase.io.ImmutableBytesWritable�������h
> > > =
> > > �p{9
> > > ��1������.���  120_25 Level0:trail_id� #B �����
39999;21234���<���
> >  121_25
> > > Level0:trail_id� #B ����� 39999;21234;000001;000003
> > >
> > > for the mapfile Level1:
> > > SEQ
> > >
> >
> !org.apache.hadoop.hbase.HStoreKey1org.apache.hadoop.hbase.io.ImmutableBytesWritable�������>T�
> > > �4�q-�� ��.���9���#
> > > 121.1_23.4 Level1:trail_id� #B ����� 50001;00048;111110���2���#
> > > 121.1_25.0 Level1:trail_id� #B ����� 39999;21234
> > >
> > >
> > > What does the messy code mean? Is it the "offset" and/or the
> > > "timestamps"?
> > > Besides, since hbase stores a separate mapfile per column family, why do
> > > we need to save it (in this case: Level0 and Level1)?
> > >
> > > I would appreciate any help or guidance.
> > >
> > > stchu
> > >
> >
>
