hbase-user mailing list archives

From Jonathan Gray <jg...@facebook.com>
Subject RE: Long vs String for qualifier
Date Mon, 21 Jun 2010 17:12:49 GMT
Can you describe your schema a bit more?  Could you use versioning instead of incrementing
IDs on the qualifiers?

Also, you could consider a composite value, so that id1_asLong holds a single value containing
both val1 and val5 from your example.  You could use any number of serialization strategies
(comma-separated, JSON, Thrift/protobuf, Writable, etc.).
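As a rough illustration of the composite-value idea, here is a minimal sketch that packs two values into one cell value using a length prefix. The class and method names (CompositeValue, pack, unpack) and the length-prefixed layout are assumptions made for this example only; any of the serialization strategies listed above would work equally well.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CompositeValue {
    // Hypothetical layout: [4-byte length of first][first bytes][second bytes].
    public static byte[] pack(byte[] first, byte[] second) {
        return ByteBuffer.allocate(Integer.BYTES + first.length + second.length)
                .putInt(first.length)
                .put(first)
                .put(second)
                .array();
    }

    // Reverses pack(): returns {first, second}.
    public static byte[][] unpack(byte[] packed) {
        ByteBuffer buf = ByteBuffer.wrap(packed);
        byte[] first = new byte[buf.getInt()];
        buf.get(first);
        byte[] second = new byte[buf.remaining()];
        buf.get(second);
        return new byte[][] { first, second };
    }

    public static void main(String[] args) {
        // "val1" and "val5" stand in for the real cell contents.
        byte[] packed = pack("val1".getBytes(StandardCharsets.UTF_8),
                             "val5".getBytes(StandardCharsets.UTF_8));
        byte[][] parts = unpack(packed);
        System.out.println(new String(parts[0], StandardCharsets.UTF_8)
                + " / " + new String(parts[1], StandardCharsets.UTF_8));
    }
}
```

The packed array would be stored as the value under the existing id1_asLong qualifier, so one Get returns both pieces. The trade-off is that updating either piece means rewriting the whole value.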

If you want them as two columns, I would recommend that the things you want to retrieve together
be neighboring.  For example, you might make the qualifiers a composite type of <id_as_long><qf_type>,
so <id1_asLong><0byte> for the existing values and <id1_asLong><1byte>
for status.  That way the two columns for each id are stored next to each other, so reading them
together is as efficient as possible.
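The composite qualifier described above could be sketched roughly as follows. This uses only the JDK; the class name CompositeQualifier, the constants VALUE and STATUS, and the helper qualifier() are hypothetical names for this example. The 8-byte big-endian encoding written by ByteBuffer.putLong is the same layout that HBase's Bytes.toBytes(long) produces, and the unsigned byte-wise comparison mimics how qualifiers are ordered within a row.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class CompositeQualifier {
    // Hypothetical type markers: 0 for the existing value, 1 for status.
    public static final byte VALUE = 0;
    public static final byte STATUS = 1;

    // Qualifier layout: 8-byte big-endian id followed by a 1-byte type marker.
    public static byte[] qualifier(long id, byte type) {
        return ByteBuffer.allocate(Long.BYTES + 1).putLong(id).put(type).array();
    }

    public static void main(String[] args) {
        byte[][] qualifiers = {
            qualifier(2L, STATUS), qualifier(1L, VALUE),
            qualifier(2L, VALUE), qualifier(1L, STATUS)
        };
        // Sort byte-wise, treating each byte as unsigned (lexicographic order).
        Arrays.sort(qualifiers, (a, b) -> Arrays.compareUnsigned(a, b));
        for (byte[] q : qualifiers) {
            ByteBuffer buf = ByteBuffer.wrap(q);
            System.out.println("id=" + buf.getLong() + " type=" + buf.get());
        }
    }
}
```

Running main prints the four qualifiers in the order id=1 type=0, id=1 type=1, id=2 type=0, id=2 type=1, so each id's value and status end up adjacent. Note that this numeric ordering holds for non-negative ids only (negative longs would sort after positive ones under unsigned comparison), and that decimal strings would not sort numerically at all, since "10" sorts before "2".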

JG

> -----Original Message-----
> From: N Kapshoo [mailto:nkapshoo@gmail.com]
> Sent: Monday, June 21, 2010 9:59 AM
> To: hbase-user@hadoop.apache.org
> Subject: Long vs String for qualifier
> 
> I have a 'long' number that I get by using
> HTable.'incrementColumnValue'. This long is used as the qualifier id
> on a columnFamily.
> 
> Now I need to add a prefix 'status' so that I can store another value
> in the same family.
> 
> How should I consider String vs long sorting?
> 
> So right now:
> 
> colFamily: id1_asLong = val1
> colFamily: id2_asLong = val2
> colFamily: id3_asLong = val3
> colFamily: id4_asLong = val4
> 
> and in addition
> 
> colFamily: status_id1_asString = val5
> colFamily: status_id2_asString = val6
> colFamily: status_id3_asString = val7
> colFamily: status_id4_asString = val8
> 
> To make sure that 'id' values are sorted and accessed sequentially,
> should I change my design so that the id1_asLong is stored as
> id1_asString?
> When I do my Get, I always get id1_asLong and status_id1_asString
> together.
> 
> Thanks.
