hbase-user mailing list archives

From llpind <sonny_h...@hotmail.com>
Subject Re: Frequent changing rowkey - HBase insert
Date Mon, 08 Jun 2009 17:22:48 GMT

Hi Erik,

Yes, that sounds good.  Those are the type of calls I was looking for in the API.


Erik Holstad wrote:
> 
> Hi llpind!
> 
> On Mon, Jun 8, 2009 at 8:45 AM, llpind <sonny_heer@hotmail.com> wrote:
> 
>>
>> The insert works well when I have a row key that stays constant for a
>> long period of time, and I can split it up into blocks.  But when the row
>> key changes often, insert performance starts to suffer over time.  The
>> suggestion made by Ryan does help, and I was eventually able to get the
>> entire data set into HBase (~120 million records).
>>
>> I am currently working on some analysis and had a question about the Java
>> API.  Is there a way to get a record count given a row key?  Something
>> like: long getColumnCount(rowkey), so that it doesn't bring any data down
>> to the client but simply returns the size?
>>
>>
> We have been talking about something similar to this for scanners: a call
> that just counts the number of rows between a start and a stop row and
> doesn't return any data.
> So that would make it 4 calls, if I'm not mistaken:
> countRows(Scan scan)
> countFamilies(List<byte[]> families)
> countQualifiers(byte [] family)
> countVersions(byte[] family, byte[] qualifier, long minTime, long maxTime)
> 
> or maybe just keep it simple and use:
> countRows(Scan scan)
> countFamilies(Get get)
> countQualifiers(Get get)
> countVersions(Get get)
> 
> We talked about just having a special serializer that doesn't return any
> data, just the count.
> 
> How does that sound to you?
> 
> Erik
> 
> 
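
To make the semantics of the four proposed count calls concrete, here is a minimal sketch in plain Java. It is not the HBase client API and does not touch a real cluster: CountSketch, put, sample and the String keys are all hypothetical stand-ins, and the nested sorted map (row -> family -> qualifier -> timestamp -> value) is only an in-memory model of a table. The sketch assumes the start row and minimum timestamp are inclusive and the stop row and maximum timestamp are exclusive. Each method returns only a number and never hands cell values back to the caller, which is the point of the proposal.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a table; NOT the HBase client API.
// Layout: row -> family -> qualifier -> timestamp -> value.
class CountSketch {
    private final NavigableMap<String,
            NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>>> rows =
            new TreeMap<>();

    // Store one cell version, creating the nested maps on demand.
    void put(String row, String family, String qualifier, long ts, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            .computeIfAbsent(qualifier, q -> new TreeMap<>())
            .put(ts, value);
    }

    // Number of rows in [startRow, stopRow); assumes startRow <= stopRow.
    long countRows(String startRow, String stopRow) {
        return rows.subMap(startRow, true, stopRow, false).size();
    }

    // Number of column families present in one row (0 if the row is absent).
    long countFamilies(String row) {
        NavigableMap<String, ?> families = rows.get(row);
        return families == null ? 0 : families.size();
    }

    // Number of qualifiers in one family of one row.
    long countQualifiers(String row, String family) {
        var families = rows.get(row);
        if (families == null) return 0;
        var qualifiers = families.get(family);
        return qualifiers == null ? 0 : qualifiers.size();
    }

    // Number of versions of one cell with timestamp in [minTime, maxTime).
    long countVersions(String row, String family, String qualifier,
                       long minTime, long maxTime) {
        var families = rows.get(row);
        if (families == null) return 0;
        var qualifiers = families.get(family);
        if (qualifiers == null) return 0;
        var versions = qualifiers.get(qualifier);
        return versions == null ? 0 : versions.subMap(minTime, true, maxTime, false).size();
    }

    // Small made-up data set used only for illustration.
    static CountSketch sample() {
        CountSketch t = new CountSketch();
        t.put("row1", "cf1", "a", 1, "v1");
        t.put("row1", "cf1", "a", 2, "v2");
        t.put("row1", "cf1", "b", 1, "v1");
        t.put("row1", "cf2", "c", 5, "v1");
        t.put("row2", "cf1", "a", 3, "v1");
        t.put("row3", "cf1", "a", 4, "v1");
        return t;
    }

    public static void main(String[] args) {
        CountSketch t = sample();
        System.out.println("rows in [row1, row3): " + t.countRows("row1", "row3"));
        System.out.println("families in row1: " + t.countFamilies("row1"));
        System.out.println("qualifiers in row1/cf1: " + t.countQualifiers("row1", "cf1"));
        System.out.println("versions of row1/cf1:a in [1, 3): "
                + t.countVersions("row1", "cf1", "a", 1, 3));
    }
}
```

In a real deployment the counting would have to run on the server side for the call to avoid shipping data to the client; the sketch only shows what each count means, not how it would be transported.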

-- 
View this message in context: http://www.nabble.com/Frequent-changing-rowkey---HBase-insert-tp23906724p23928539.html
Sent from the HBase User mailing list archive at Nabble.com.

