hbase-user mailing list archives

From Eric Yang <eric...@gmail.com>
Subject Re: stargate retrieve multiple version of a cell
Date Sun, 04 Jul 2010 05:30:21 GMT
This is exactly what I need.  Thanks, owe you a beer. :)

regards,
Eric

On Sat, Jul 3, 2010 at 9:34 PM, Jonathan Gray <jgray@facebook.com> wrote:
> You should reuse HTable instances but they are not thread-safe so use one per
> thread.  Check out the HTablePool class.
>
>> -----Original Message-----
>> From: Eric Yang [mailto:eric818@gmail.com]
>> Sent: Saturday, July 03, 2010 9:30 PM
>> To: user@hbase.apache.org
>> Subject: Re: stargate retrieve multiple version of a cell
>>
>> I used the shell to create the table.  This explains why it only
>> stored 3 versions.  I will switch to the Java API to create the
>> tables.  Another question: I am currently sinking all data into the
>> same table for my prototype.  Is there a heavy cost to creating a new
>> instance of HTable?
>>
>> My code may look like this:
>>
>> for (String tableName : tableList) {
>>   List<Put> list = ...;
>>   // a new HTable (and configuration) is created on every iteration
>>   HTable hbase = new HTable(new HBaseConfiguration(), tableName);
>>   hbase.put(list);
>> }
>>
>> Or should I keep HTable instances in a hash map and reuse them later?
>>
>> regards,
>> Eric
>>
>> On Sat, Jul 3, 2010 at 5:43 PM, Jonathan Gray <jgray@facebook.com>
>> wrote:
>> > Have you looked at Scan.setMaxVersions(int)?  Is that what you're
>> > looking for?
>> >
>> > Also, when you create a table, it has a default max of three
>> > versions.  Did you use the Java API or the shell to create your table?
>> >
>> > HColumnDescriptor.setMaxVersions(int) is what you want to set when
>> > you create the table initially.  To keep all versions, use
>> > setMaxVersions(Integer.MAX_VALUE).
>> >
>> > JG
>> >
>> >> -----Original Message-----
>> >> From: Eric Yang [mailto:eric818@gmail.com]
>> >> Sent: Saturday, July 03, 2010 4:19 PM
>> >> To: user@hbase.apache.org
>> >> Subject: Re: stargate retrieve multiple version of a cell
>> >>
>> >> Hi Jonathan,
>> >>
>> >> I am trying to store large time series data.  I am using a row as a
>> >> group for one hour's data.  My row contains 60 timestamps, and each
>> >> timestamp has various cell values.  I am hoping this will produce
>> >> rows that are not too thick and a table that is slightly shorter.  I
>> >> am fine with unordered versions, as long as I get the timestamp when
>> >> data is retrieved for the timestamp range.  When I scan for the cell,
>> >> I only get the most recent three versions of the cell.
>> >>
>> >> This was tested on HBase 0.20.5 and Hadoop 0.20.2.
>> >>
>> >> regards,
>> >> Eric
>> >>
>> >>
>> >>
>> >> On Sat, Jul 3, 2010 at 2:34 PM, Jonathan Gray <jgray@facebook.com>
>> >> wrote:
>> >> > What exactly are you trying to do with the timestamp?  Currently even
>> >> > duplicates are retained and returned, but the order is not guaranteed
>> >> > (though we are working on this).
>> >> >
>> >> > The behavior is related only to time/order of operations, no
>> >> > difference if using different clients (not including behavior from
>> >> > write buffering).
>> >> >
>> >> > JG
>> >> >
>> >> >> -----Original Message-----
>> >> >> From: Eric Yang [mailto:eric818@gmail.com]
>> >> >> Sent: Saturday, July 03, 2010 2:32 PM
>> >> >> To: user@hbase.apache.org
>> >> >> Subject: Re: stargate retrieve multiple version of a cell
>> >> >>
>> >> >> I think I just found the answer to my own question.  It was not
>> >> >> Stargate's problem.  The data was not stored in HBase as I expected
>> >> >> it to be.  This raised a more basic question:
>> >> >>
>> >> >> I am storing data like this:
>> >> >>
>> >> >> Put row1, cf1:c1: 0, timestamp: 10
>> >> >> Put row1, cf1:c2: 10, timestamp: 10
>> >> >> Put row1, cf1:c2: 15, timestamp: 20
>> >> >> Put row1, cf1:c1: 1, timestamp: 20
>> >> >>
>> >> >> I am updating individual columns by timestamp, and I repeat this
>> >> >> 60 times for each of the columns.  This is all executed by the same
>> >> >> client.  When I scan for "row1, c2", would I get 60 different values,
>> >> >> one for each timestamp?
>> >> >>
>> >> >> What would happen if this kind of update were applied by different
>> >> >> HBase clients?
>> >> >>
>> >> >> regards,
>> >> >> Eric
>> >> >>
>> >> >> On Sat, Jul 3, 2010 at 1:56 PM, Eric Yang <eric818@gmail.com> wrote:
>> >> >> > Hi all,
>> >> >> >
>> >> >> > I am trying to use stargate to get multiple versions of the cell, and
>> >> >> > my query looks like this:
>> >> >> >
>> >> >> > http://localhost:9090/chukwa/1278180000000-Eric-Yangs-iMac.local/Hadoop_dfs_namenode:CreateFileOps/1278183540000/1278189900000
>> >> >> >
>> >> >> > table name: chukwa
>> >> >> > row: 1278187200000-Eric-Yangs-iMac.local
>> >> >> > column: Hadoop_dfs_namenode:CreateFileOps
>> >> >> > start-timestamp: 1278183540000
>> >> >> > end-timestamp: 1278189900000
>> >> >> >
>> >> >> > It only shows me the most recent 3 versions, but not all the versions
>> >> >> > in this time range.  Is this the right syntax?  What am I doing wrong?
>> >> >> > Thanks
>> >> >> >
>> >> >> > regards,
>> >> >> > Eric
>> >> >> >
>> >> >
>> >
>
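
The advice at the top of the thread (reuse HTable instances, but keep one per thread) can be sketched in plain Java. This is only an illustration that runs without HBase on the classpath: PerThreadCache and its factory argument are hypothetical names, not part of the HBase API. In real code the factory would presumably construct an HTable for the given table name, or HTablePool could be used instead, as suggested above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Hypothetical helper: caches one handle per table name per thread, so a
 * handle is reused by its own thread but never shared with another thread.
 */
final class PerThreadCache<T> {
    // Creates a handle for a table name (in HBase code, an HTable; assumption).
    private final Function<String, T> factory;

    // Each thread lazily gets its own private map of tableName -> handle.
    private final ThreadLocal<Map<String, T>> cache =
            ThreadLocal.withInitial(HashMap::new);

    PerThreadCache(Function<String, T> factory) {
        this.factory = factory;
    }

    /** Returns this thread's handle for the table, creating it on first use. */
    T get(String tableName) {
        return cache.get().computeIfAbsent(tableName, factory);
    }
}
```

With something like this, the loop in the question would ask the cache for a handle instead of calling a constructor on every iteration, so the construction cost is paid once per table per thread.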

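The three-version behaviour discussed in the thread can be modelled with a toy class. VersionedCell is hypothetical and is not HBase code. It assumes that the oldest versions are dropped once the limit is exceeded, that a time-range read is inclusive of the start and exclusive of the end, and that a repeated timestamp overwrites the earlier value (a simplification, since the thread notes that duplicates were retained at the time).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Toy model of a single versioned cell; a stand-in for illustration only. */
final class VersionedCell {
    private final int maxVersions;
    // timestamp -> value, kept sorted from oldest to newest
    private final TreeMap<Long, String> versions = new TreeMap<>();

    VersionedCell(int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("maxVersions must be >= 1");
        }
        this.maxVersions = maxVersions;
    }

    /** Stores a value and evicts the oldest versions beyond the limit. */
    void put(long timestamp, String value) {
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollFirstEntry();
        }
    }

    /** Retained values with minStamp <= timestamp < maxStamp, newest first. */
    List<String> get(long minStamp, long maxStamp) {
        return new ArrayList<>(
                versions.subMap(minStamp, true, maxStamp, false)
                        .descendingMap()
                        .values());
    }
}
```

In this model, writing 60 timestamps into a cell built with a limit of 3 leaves only the newest three readable, whatever time range is requested, while a cell built with Integer.MAX_VALUE returns all 60. That mirrors the difference between the default table and one created with setMaxVersions(Integer.MAX_VALUE).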