hbase-user mailing list archives

From llpind <sonny_h...@hotmail.com>
Subject Re: Adding/Removing regionservers
Date Thu, 02 Jul 2009 19:14:15 GMT

Thanks for the tips.

Yeah, that is the model we had before. The problem is that we can potentially
have millions of IDs for a given TYPE|VAL.

We are considering something like:
Row Key: TYPE|VALUE|ID
column: link:TYPE|VALUE

This works only because an ID may never have more than a few TYPE|VAL results
in the current dataset. It would also eliminate the need to go to the second
table.
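
To make the proposed layout concrete, here is a minimal sketch of how a prefix range over TYPE|VALUE|ID row keys would return every ID together with its linked TYPE|VALUEs in one pass. It does not use the HBase client at all: a TreeMap stands in for the sorted table, and the class name, method names, and sample data are hypothetical. It also assumes ASCII keys, so Java string order matches byte order, and it uses "}" (the character right after "|" in ASCII) as the exclusive stop key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Stand-in for a sorted table: row key -> (column qualifier -> value).
// This is NOT the HBase API; a TreeMap only mimics the sorted row order
// that a start/stop row scan relies on.
class TallTableSketch {
    final NavigableMap<String, NavigableMap<String, String>> rows = new TreeMap<>();

    // Store one cell: row key TYPE|VALUE|ID, qualifier = linked TYPE|VALUE.
    void put(String typeVal, String id, String linkedTypeVal) {
        rows.computeIfAbsent(typeVal + "|" + id, k -> new TreeMap<>())
            .put(linkedTypeVal, "");
    }

    // Emulates a scan from "TYPE|VALUE|" (inclusive) to "TYPE|VALUE}" (exclusive).
    List<String> linksFor(String typeVal) {
        String start = typeVal + "|";
        String stop = typeVal + "}";
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, NavigableMap<String, String>> row
                : rows.subMap(start, true, stop, false).entrySet()) {
            // The ID is whatever follows the TYPE|VALUE| prefix in the row key.
            String id = row.getKey().substring(start.length());
            for (String linked : row.getValue().keySet()) {
                out.add(id + " -> " + linked);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TallTableSketch table = new TallTableSketch();
        table.put("COLOR|RED", "17", "SIZE|L");
        table.put("COLOR|RED", "17", "SHAPE|ROUND");
        table.put("COLOR|RED", "42", "SIZE|S");
        table.put("COLOR|REDDISH", "5", "SIZE|M"); // must not match COLOR|RED
        for (String line : table.linksFor("COLOR|RED")) {
            System.out.println(line);
        }
    }
}
```

Because each row already carries its linked TYPE|VALUEs as columns, the loop never needs a lookup keyed by ID, which is the point of dropping the second table.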

Thanks for the help.  


Jonathan Gray-2 wrote:
> 
> Well you're trying to do a join.  How much data is actually in TableB? 
> You might consider denormalizing so that you don't have to query TableB; 
> the data you need would already be in TableA.
> 
> You could use a Get (single trip) for the inner loop rather than a 
> Scanner (which requires multiple round-trips).  You could even use a Get 
> for the outer loop by making your table wide instead of tall.
> 
> Row Key:  TYPE|VALUE
> Column: link:ID
> 
> And you have a column for each ID within that TYPE|VALUE row.
> 
> Also, don't forget to close your scanners if you do use scanners.
> 
> JG
> 
> 
> llpind wrote:
>> Assume a schema like so:  
>> 
>> TableA======================
>> Row Key:  TYPE|VALUE|ID
>> Column:  link:ID  (irrelevant)
>> TableB======================
>> Row Key: ID
>> Column: typeval:TYPE|VALUE
>> ===========================
>> 
>> 
>> 
>> I need to iterate over TableA using a Scanner to get all IDs based on
>> TYPE|VALUE, then for each ID I need to get from TableB the TYPE|VALUEs
>> it is tied to (a many-to-many relationship).
>> Assume I have a list of TYPE|VALUEs in a List and need to process this
>> data. I have done something like this:
>> 
>> 
>> 
>> for (String typeVal : list) {
>> 
>>   // give me all IDs for matching TYPE|VAL
>>   Scan tblAScan = new Scan(Bytes.toBytes(typeVal + "|"), Bytes.toBytes(typeVal + "|A"));
>>   ResultScanner s1 = tblA.getScanner(tblAScan);
>> 
>>   for (Result tblBRowResult = s1.next(); tblBRowResult != null; tblBRowResult = s1.next()) {
>> 
>>     // IDs are all numeric
>>     Scan tblBScan = new Scan(Bytes.toBytes(tblBRowResult.getValue()), Bytes.toBytes(typeVal + " "));
>>     ResultScanner s2 = tblB.getScanner(tblBScan);
>>     // only care about column data here, since ID is row key
>>     List results = s2.next().list();
>> 
>>     for (KeyValue kv : results) {
>>       // do stuff
>>       kv.getValue();
>>     }
>> 
>>   }
>> 
>> }
>> 
> 
> 

-- 
View this message in context: http://www.nabble.com/Adding-Removing-regionservers-tp24309642p24312474.html
Sent from the HBase User mailing list archive at Nabble.com.

