cassandra-user mailing list archives

From Mark <static.void....@gmail.com>
Subject Re: SV: SV: Help with getting Key range with some column limitations
Date Fri, 20 Aug 2010 14:41:37 GMT
  On 8/20/10 1:05 AM, Thorvaldsson Justus wrote:
>
> I think you should try some way other than iterating; it sounds 
> very suboptimal to me. The plugin option he was thinking of, I 
> think, means changing the Cassandra source code, which is hard while 
> Cassandra is changing so fast, but very possible. I think you should 
> look at http://blip.tv/file/4015273 and perhaps my blog post about the 
> same thing at http://www.Justus.st (Cassandra post 4, more on the 
> data model).
>
> Example code in Java with a start and end key. On the next iteration the 
> start should be the last key that you collected, depending on how you 
> built your model.
>
> // KeyRange selects the rows: set the start key, the end key,
> // and the maximum number of rows to return.
> KeyRange keyRange = new KeyRange(700);
> keyRange.setStart_key(rowId);
> keyRange.setEnd_key(rowId);
>
> // SliceRange selects which supercolumns to get; an empty start
> // and finish leave the range unbounded.
> SliceRange sliceRange = new SliceRange();
> sliceRange.setStart(new byte[] {});
> sliceRange.setFinish(new byte[] {});
>
> /J
>
> *From:* Jone Lura [mailto:jone.lura@ecc.no]
> *Sent:* 20 August 2010 08:53
> *To:* user@cassandra.apache.org
> *Subject:* Re: SV: Help with getting Key range with some column limitations
>
> Thanks for your suggestions.
>
> I tried to iterate them, but I could not get it to work (pretty 
> sure it's my code). I'm still not too familiar with Cassandra, so could 
> you provide a small example?
>
> The key count could be up to at least 20k and maybe more, and users 
> should not wait more than 10 seconds for their map, so I also want 
> to investigate the plugin suggestion. Does the plugin exist, or do I 
> have to develop it myself? Is there any documentation on plugin 
> development for Cassandra?
>
> Best regards
>
> Jone
>
>
> On 19/08/2010 08:42, Thorvaldsson Justus wrote:
>
> You should iterate through them, get 200 then go get the next 200 and 
> so on.
>
> Also, if you are checking one bounding box against another, perhaps try 
> sorting them so you can start looking from both ends, and shrink the 
> iteration until you find a match somehow?
>
> Just my two cents. Also, you will probably need to upgrade to iterate 
> through RP (RandomPartitioner) because of bugs, but upgrading to 0.6.4 
> should be simple enough.
>
> /Justus
>
> *From:* Jone Lura [mailto:jone.lura@ecc.no]
> *Sent:* 18 August 2010 20:32
> *To:* user@cassandra.apache.org
> *Subject:* Help with getting Key range with some column limitations
>
> Hi,
>
> We are trying to use Cassandra to replace one of our biggest SQL 
> tables, and so far we have it working.
>
> For testing I'm using Cassandra 0.6.2, Java and Pelops (Pelops is 
> not that important for my question), and I need suggestions on how 
> to retrieve a key range based on the following.
>
> <Keyspace Name="AIS">
>   <ColumnFamily Name="Location"
>         ColumnType="Super"
>         CompareWith="LongType"
>         KeysCached="100%"
>         CompareSubcolumnsWith="UTF8Type" />
>      ...
> </Keyspace>
>
> Each super column has columns for longitude and latitude.
>
>  1. I need to get the max long number for the key.
>
>  2. The key should also have a super column whose latitude and longitude 
> columns fall inside a given bounding box.
>
> Currently I'm doing it like this:
>
>         KeyRange keyRange = new KeyRange();
>         keyRange.setStart_key("");
>         keyRange.setEnd_key("");
>         keyRange.setCount(700);
>
> and checking every row in the db to see if it matches my bounding box.
>
> But there are a lot more than 700 keys, and if I set a higher count, 
> get_range_slice gets a Timeout Exception.
>
> Any ideas?
>
> Best Regards
>
> Jone
>
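The paging advice in the quoted thread (fetch a batch, then restart from the last key collected) can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the Cassandra API: a `TreeMap` stands in for the column family, the hypothetical `fetchPage` stands in for a range query such as `get_range_slice`, rows are assumed to come back in key order, and `Row`, `inBox` and `keysInBox` are made-up names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

class RangePagingSketch {

    /** Hypothetical row: a key plus one latitude/longitude pair. */
    static final class Row {
        final String key;
        final double lat;
        final double lon;

        Row(String key, double lat, double lon) {
            this.key = key;
            this.lat = lat;
            this.lon = lon;
        }
    }

    /** Stand-in for a range query: up to count rows with key >= startKey, in key order. */
    static List<Row> fetchPage(NavigableMap<String, Row> store, String startKey, int count) {
        List<Row> page = new ArrayList<>();
        for (Row row : store.tailMap(startKey, true).values()) {
            if (page.size() == count) {
                break;
            }
            page.add(row);
        }
        return page;
    }

    /** Inclusive bounding-box test; boxes crossing the 180 degree meridian are not handled. */
    static boolean inBox(Row r, double minLat, double maxLat, double minLon, double maxLon) {
        return r.lat >= minLat && r.lat <= maxLat && r.lon >= minLon && r.lon <= maxLon;
    }

    /** Pages through every row and collects the keys whose position lies inside the box. */
    static List<String> keysInBox(NavigableMap<String, Row> store, int pageSize,
                                  double minLat, double maxLat, double minLon, double maxLon) {
        if (pageSize < 2) {
            // With a page size of 1 every page would only repeat the previous last key.
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> matches = new ArrayList<>();
        String startKey = "";      // empty start key: begin at the first row
        boolean firstPage = true;
        while (true) {
            List<Row> page = fetchPage(store, startKey, pageSize);
            // Every page after the first begins with the row already processed, so skip it.
            for (int i = firstPage ? 0 : 1; i < page.size(); i++) {
                Row row = page.get(i);
                if (inBox(row, minLat, maxLat, minLon, maxLon)) {
                    matches.add(row.key);
                }
            }
            if (page.size() < pageSize) {
                break;             // a short page means there are no more rows
            }
            startKey = page.get(page.size() - 1).key;
            firstPage = false;
        }
        return matches;
    }

    public static void main(String[] args) {
        NavigableMap<String, Row> store = new TreeMap<>();
        for (int i = 0; i < 1000; i++) {
            String key = String.format("vessel%04d", i);
            store.put(key, new Row(key, 50.0 + i * 0.01, 5.0));
        }
        List<String> hits = keysInBox(store, 200, 52.0, 53.0, 4.0, 6.0);
        System.out.println(hits.size() + " keys inside the box");
    }
}
```

Keeping each request small (200 rows here, as suggested in the thread) is what avoids the timeout seen with a single large count; the bounding-box filter still runs client-side over every row.
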
Regarding Cassandra #3:

"Increasing the memtable thresholds so that you create less sstables, 
but larger ones, is also a good idea. The defaults are small so 
Cassandra can work on a 1GB heap which is much smaller than most 
production ones. Reasonable rule of thumb: if you have a heap of N GB, 
increase both the throughput and count thresholds by N times."

What throughput and count thresholds is he referring to? There are 
multiple throughput options and I am not sure what the count threshold is.

binary_memtable_throughput_in_mb
memtable_throughput_in_mb
memtable_operations_in_millions (count?)
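
The rule of thumb in the quote is plain multiplication, which a few lines of Java can illustrate. This sketch assumes, without confirmation from the quoted post, that the two thresholds meant are memtable_throughput_in_mb and memtable_operations_in_millions; the baseline values of 64 MB and 0.3 million are also assumptions used only for the arithmetic, so check your own configuration before relying on them. `MemtableScalingSketch` and `scale` are hypothetical names.

```java
class MemtableScalingSketch {

    /** Applies the quoted rule of thumb: multiply a baseline threshold by the heap size in GB. */
    static double scale(double baseline, int heapGb) {
        if (heapGb < 1) {
            throw new IllegalArgumentException("heapGb must be at least 1");
        }
        return baseline * heapGb;
    }

    public static void main(String[] args) {
        int heapGb = 4;                  // example heap size, chosen for illustration
        double baseThroughputMb = 64;    // assumed baseline, not taken from the thread
        double baseOpsMillions = 0.3;    // assumed baseline, not taken from the thread

        System.out.println("memtable_throughput_in_mb = " + scale(baseThroughputMb, heapGb));
        System.out.println("memtable_operations_in_millions = " + scale(baseOpsMillions, heapGb));
    }
}
```
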

