incubator-cassandra-user mailing list archives

From Meler Wojciech <WMe...@wp-sa.pl>
Subject RE: Finding big rows
Date Wed, 11 May 2011 08:18:43 GMT
Thanks for the reply. My app uses 7-bit ASCII strings as row keys, so I assume they can be
used directly.

I'd like to fetch the whole row. I was able to dump the big row with sstable2json, but neither
my app nor the cli is able to read the row from Cassandra.
In the JSON dump I see that all columns are marked as "deletedAt": -9223372036854775808, so SuperColumn::isMarkedForDelete()
should return false. My cluster is running Cassandra 0.7.4, and its upgrade path was 0.7.0->0.7.2->0.7.3->0.7.4.
What's wrong? The Bloom filters seem to be OK. I couldn't find a tool for reading them, but the
attached program does the job.
I'm sure that both my app and the cli refer to the proper keys. This big row keeps getting
bigger as my app appends new super- and sub-columns to it, but it can't be read:
get mycf[utf8('my-key')];
Returned 0 results.
I'm really confused. I tried turning debug logging on, but I can't see anything interesting in it.
Any ideas what to check next?
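
For reference, the "deletedAt" value quoted from the JSON dump is exactly Long.MIN_VALUE. The sketch below illustrates why that value should mean "not deleted", assuming the check compares the deletion timestamp against that sentinel. The class name DeletionCheck and the comparison are illustrative assumptions, not code copied from Cassandra.

```java
// Sketch only: mirrors the reasoning in the message, not Cassandra's actual source.
final class DeletionCheck {
    // The "deletedAt" value seen in the sstable2json dump equals Long.MIN_VALUE.
    static final long NOT_DELETED = Long.MIN_VALUE;

    // Assumed semantics: a container counts as deleted only when its
    // deletion timestamp is greater than the sentinel.
    static boolean isMarkedForDelete(long markedForDeleteAt) {
        return markedForDeleteAt > NOT_DELETED;
    }

    public static void main(String[] args) {
        long fromDump = -9223372036854775808L;
        System.out.println("equals Long.MIN_VALUE: " + (fromDump == Long.MIN_VALUE));
        System.out.println("marked for delete:     " + isMarkedForDelete(fromDump));
    }
}
```

Under that assumption the tombstone check is not what hides the row, so the cause has to be somewhere else on the read path.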


Regards,
Wojtek

From: aaron morton [mailto:aaron@thelastpickle.com]
Sent: Wednesday, May 11, 2011 12:29 AM
To: user@cassandra.apache.org
Subject: Re: Finding big rows

I'm not aware of anything to find the row sizes, and your code looks like a good approach.
Converting the key bytes to a string only makes sense if your app is doing the same thing.

In the cli, try using one of the data type functions to format the key the same way your
app does, e.g. get FooCF[utf8('my-key')]

The main limitation on Super Columns is that sub columns are not indexed (see http://wiki.apache.org/cassandra/CassandraLimitations).
If you have a huge row, use the get_slice() API call to get back slices of columns. The cli
does not support slicing columns.
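
A minimal sketch of the paging idea behind get_slice(): ask for a fixed number of columns, then restart from the last column name received. To stay self-contained it does not call the Thrift client; SliceSource is a hypothetical stand-in for a get_slice() call with an inclusive start, and inMemory() only simulates a row with a sorted map.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

final class SlicePager {
    /** Hypothetical stand-in for get_slice(): returns up to `count`
     *  column names >= start (inclusive), in sorted order. */
    interface SliceSource {
        List<String> slice(String start, int count);
    }

    /** Collects all column names by fetching pages of pageSize columns. */
    static List<String> readAll(SliceSource source, int pageSize) {
        if (pageSize < 2) {
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> all = new ArrayList<>();
        String start = "";      // empty start means "from the beginning of the row"
        boolean first = true;
        while (true) {
            List<String> page = source.slice(start, pageSize);
            // Start is inclusive, so every page after the first repeats
            // the last column of the previous page; skip it.
            for (int i = first ? 0 : 1; i < page.size(); i++) {
                all.add(page.get(i));
            }
            if (page.size() < pageSize) {
                break;          // short page: end of the row reached
            }
            start = page.get(page.size() - 1);
            first = false;
        }
        return all;
    }

    /** Simulated row backed by a sorted map, for trying the pager locally. */
    static SliceSource inMemory(NavigableMap<String, String> row) {
        return (start, count) -> {
            List<String> out = new ArrayList<>();
            for (String name : row.tailMap(start, true).keySet()) {
                if (out.size() == count) break;
                out.add(name);
            }
            return out;
        };
    }
}
```

With a real client, the body of slice() would be replaced by the actual get_slice() request; the page size bounds how much of the row is held in memory per call.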

Hope that helps.
-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 10 May 2011, at 20:41, Meler Wojciech wrote:


Hello,

I've noticed very nice stats exposed with JMX. I was quite shocked when I saw that MaxRowSize
was about 400MB (it was expected to be several MB).
What is the best way to find keys of such big rows?

I couldn't find anything, so I've written a simple program to dump row sizes from the Index files (see
attachment),
and got the keys, but when I used cassandra-cli to get those rows it said "Returned 0 results.".
I've realised that my app creates such big rows because it can't read them from Cassandra
and recreates them every time.
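
The attached IdxDump.java is not reproduced in this archive. As a rough illustration of the approach only, the sketch below estimates row sizes from an index stream. It assumes each index entry is a 2-byte key length, the key bytes, and an 8-byte offset into the data file; that layout, the class name IndexSizes, and the dataFileLength parameter are assumptions, not something confirmed by the thread.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class IndexSizes {
    /**
     * Estimates each row's on-disk size as the gap between its data-file
     * offset and the next row's offset. The last row is measured against
     * dataFileLength. Assumed entry layout: short keyLength, key, long offset.
     */
    static Map<String, Long> rowSizes(DataInputStream index, long dataFileLength)
            throws IOException {
        Map<String, Long> sizes = new LinkedHashMap<>();
        String prevKey = null;
        long prevOffset = 0;
        while (true) {
            int keyLength;
            try {
                keyLength = index.readUnsignedShort();
            } catch (EOFException endOfIndex) {
                break;  // no more entries
            }
            byte[] key = new byte[keyLength];
            index.readFully(key);
            long offset = index.readLong();
            if (prevKey != null) {
                sizes.put(prevKey, offset - prevOffset);
            }
            // Keys are 7-bit ASCII in this application, so decode them as such.
            prevKey = new String(key, StandardCharsets.US_ASCII);
            prevOffset = offset;
        }
        if (prevKey != null) {
            sizes.put(prevKey, dataFileLength - prevOffset);
        }
        return sizes;
    }
}
```

The sizes are estimates: each gap also covers the key and any per-row header stored in the data file, so they are useful for spotting outliers rather than for exact accounting.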

Are there any tuneable limits for getting a whole row? Any limits on supercolumns?

Regards,
Wojtek


"WIRTUALNA POLSKA" Spolka Akcyjna z siedziba w Gdansku przy ul. Traugutta 115 C, wpisana do
Krajowego Rejestru Sadowego - Rejestru Przedsiebiorcow prowadzonego przez Sad Rejonowy Gdansk
- Polnoc w Gdansku pod numerem KRS 0000068548, o kapitale zakladowym 67.980.024,00 zlotych
oplaconym w calosci oraz Numerze Identyfikacji Podatkowej 957-07-51-216.
<IdxDump.java>




