incubator-cassandra-user mailing list archives

From: Jonathan Ellis <jbel...@gmail.com>
Subject: Re: Cassandra not suitable?
Date: Sat, 10 Dec 2011 01:00:01 GMT
Sounds like you're simply throwing more seq scans at it via m/r than
your disk can handle.  iostat could confirm that disk is the
bottleneck.  But "real" monitoring would be better.
http://www.datastax.com/products/opscenter
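
For a quick check, something like the following (a rough sketch, assuming a
Linux node with the sysstat package installed; the 5-second interval is
arbitrary) will show whether the disks are saturated while the M/R jobs run:

  # extended per-device stats every 5 seconds; look for %util close to 100
  # and high await on the RAID10 device backing the Cassandra data directory
  iostat -x 5

If %util stays pegged while the jobs run, the disks are the bottleneck and
the timeouts follow from that.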

On Thu, Dec 8, 2011 at 1:02 AM, Patrik Modesto <patrik.modesto@gmail.com> wrote:
> Hi Jake,
>
> I see the timeouts in the mappers as well as in the random-access backend
> daemons (for web services). There are now 10 mappers and 2 reducers on
> each node. Each node has one big 4-disk RAID10 array that holds the
> Cassandra data together with HDFS. We store just a few GB of files on
> HDFS; otherwise we don't use it.
>
> Regards,
> P.
>
> On Wed, Dec 7, 2011 at 15:33, Jake Luciani <jakers@gmail.com> wrote:
>> Where do you see the timeout exceptions? In the mappers?
>>
>> How many mapper and reducer slots are you using? What does your disk setup
>> look like? Do you have HDFS on the same disk as the Cassandra data dir?
>>
>> -Jake
>>
>>
>> On Tue, Dec 6, 2011 at 4:50 AM, Patrik Modesto <patrik.modesto@gmail.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> I'm quite desperate about Cassandra's performance in our production
>>> cluster. We have 8 real-HW nodes: 32-core CPU, 32GB memory, 4 disks in
>>> RAID10, Cassandra 0.8.8, RF=3 and Hadoop.
>>> We have four keyspaces; the large one has 2 CFs, one a kind of index,
>>> the other holding the data. There are about 7 million rows, and the mean
>>> row size is 7kB. We run several mapreduce tasks; most of them just read
>>> from Cassandra and write to HDFS, but one fetches rows from Cassandra,
>>> computes something and writes it back. For each row we compute three new
>>> JSON values, about 1kB each (they get overwritten the next round).
>>>
>>> We get lots and lots of Timeout exceptions, with LiveSSTablesCount over
>>> 100. Repair doesn't finish even in 24 hours, and reading from the other
>>> keyspaces times out as well. We set compaction_throughput_mb_per_sec: 0
>>> but it didn't help.
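>>>
>>> A quick way to double-check those numbers from the shell (just a rough
>>> sketch, run against any of the nodes in the ring output below):
>>>
>>>   # per-CF stats, including the SSTable count that LiveSSTablesCount refers to
>>>   nodetool -h 10.2.54.91 cfstats
>>>   # thread-pool backlog; a large ReadStage "Pending" value means reads are queueing up
>>>   nodetool -h 10.2.54.91 tpstats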
>>>
>>> Did we choose the wrong DB for our use case?
>>>
>>> Regards,
>>> Patrik
>>>
>>> This is from one node:
>>>
>>>  INFO 10:28:40,035 Pool Name                    Active   Pending   Blocked
>>>  INFO 10:28:40,036 ReadStage                        96       695         0
>>>  INFO 10:28:40,037 RequestResponseStage              0         0         0
>>>  INFO 10:28:40,037 ReadRepairStage                   0         0         0
>>>  INFO 10:28:40,037 MutationStage                     1         1         0
>>>  INFO 10:28:40,038 ReplicateOnWriteStage             0         0         0
>>>  INFO 10:28:40,038 GossipStage                       0         0         0
>>>  INFO 10:28:40,038 AntiEntropyStage                  0         0         0
>>>  INFO 10:28:40,039 MigrationStage                    0         0         0
>>>  INFO 10:28:40,039 StreamStage                       0         0         0
>>>  INFO 10:28:40,040 MemtablePostFlusher               0         0         0
>>>  INFO 10:28:40,040 FlushWriter                       0         0         0
>>>  INFO 10:28:40,040 MiscStage                         0         0         0
>>>  INFO 10:28:40,041 FlushSorter                       0         0         0
>>>  INFO 10:28:40,041 InternalResponseStage             0         0         0
>>>  INFO 10:28:40,041 HintedHandoff                     1         5         0
>>>  INFO 10:28:40,042 CompactionManager               n/a        27
>>>  INFO 10:28:40,042 MessagingService                n/a   0,16559
>>>
>>> And here is the nodetool ring output:
>>>
>>> 10.2.54.91      NG          RAC1        Up     Normal  118.04 GB       12.50%  0
>>> 10.2.54.92      NG          RAC1        Up     Normal  102.74 GB       12.50%  21267647932558653966460912964485513216
>>> 10.2.54.93      NG          RAC1        Up     Normal  76.95 GB        12.50%  42535295865117307932921825928971026432
>>> 10.2.54.94      NG          RAC1        Up     Normal  56.97 GB        12.50%  63802943797675961899382738893456539648
>>> 10.2.54.95      NG          RAC1        Up     Normal  75.55 GB        12.50%  85070591730234615865843651857942052864
>>> 10.2.54.96      NG          RAC1        Up     Normal  102.57 GB       12.50%  106338239662793269832304564822427566080
>>> 10.2.54.97      NG          RAC1        Up     Normal  68.03 GB        12.50%  127605887595351923798765477786913079296
>>> 10.2.54.98      NG          RAC1        Up     Normal  194.6 GB        12.50%  148873535527910577765226390751398592512
>>
>>
>>
>>
>> --
>> http://twitter.com/tjake



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
