incubator-cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Freeman, Tim" <tim.free...@hp.com>
Subject RE: Persistently increasing read latency
Date Thu, 03 Dec 2009 21:59:04 GMT
I'm trying to get all the questions answered at once here so people aren't seeing a ton of
emails from me.

>If you stop doing client requests does that number start to go down?

I stopped the client at 11:28.  There were 2306 files in data/Keyspace1.  It's now 12:44,
and there are 1826 files in data/Keyspace1.  As I wrote this email, the number increased to
1903, then to 1938 and 2015, even though the server has no clients.  I used jconsole to invoke
a few explicit garbage collections and the number went down to 811.

jconsole reports that the compaction pool has 1670 pending tasks.  As I wrote this email,
the number gradually increased to 1673.  The server has no clients, so this is odd.  The number
of completed tasks in the compaction pool has consistently been going up while the number
of pending tasks stays the same.  The number of completed tasks increased from 130 to 136.

> If not there is a bug.  If there are exceptions in the logs then it is
> an obvious one otherwise something more subtle.

The output from "grep -v INFO log*" is reasonably concise:
 
log.2009-12-02-19: WARN [Timer-0] 2009-12-02 19:55:23,305 LoadDisseminator.java (line 44)
Exception was generated at : 12/02/2009 19:55:22 on thread Timer-0
log.2009-12-02-19:
log.2009-12-02-19:java.lang.NullPointerException
log.2009-12-02-19:      at org.apache.cassandra.utils.FileUtils.getUsedDiskSpaceForPath(FileUtils.java:198)
log.2009-12-02-19:      at org.apache.cassandra.utils.FileUtils.getUsedDiskSpaceForPath(FileUtils.java:200)
log.2009-12-02-19:      at org.apache.cassandra.utils.FileUtils.getUsedDiskSpaceForPath(FileUtils.java:200)
log.2009-12-02-19:      at org.apache.cassandra.utils.FileUtils.getUsedDiskSpace(FileUtils.java:211)
log.2009-12-02-19:      at org.apache.cassandra.service.StorageService.getLoad(StorageService.java:510)
log.2009-12-02-19:      at org.apache.cassandra.service.LoadDisseminator.run(LoadDisseminator.java:37)
log.2009-12-02-19:      at java.util.TimerThread.mainLoop(Timer.java:512)
log.2009-12-02-19:      at java.util.TimerThread.run(Timer.java:462)
log.2009-12-02-19:
log.2009-12-02-23: WARN [Timer-0] 2009-12-02 23:10:27,128 LoadDisseminator.java (line 44)
Exception was generated at : 12/02/2009 23:10:27 on thread Timer-0
log.2009-12-02-23:
log.2009-12-02-23:java.lang.NullPointerException
log.2009-12-02-23:      at org.apache.cassandra.utils.FileUtils.getUsedDiskSpaceForPath(FileUtils.java:198)
log.2009-12-02-23:      at org.apache.cassandra.utils.FileUtils.getUsedDiskSpaceForPath(FileUtils.java:200)
log.2009-12-02-23:      at org.apache.cassandra.utils.FileUtils.getUsedDiskSpaceForPath(FileUtils.java:200)
log.2009-12-02-23:      at org.apache.cassandra.utils.FileUtils.getUsedDiskSpace(FileUtils.java:211)
log.2009-12-02-23:      at org.apache.cassandra.service.StorageService.getLoad(StorageService.java:510)
log.2009-12-02-23:      at org.apache.cassandra.service.LoadDisseminator.run(LoadDisseminator.java:37)
log.2009-12-02-23:      at java.util.TimerThread.mainLoop(Timer.java:512)
log.2009-12-02-23:      at java.util.TimerThread.run(Timer.java:462)
log.2009-12-02-23:

Recall that these line numbers should match the 29 Nov build.

Chris Goffinet [goffinet@digg.com] said:
>After you stop the test. Do you see the pending tasks for compaction  
>drop?

I see chatter in the log files saying that compactions are happening, but the number of reported
pending tasks has increased while I was watching.

Chris Goffinet [goffinet@digg.com] said:
>[Were] there any other numbers in tpstats from nodeprobe that are growing?

I didn't ask this question before stopping my client.  Now tpstats reports that there are
no pending tasks other than 1672 COMPACTION-POOL tasks, and the only active tasks are one
COMMITLOG and one COMPACTION-POOL.

>Can you plot the number of SSTables? 

I suppose so.  Do you mean the number of files, or one of the stats output from "nodeprobe
tpstats"?  Keep in mind that the number of files is influenced by the garbage collector, so
I doubt that counting the files will tell us much.

>Are you using the standard storage-conf.xml defaults?

Pretty much.  I attached the file to this email.  My summary of the changes from 1 Dec 09
is still valid even though I changed Cassandra versions:

>Briefly, it has default values, except RpcTimeoutInMillis is 30000 and the partitioner
is OrderPreservingPartitioner.

Tim Freeman
Email: tim.freeman@hp.com
Desk in Palo Alto: (650) 857-2581
Home: (408) 774-1298
Cell: (408) 348-7536 (No reception business hours Monday, Tuesday, and Thursday; call my desk
instead.)


-----Original Message-----
From: Jonathan Ellis [mailto:jbellis@gmail.com] 
Sent: Thursday, December 03, 2009 11:14 AM
To: cassandra-user@incubator.apache.org
Subject: Re: Persistently increasing read latency

On Thu, Dec 3, 2009 at 1:11 PM, Freeman, Tim <tim.freeman@hp.com> wrote:
>>how many are in your data directories?  is your compaction
>>lagging 1000s of tables behind again?
>
> Yes, there are 2348 files in data/Keyspace1, and jconsole says the compaction pool has
>1600 pending tasks.

If you stop doing client requests does that number start to go down?

If not there is a bug.  If there are exceptions in the logs then it is
an obvious one otherwise something more subtle. :)

-Jonathan

Mime
View raw message