Reply-To: user@cassandra.apache.org
Message-ID: <4E4AE839.2010403@wetafx.co.nz>
Date: Wed, 17 Aug 2011 09:59:21 +1200
From: Teijo Holzer <tholzer@wetafx.co.nz>
To: user@cassandra.apache.org
CC: Yan Chunlu <springrider@gmail.com>
Subject: Re: node restart taking too long
References: 
 <CAOA66tEiz5G2Jdf_wu+b5nizd=g+Cu9GjS9KtbVwG9uD6MzEZg@mail.gmail.com>
 <3066FEE2-CE8D-4B1D-BEB9-75812BAFE9F7@thelastpickle.com>
 <CALdd-ziBuvufOxkA1cHaZhqF68o1EOKiqNekGYkRPVPiZhTGGQ@mail.gmail.com>
 <CAOA66tHObEeS5LseVHctg6guQWpfGanJe1Dcic8ifGPvFGEBXA@mail.gmail.com>
 <CAOA66tFdpExyc1wBuk7Q6=dLPV=DwqKSYLH01pFvw+T6OVeSTg@mail.gmail.com>
In-Reply-To: 
 <CAOA66tFdpExyc1wBuk7Q6=dLPV=DwqKSYLH01pFvw+T6OVeSTg@mail.gmail.com>

Hi,

Yes, we saw exactly the same messages. We got rid of them by doing the following:

* Set all row & key caches in your CFs to 0 via cassandra-cli
* Kill Cassandra
* Remove all files in the saved_caches directory
* Start Cassandra
* Slowly bring back row & key caches (if desired, we left them off)
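
A minimal sketch of the "remove all files in the saved_caches directory" step,
in Python. The function name clear_saved_caches is hypothetical, and the path
/cassandra/saved_caches is only what the log lines quoted below show; check
the saved_caches_directory setting in your own cassandra.yaml. Run it only
while Cassandra is stopped.

```python
from pathlib import Path


def clear_saved_caches(saved_caches_dir):
    """Delete every regular file directly inside saved_caches_dir.

    Subdirectories are left untouched. Returns the sorted names of the
    files that were removed, so the caller can log what happened.
    """
    removed = []
    for entry in sorted(Path(saved_caches_dir).iterdir()):
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return removed


# Example call (assumed path, taken from the quoted log lines):
# clear_saved_caches("/cassandra/saved_caches")
```

Returning the removed file names makes it easy to confirm that the row and
key cache files are really gone before starting Cassandra again.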

Cheers,

	T.

On 16/08/11 23:35, Yan Chunlu wrote:
> I saw a lot of SliceQueryFilter messages after changing the log level to
> DEBUG. I think even bringing up a new node would be faster than starting the
> old one... it is weird.
>
> DEBUG [main] 2011-08-16 06:32:49,213 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 76616c7565:false:225@1313068845474382
> DEBUG [main] 2011-08-16 06:32:49,245 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 76616c7565:false:453@1310999270198313
> DEBUG [main] 2011-08-16 06:32:49,251 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 76616c7565:false:26@1313199902088827
> DEBUG [main] 2011-08-16 06:32:49,576 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 76616c7565:false:157@1313097239332314
> DEBUG [main] 2011-08-16 06:32:50,674 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 76616c7565:false:41729@1313190821826229
> DEBUG [main] 2011-08-16 06:32:50,811 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 76616c7565:false:6@1313174157301203
> DEBUG [main] 2011-08-16 06:32:50,867 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 76616c7565:false:98@1312011362250907
> DEBUG [main] 2011-08-16 06:32:50,881 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 76616c7565:false:42@1313201711997005
> DEBUG [main] 2011-08-16 06:32:50,910 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 76616c7565:false:96@1312939986190155
> DEBUG [main] 2011-08-16 06:32:50,954 SliceQueryFilter.java (line 123)
> collecting 0 of 2147483647: 76616c7565:false:621@1313192538616112
>
>
>
> On Tue, Aug 16, 2011 at 7:32 PM, Yan Chunlu <springrider@gmail.com> wrote:
>
>     But it seems the row cache is cluster-wide. How will changing the row
>     cache affect read speed?
>
>
>     On Mon, Aug 15, 2011 at 7:33 AM, Jonathan Ellis <jbellis@gmail.com> wrote:
>
>         Or leave row cache enabled but disable cache saving (and remove the
>         one already on disk).
>
>         On Sun, Aug 14, 2011 at 5:05 PM, aaron morton <aaron@thelastpickle.com> wrote:
>          >  INFO [main] 2011-08-14 09:24:52,198 ColumnFamilyStore.java (line 547)
>          > completed loading (1744370 ms; 200000 keys) row cache for COMMENT
>          >
>          > It's taking 29 minutes to load 200,000 rows into the row cache. That's a
>          > pretty big row cache; I would suggest reducing or disabling it.
>          > Background:
>          > http://www.datastax.com/dev/blog/maximizing-cache-benefit-with-cassandra
>          >
>          > and the server could not handle the load and then crashed. After coming
>          > back, node 3 has not returned for more than 96 hours
>          >
>          > Crashed how?
>          > You may be seeing https://issues.apache.org/jira/browse/CASSANDRA-2280
>          > Watch nodetool compactionstats to see when the Merkle tree build finishes,
>          > and nodetool netstats to see which CFs are streaming.
>          > Cheers
>          > -----------------
>          > Aaron Morton
>          > Freelance Cassandra Developer
>          > @aaronmorton
>          > http://www.thelastpickle.com
>          > On 15 Aug 2011, at 04:23, Yan Chunlu wrote:
>          >
>          >
>          > I have 3 nodes with RF=3. While I was repairing node3, it seems a lot of
>          > data was generated, and the server could not handle the load and then
>          > crashed. After coming back, node 3 has not returned for more than 96 hours.
>          >
>          > With 34 GB of data, node 2 could restart and be back online within 1 hour.
>          >
>          > I am not sure what's wrong with node 3. Should I restart it again?
>          > thanks!
>          >
>          > Address   Status State   Load        Owns    Token
>          >                                             113427455640312821154458202477256070484
>          > node1     Up     Normal  34.11 GB    33.33%  0
>          > node2     Up     Normal  31.44 GB    33.33%  56713727820156410577229101238628035242
>          > node3     Down   Normal  177.55 GB   33.33%  113427455640312821154458202477256070484
>          >
>          >
>          > the log shows it is still going on, not sure why it is so slow:
>          >
>          >
>          >  INFO [main] 2011-08-14 08:55:47,734 SSTableReader.java (line 154) Opening /cassandra/data/COMMENT
>          >  INFO [main] 2011-08-14 08:55:47,828 ColumnFamilyStore.java (line 275) reading saved cache /cassandra/saved_caches/COMMENT-RowCache
>          >  INFO [main] 2011-08-14 09:24:52,198 ColumnFamilyStore.java (line 547) completed loading (1744370 ms; 200000 keys) row cache for COMMENT
>          >  INFO [main] 2011-08-14 09:24:52,299 ColumnFamilyStore.java (line 275) reading saved cache /cassandra/saved_caches/COMMENT-RowCache
>          >  INFO [CompactionExecutor:1] 2011-08-14 10:24:55,480 CacheWriter.java (line 96) Saved COMMENT-RowCache (200000 items) in 2535 ms
>          >
>
>
>
>         --
>         Jonathan Ellis
>         Project Chair, Apache Cassandra
>         co-founder of DataStax, the source for professional Cassandra support
>         http://www.datastax.com
>
>
>