jakarta-jcs-users mailing list archives

From Matthew Cooke <mpcoo...@lineone.net>
Subject Re: Moving forward with JCS
Date Mon, 05 Apr 2004 19:07:40 GMT
Travis,

I spent all day setting up the testing and staging servers in a 
RemoteCache configuration (removing all usage of the LTCP cache) and using 
your code.

*Good news* - so far it is working with no problems!

This is a massive relief as we've had two people waste almost a week 
testing variations of the LTCP configuration.

With regard to unit tests, I had no luck writing JUnit tests for LTCP, 
as I couldn't get JCS to work with two LTCP caches configured on the 
same machine (pointing at each other). Perhaps I would need to configure it 
with two separate JCS instances running.
So far it has only been possible to reproduce the problem after restarting 
some machines, and only whilst under high get load in a production environment 
- typical! But once it starts it is a repeatable problem.
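
For anyone who wants to attempt a reproduction, the check itself (does a get() for a key return the value that was stored under that key?) can be sketched roughly as below. This is only a sketch under assumptions: a plain ConcurrentHashMap stands in for the cache, so it will not show the LTCP fault by itself, and the class name, the "key|payload" value convention, and all method names are made up for illustration rather than taken from JCS. The idea is that each value carries its own key, so a wrong object handed back under load is detectable.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical harness; the Map is a stand-in for a real cache region.
public class KeyValueMismatchCheck {

    // Assumed convention: every cached value embeds the key it was stored under.
    public static String valueFor(String key) {
        return key + "|payload";
    }

    // True only when the returned value was really stored under this key.
    // A null (cache miss) is also reported as a mismatch here.
    public static boolean matches(String key, String value) {
        return value != null && value.startsWith(key + "|");
    }

    // Fills the cache, then runs `readers` threads doing `getsPerReader`
    // lookups each, and returns how many lookups came back with a wrong value.
    public static int countMismatches(Map<String, String> cache,
                                      int keys, int readers, int getsPerReader) {
        for (int i = 0; i < keys; i++) {
            String key = "key-" + i;
            cache.put(key, valueFor(key));
        }
        AtomicInteger mismatches = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(readers);
        for (int r = 0; r < readers; r++) {
            final int offset = r;
            pool.execute(() -> {
                for (int n = 0; n < getsPerReader; n++) {
                    String key = "key-" + ((n + offset) % keys);
                    if (!matches(key, cache.get(key))) {
                        mismatches.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(60, TimeUnit.SECONDS)) {
                throw new IllegalStateException("readers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for readers", e);
        }
        return mismatches.get();
    }

    public static void main(String[] args) {
        int bad = countMismatches(new ConcurrentHashMap<>(), 100, 8, 10_000);
        System.out.println("mismatched gets: " + bad);
    }
}
```

To turn this into a real reproduction, the Map would have to be replaced by gets and puts against the actual cache on separate machines, with restarts in between, which is the part I could not get working on one box.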

At the time of the LTCP problems I was a lot more stressed out about it 
than I am now. But still, I would strongly advise against anyone using 
the LTCP auxiliary in a production environment. It appeared to work at 
first, then started serving up commercially sensitive data to the wrong 
clients after a day of running. It was a very bad day for us!

Thanks for the work, Travis, and if you want I can try to set up the JUnit 
tests to reproduce it, though I don't really have much time so I won't be 
able to fix it.

Fingers crossed the remote cache solution runs smoothly on the current 3 
servers and I will roll it out across the others.

Cheers,
Matt.


Travis Savo wrote:

>You've exposed my one weak area: I've only ever used RemoteCache, and
>therefore have only fixed that. Lateral Cache remains, as far as I have
>reached, untested (note my earlier comment about needing tests for it).
>
>You may wish to consider changing your configuration to a remote cache, which
>I've had much more luck (and experience) with, and which should provide you with
>similar functionality (stated very loosely: object replication across
>caches) at a slightly higher memory/process overhead but reduced network
>bandwidth utilization (assuming a reasonable cache hit/miss ratio; if you're
>doing a ton of misses and few hits, the remote cache offers no savings over
>lateral, ultimately requiring more bandwidth and memory). This at least has
>been tried extensively across reboots of both clients and servers without
>incident.
>
>Do you have a unit test which will reproduce the problem? If I had some way
>to reproduce the problem easily I could probably find the time to run it
>through my debugger a time or 10.
>
>-Travis Savo
>
>
>-----Original Message-----
>From: Matthew Cooke [mailto:mpcooke3@lineone.net]
>Sent: Saturday, April 03, 2004 12:41 PM
>To: Turbine JCS Users List
>Subject: Re: Moving forward with JCS
>
>
>I've updated to Travis's new code and we are still getting wrong 
>values returned for keys on some machines. It seems the problem 
>only occurs after restarting some of the machines in the cluster.
>
>The configuration we are using is 1 "builder" machine just doing puts 
>and 2 serving machines just doing (a lot of) gets. The builder is 
>configured with 2 LTCP caches, one to each serving machine. Each serving 
>machine also has an LTCP cache connecting it to the builder.
>
>The problem only seems to occur when you restart some of the machines. 
>Then it appears that something gets out of sync in JCS, and I reckon 
>sometimes when you ask for an object for a certain "Put Key" you receive 
>the value object for the previous get() request.
>
>The Builder machine is configured with auxiliary caches DC, 
>LTCP_Server1, LTCP_Server2
>and each serving machine is configured with auxiliary caches DC, 
>LTCP_Builder
>
>This configuration does appear to work without error until you restart 
>machines.
>
>Firstly, does anyone know what could be causing this? Secondly, does 
>anyone else use the LTCP auxiliary with restarts under a heavy load?
>
>Regards,
>Matthew Cooke.
>
>
>Travis Savo wrote:
>
>>Just a little brainstorming... feel free to correct me if I'm off track
>>here.
>>
>>Documents that are in desperate need of writing:
>>Using Remote Cache: Step by step guide for using Remote Cache (because it's
>>harder than it looks!), how it works, when to use it, and when it won't do
>>what you expect it to. (I've already started on this one and expect to have
>>it soon)
>>Using Lateral Cache: Step by step guide for using Lateral Cache, how it
>>works, and when to use it.
>>
>>What's still needed before I'd consider this 'stable and mature':
>>Get my patches for LRU, Remote, Disk cache, and statistics gathering back
>>into CVS.
>>Unit tests to test expiration, idle, and max objects with precision, and
>>accompanying cache.ccf configurations.
>>Unit tests for Remote Cache (tricky, but doable).
>>Unit tests for Lateral Cache.
>>Unit tests for the Read/Write locking semantics.
>>More people to run the unit tests, and try it in new and interesting ways.
>>(Please help!)
>>Optional Disk Persistence over reboots via cache.ccf configuration (see
>>prior posts for more info).
>>
>>Moving forward:
>>The CacheEventQueues (and everywhere else this is done) should probably
>>stop using hand-rolled linked lists in favor of a FIFO buffer
>>implementation a little more standard and less bug prone
>>(commons-collections perhaps?). This should help to clean up the code a bit.
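
As a rough illustration of the FIFO suggestion above, here is a minimal sketch of an event queue built on a standard queue class instead of hand-maintained linked-list nodes. It is not the JCS CacheEventQueue: the class and method names are hypothetical, and it uses the JDK's java.util.concurrent.LinkedBlockingQueue rather than commons-collections, purely so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical stand-in for a cache event queue; not the JCS class.
public class FifoEventQueueSketch {

    // The library queue owns all node linking and locking,
    // so there are no head/tail pointers to maintain by hand.
    private final BlockingQueue<Runnable> events = new LinkedBlockingQueue<>();

    // Producers (e.g. put/remove notifications) append to the tail.
    public void addEvent(Runnable event) {
        events.add(event);
    }

    // Removes everything queued so far and runs it in insertion (FIFO) order.
    // Returns the number of events that were processed.
    public int processPending() {
        List<Runnable> batch = new ArrayList<>();
        events.drainTo(batch);
        for (Runnable event : batch) {
            event.run();
        }
        return batch.size();
    }

    public static void main(String[] args) {
        FifoEventQueueSketch queue = new FifoEventQueueSketch();
        queue.addEvent(() -> System.out.println("put key-1"));
        queue.addEvent(() -> System.out.println("remove key-1"));
        queue.processPending();
    }
}
```

A real queue would normally have a consumer thread blocking on take() rather than a caller draining in batches; the batch form is used here only to keep the sketch short.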
>>
>>Comments? Questions? Screams of agony?
>>
>>If no one has any objections I'll be posting these things as I complete
>>them, but anyone can feel free to jump in and tackle one of these, or
>>suggest some others, or say why what I'm suggesting is a Bad Thing(tm).
>>
>>-Travis Savo
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: turbine-jcs-user-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: turbine-jcs-user-help@jakarta.apache.org
>>

