jakarta-jcs-users mailing list archives

From Daniel Rosenbaum <drdan321-nos...@yahoo.com>
Subject Re: JCS Setup Questions
Date Mon, 07 Mar 2005 03:05:08 GMT
> The standard option is that every queue has
> its own devoted worker thread, which dies after 60
> seconds of inactivity and is started again when
> something gets put in the queue.  While alive it tries
> to get stuff out of the queue.  If you put it in, it
> will be pulled out as soon as possible.  

So if I understand correctly, there is a possibility of serving
stale data on server B if the invalidate event from server A is
stuck in the queue.  Is this correct?
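
If so, here is roughly how I picture each queue's devoted
worker, based on your description (a sketch of my own, not
JCS's actual code; every name in it is invented):

import java.util.LinkedList;

// Sketch of the per-queue worker described above: one devoted
// thread drains the queue, gives up after 60 seconds with no
// activity, and is restarted by the next put.  Races around
// the moment the worker dies are ignored here.
public class EventQueueSketch {

    private final LinkedList events = new LinkedList();
    private Thread worker;

    public synchronized void put(Runnable event) {
        events.addLast(event);
        notify();
        if (worker == null || !worker.isAlive()) {
            worker = new Thread(new Runnable() {
                public void run() {
                    drain();
                }
            });
            worker.start(); // revive the dead worker
        }
    }

    private void drain() {
        while (true) {
            Runnable event;
            synchronized (this) {
                if (events.isEmpty()) {
                    try {
                        wait(60 * 1000); // idle up to 60 seconds
                    } catch (InterruptedException e) {
                        return;
                    }
                    if (events.isEmpty()) {
                        return; // still nothing: the thread dies
                    }
                }
                event = (Runnable) events.removeFirst();
            }
            event.run(); // e.g. deliver an invalidate to a peer
        }
    }
}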

I have a side concern specific to using Hibernate with
distributed caching, particularly when caching collections.  I
can best describe my concern with a (somewhat contrived)
example.

Say you have a Department object, along with Employee objects,
with a Department.departmentEmployees collection, and there are
600 employees in a department.  In Hibernate, each employee
would have its own object, and the
Department.departmentEmployees collection would be a cache entry
with the primary keys of all the Employee objects that belong to
that department.  So in all, you have a total of 600+1=601
objects, and loading this collection therefore results in 601
cache puts.
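
Just to spell out the arithmetic, a toy illustration (the key
format here is invented, not Hibernate's actual cache layout):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration of the 601 puts: one cache entry per
// Employee plus one entry holding the collection's primary
// keys.  The key format is made up, not Hibernate's real
// cache layout.
public class CollectionCachingSketch {
    public static void main(String[] args) {
        Map cache = new HashMap(); // stand-in for the region
        List employeeIds = new ArrayList();

        for (int id = 1; id <= 600; id++) {
            cache.put("Employee#" + id, "employee " + id); // 600 puts
            employeeIds.add(new Integer(id));
        }

        // the collection entry holds only the primary keys
        cache.put("Department#1.departmentEmployees", employeeIds);

        System.out.println("cache puts: " + cache.size()); // 601
    }
}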

Using a lateral cache, this would result in 601 objects being
serialized and sent over the network.  This seems an expensive
price to pay for placing this collection in a distributed cache.

Worse, even if only cache invalidate messages were sent on new
puts, couldn't the following happen:
1) A servlet on Server A loads the collection from the db.  601
invalidate messages are sent to server B.
2) A short time after, a servlet on Server B also loads the
same collection.  Since only invalidations are sent (no
serialized data), this also means getting the data from the db,
and it produces another 601 invalidate messages, now to server A.
3) A short time after that, Server A loads the same collection
as in (1) again.  It would no longer be in the cache, since step
(2) invalidated it, and A would have to go to the database again
to get it!!!
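
A toy simulation makes this ping-pong concrete (plain maps,
nothing JCS-specific; it only mimics steps 1-3 above):

import java.util.HashMap;
import java.util.Map;

// Toy simulation of invalidate-on-put between two peers.  This
// is not JCS code; it only illustrates the steps above.
public class InvalidatePingPong {
    public static void main(String[] args) {
        Map serverA = new HashMap();
        Map serverB = new HashMap();

        // 1) server A loads from the db; the puts invalidate B
        serverA.put("dept.employees", "601 objects");
        serverB.remove("dept.employees");

        // 2) B misses, loads from the db, and invalidates A
        System.out.println("B hit? "
            + serverB.containsKey("dept.employees")); // false
        serverB.put("dept.employees", "601 objects");
        serverA.remove("dept.employees");

        // 3) A, which loaded the data in step 1, now misses too
        System.out.println("A hit? "
            + serverA.containsKey("dept.employees")); // false
    }
}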

In other words, you would end up with a cache where an entry
stays valid only if it is used on one and only one server: if
Server A is the only user of those entries all is well, but if
server B ever tries to use the same data, A's copy gets
invalidated!  This would seem to give the cache very limited
use.

Another concern: say the collection is changed on server A,
which produces 601 messages, and server B starts to process
them, but before all the messages are processed a servlet on
server B reads the collection.  If this happens after only 250
messages have been processed, the servlet gets a collection
with 250 new objects and 350 old ones.  That would be an
unacceptable situation with inconsistent data.  In the employee
example, you would have a collection where 250 employees have
up-to-date info but the rest have stale info.

I am starting to wonder if JCS is the right tool for distributed
caching, at least in my application.  Am I misunderstanding how
all this works?  Perhaps a transactional cache is really needed
here, so there would be no risk of a mixed bag of old and new
data.
  
> It would be madness to use jboss cache in a big
> cluster in locking mode.  Using jgroups for
> distributed locking is not scalable.  

I don't expect my app to ever be on more than a few servers,
maximum 7.  I think this is an acceptable number, though I do
not like the fact that each collection retrieval would result in
all that serialization and network traffic, as I described
above.  My application frequently needs to retrieve collections
of up to 1000 objects.  But I am researching whether I really
should be using JBoss cache.  (By the way, my app has nothing to
do with departments and employees; that was only an example.)

> 
> Also, if you need that degree of data integrity, you
> don't need a cache, you need a database.

True, but part of the reason to use a cache is for speed.  It is
expensive to always have to reload a collection of 600-1000
rows, so it is a perfect candidate for caching, but that is not
acceptable if it is at the cost of having invalid data.

> 
> You write the plugin, send it to us, and I'll put it
> in a plugin jar along with the struts plugin. 

I may do just that, though the code would be pretty much the
same as it is in the JCS plugin in the Hibernate 2.1.x source
tree.  I could convert it to a JCS package and send it.  I'll
try to do this when I get a chance.
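
For what it is worth, the adapter would be thin.  A rough
sketch of its shape (method names are from my memory of the
Hibernate 2.1.x cache plugin, so treat them as approximate):

import org.apache.jcs.JCS;
import org.apache.jcs.access.exception.CacheException;

// Rough sketch of a JCS-backed cache adapter in the spirit of
// the Hibernate 2.1.x JCS plugin.  Method names follow that
// plugin from memory; check them against the real
// Cache/CacheProvider interfaces before relying on this.
public class JCSCacheAdapter {

    private final JCS cache;

    public JCSCacheAdapter(String region) throws CacheException {
        this.cache = JCS.getInstance(region);
    }

    public Object get(Object key) {
        return cache.get(key);
    }

    public void put(Object key, Object value) throws CacheException {
        cache.put(key, value);
    }

    public void remove(Object key) throws CacheException {
        cache.remove(key);
    }

    // Hibernate's SPI also expects lock/unlock; JCS has no
    // distributed locking, so this sketch leaves them as no-ops.
    public void lock(Object key) {
    }

    public void unlock(Object key) {
    }

    public long nextTimestamp() {
        // a real plugin would use Hibernate's timestamper here
        return System.currentTimeMillis();
    }
}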

> 
> You can definitely set different values for different
> servers.

Does anyone know how?  As far as I understand it, server
instances in a Weblogic cluster utilize a cluster-wide JNDI
tree, so how could you configure a separate value for each node?
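
To make sure I understand the startup servlet idea (quoted
below), here is roughly what I picture.  The JNDI name and the
TcpServers property key are guesses on my part:

import java.io.InputStream;
import java.util.Properties;
import javax.naming.InitialContext;
import javax.servlet.http.HttpServlet;
import org.apache.jcs.engine.control.CompositeCacheManager;

// Sketch of the startup-servlet idea: load cache.ccf, override
// the lateral server list with a per-node value from JNDI, and
// configure JCS programmatically.  The JNDI name and the
// property key below are assumptions; check them against your
// cache.ccf and container setup.
public class JCSStartupServlet extends HttpServlet {
    public void init() {
        try {
            // base configuration shipped in the war
            Properties props = new Properties();
            InputStream in =
                getClass().getResourceAsStream("/cache.ccf");
            props.load(in);
            in.close();

            // per-node value bound in the container's JNDI tree
            InitialContext ctx = new InitialContext();
            String servers = (String) ctx
                .lookup("java:comp/env/jcs/tcpServers"); // assumed

            // override the lateral server list for this node
            props.setProperty(
                "jcs.auxiliary.LTCP.attributes.TcpServers", // assumed
                servers);

            CompositeCacheManager mgr =
                CompositeCacheManager.getUnconfiguredInstance();
            mgr.configure(props);
        } catch (Exception e) {
            throw new RuntimeException("JCS configuration failed: " + e);
        }
    }
}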
 
> Either way, I think the remote cache is a better
> option.  It solves your configuration problems, since
> all the local caches can have the same settings.  It
> does require that you run a separate process, but it
> is a better model overall. 

That may be, but I doubt the managers and admins at my company
would allow another process or server to run besides the web
servers.

Thanks for all your help once again.

Daniel

> 
> Aaron
> 
> > Thanks in advance,
> > Daniel
> > 
> > 
> > --- Aaron Smuts <asmuts@yahoo.com> wrote:
> > 
> > > Hi Daniel.
> > > 
> > > A removeAll sends one message, not a message for
> > > each item.  
> > > 
> > > Jgroups is fine, but it is slower than the other
> > > options.  
> > > 
> > > I'd use tcp lateral connections or the remote rmi
> > > server.
> > > 
> > > If you use the tcp lateral, then you need to
> > > specify the servers a cache should connect to.  If
> > > you have 3 servers, A, B, and C, then A should
> > > point to B and C, B should point to A and C, . . .
> > > 
> > > The problem is that this would require that you
> > > have a different war for each server.  
> > > 
> > > There is a solution.  Use JNDI and a startup
> > > servlet.  Set the server list as a value in
> > > application context through the container.  Make a
> > > startup servlet that configures JCS based on a
> > > properties object.  Load the cache.ccf file and
> > > change the values you need.  Then configure JCS
> > > with this, using the CompositeCacheManager. 
> > > 
> > > This way you can deploy the same war to multiple
> > > servers.  
> > > 
> > > 
> > > Aaron
> > > 
> > > --- Daniel Rosenbaum <drdan321-nospam@yahoo.com>
> > > wrote:
> > > 
> > > > Hello,
> > > > 
> > > > JCS seems to have come a long way since about a
> > > > year ago.  Could I assume most of the bugs were
> > > > fixed that the Hibernate project was reporting?
> > > > 
> > > > Anyhow, I am thinking about using JCS for its
> > > > clustered cache capability.  I hope the community
> > > > would not mind giving me a few pointers on how to
> > > > set this up properly, and some insight as to what
> > > > configuration options are best for me.
> > > > 
> > > > I have a web app running on Weblogic that is
> > > > currently clustered on two servers but may go up
> > > > to 5 servers or more.  The data I cache is mostly
> > > > read only but changes occasionally.  I don't care
> > > > so much about sharing data between servers but am
> > > > more concerned about not serving stale data, so I
> > > > would be happy just to send an invalidate message
> > > > to the rest of the servers on an element change
> > > > so they would not serve stale data.
> > > > 
> > > > I am trying to make heads or tails of the docs
> > > > but find them difficult to understand.  As far as
> > > > I could tell though, a lateral cache would suit
> > > > my needs best.  
> > > > 
> > > > 1) Since I am not so concerned about sharing the
> > > > data I figure a remote cache is not so important,
> > > > plus it seems for a remote cache I would need to
> > > > start another process besides the web servers.
> > > > This seems like overkill.  Am I correct though
> > > > that for a lateral cache I would not need to
> > > > start another process?  But if so I am confused
> > > > how the "listener" gets started for a lateral
> > > > cache, what sets up binding of the port etc.  The
> > > > doc was not clear on this.
> > > > 
> > > > 2) I am unclear how I would specify servers in
> > > > the
> 
> === message truncated ===
> 
> 

