perl-modperl mailing list archives

From Jonathan Vanasco <jvana...@2xlp.com>
Subject Re: Global question
Date Sat, 19 May 2007 22:53:12 GMT

my .02¢

	•  ldap would be silly unless you're clustering -- most implementations use bdb as their backend
	•  bdb and Cache::FastMmap would make more sense if you're on 1 machine

also

i think your hash system may be worth rethinking...

you have:
	$CACHE_1{id}='foo'
	$CACHE_2{ida}{idb}='bar'
	
which limits you to thinking in terms of perl hash structures...

if you emulate that with flattened keys
	cache_1_[\d+]
	cache_2_[\w+]_[\w+]

then you have a lot more options.  in a clustered system, you have  
memcached or a dedicated mysql / whatever daemon
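the flattening itself is a one-liner to sketch (the helper and key names below are purely illustrative, not from any particular module):

```perl
#!/usr/bin/perl
# minimal sketch of flattening nested hash lookups into single
# string keys -- the key format is just illustrative
use strict;
use warnings;

sub cache_key {
    my ( $prefix, @parts ) = @_;
    return join( '_', $prefix, @parts );
}

# instead of $CACHE_1{$id} and $CACHE_2{$ida}{$idb} ...
my $k1 = cache_key( 'cache_1', 42 );             # "cache_1_42"
my $k2 = cache_key( 'cache_2', 'ida', 'idb' );   # "cache_2_ida_idb"

# those flat strings drop straight into memcached, a db table,
# a file name -- anything keyed by a single scalar
print "$k1\n$k2\n";
```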
	
one of my projects, RoadSound, is a collaborative content management system where each 'view' into a set of data is from the independent perspective of each relevant entity and content manager. loosely translated -- to display the most basic details of a concert, i need to do four 15-20 table joins in postgres -- and I need to do & store that separately for each artist / venue / whatever involved.  in order to offload the db, i store everything in memcached as I generate it, with a key like """show_%(id)s_%(perspective_type_id)s_%(perspective_owner_id)s""".  it doesn't perform nearly as fast as using shared memory, but it offloads A TON of work from my db and works across multiple machines.
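that generate-then-store pattern can be sketched like this -- a plain hash stands in for the memcached client so the example is self-contained, and the "query" is faked; with the real Cache::Memcached module you'd call $memd->get / $memd->set instead:

```perl
#!/usr/bin/perl
# sketch of the cache-aside pattern: check the cache, and on a miss
# run the expensive query and store the result under a flat key.
# %memd stands in for a memcached client so this runs standalone.
use strict;
use warnings;

my %memd;        # stand-in for memcached
my $db_hits = 0; # counts how often we actually hit the "db"

sub expensive_db_query {    # stands in for the 15-20 table join
    my ( $show_id, $ptype, $powner ) = @_;
    $db_hits++;
    return "rendered show $show_id for $ptype/$powner";
}

sub get_show_view {
    my ( $show_id, $ptype, $powner ) = @_;
    my $key = "show_${show_id}_${ptype}_${powner}";
    # //= runs the query only when the key is missing (perl 5.10+)
    return $memd{$key} //= expensive_db_query( $show_id, $ptype, $powner );
}

get_show_view( 7, 2, 99 );    # miss: hits the db
get_show_view( 7, 2, 99 );    # hit: served from cache
print "db hits: $db_hits\n";  # 1
```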

the only issue with this approach would be clearing out the 'y' level in this model: $CACHE_2{$y}{$z}.  i don't know if that is a concern for you or not, but it could create issues.
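one common workaround for clearing a whole 'y' level -- not something claimed above, just a sketch -- is to put a per-y generation number in the key and bump it, which orphans every z under that y at once (the stale entries just expire on their own):

```perl
#!/usr/bin/perl
# sketch of generation-numbered keys: clearing a whole 'y' level
# by bumping a counter instead of deleting keys one by one.
# %cache stands in for memcached so this runs standalone.
use strict;
use warnings;

my %cache;        # stand-in for memcached
my %generation;   # per-$y generation counters

sub key_for {
    my ( $y, $z ) = @_;
    my $gen = $generation{$y} // 0;
    return "cache_2_${y}_g${gen}_${z}";
}

sub clear_y {
    my ($y) = @_;
    $generation{$y}++;    # old keys become unreachable
}

$cache{ key_for( 'artist', 'bio' ) } = 'old';
clear_y('artist');
# after the bump the lookup misses -- the stale entry is orphaned
my $hit = exists $cache{ key_for( 'artist', 'bio' ) } ? 1 : 0;
print "hit: $hit\n";    # 0
```

in a real deployment the %generation counters would live in memcached too, so every web node sees the bump.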

also -- depending on your current performance, you might be able to just use mysql as well.  you could conceivably do something that takes advantage of the speed of MEMORY or MyISAM tables and select query caching.  while that wouldn't be as fast as using memory alone, it clusters.



On May 19, 2007, at 6:13 PM, Will Fould wrote:

> Thanks a lot Perrin -
>
> I really like the current method (if it were to stay on 1 machine
> and not grow). Caching per child has not really been a problem once
> I got beyond the emotional hangup of what seemed to be a duplicative
> waste of memory.  I am totally amazed at how fast and efficient using
> mod_perl in this way has been. The hash-building queries issued by
> the children are very simple selects, but the data provided by (and
> cached within) them is used in so many ways throughout the session
> that not having them would require extra joins in multiple places
> and queries in other places that are currently not needed at all
> (i.e. collaborative environment, ACLs, etc.).  To be clear, the
> hashes are not only for quick de-normalizing; they serve a vital
> caching function.
>
> The problem is that I am now moving the database off localhost and
> configuring a second web node.
>
> > what it is that you don't like about your current method.
>
> I'm afraid that:
>    1. hashes get really big (greater than a few MBs each)
>    2. re-caching the entire hash just b/c 1 key updated (waste).
>    3. latency for pulling cache data from remote DB.
>    4. doing this for all children.
>
> For now, what seems like the 'holy grail' (*) is to cache a
> last_modified for each type (available to the cluster, say through
> memcached) in a way that indicates which parts of the cache (which
> keys of each hash) the children need to update/delete, such that a
> child will rarely, if ever, need to re-fetch everything -- it can
> query for just those keys and directly modify its own hashes
> accordingly to keep current.
>
> (*) I'm not too clear about this, but it seems like the real
> 'holy grail' would be to do this within apache in a scoreboard-like way.
>
> -w
>
>
> On 5/19/07, Perrin Harkins <perrin@elem.com> wrote:
> On 5/19/07, Will Fould <willfould@gmail.com> wrote:
> > Here's the situation:  We have a fully normalized relational
> > database (mysql) now being accessed by a web application, and to
> > save a lot of complex joins each time we grab rows from the
> > database, I currently load and cache a few simple hashes (1-10MB)
> > in each apache process with the corresponding lookup data
>
> Are you certain this is saving you all that much, compared to just
> doing the joins?  With proper indexes, joins are fast.  It could be a
> win to do them yourself, but it depends greatly on how much of the
> data you end up displaying before the lookup tables change and have to
> be re-fetched.
>
> > Is anyone doing something similar? I'm wondering if implementing
> > a BerkeleyDB or another slave store on each web node with a tied
> > hash (or something similar) is feasible, and if not, what a better
> > solution might be.
>
> Well, first of all, I wouldn't feed a tied hash to my neighbor's dog.
> It's slower than method calls, and more confusing.
>
> There are lots of things you could do here, but it's not clear to me
> what it is that you don't like about your current method.  Is it that
> when the database changes you have to do heavy queries from every
> child process?  That also kills any sharing of the data.  Do you have
> more than one server, or expect to soon?
>
> - Perrin
>

// Jonathan Vanasco

| - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
| SyndiClick.com
| - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|      FindMeOn.com - The cure for Multiple Web Personality Disorder
|      Web Identity Management and 3D Social Networking
| - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
|      RoadSound.com - Tools For Bands, Stuff For Fans
|      Collaborative Online Management And Syndication Tools
| - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -


