perl-modperl mailing list archives

From Tina Mueller <apa...@s05.tinita.de>
Subject Re: Ways to scale a mod_perl site
Date Fri, 18 Sep 2009 08:42:30 GMT
On Wed, 16 Sep 2009, Igor Chudov wrote:

> On Wed, Sep 16, 2009 at 11:05 AM, Michael Peters <mpeters@plusthree.com>wrote:
>
>> Reducing DB usage is more important than this. Also, before you go down
>> that road you should look at adding a caching layer to your application
>> (memcached is a popular choice).
>>
>>
> It is not going to be that helpful due to dynamic content. (which is my
> site's advantage). But this may be useful for other applications.

That's a common misconception, I think. Even if a website is completely
dynamic, you can still cache things.
First misunderstanding: people assume caching means caching a whole HTML
page. In memcached you typically cache data structures, not rendered pages.
As an example I will take my portal software. It has a forum, a blog and
a guestbook; it has a list of users who are online; the forum has a
"posts from the last 24 hours" page; and there are other things that are
shown with every request (new private messages, notes for moderators
about new forum threads, ...)

Now, should I fetch the online users from the database with every request?
Should I fetch all the threads and authors of the last 24 hours whenever
somebody requests that page, even if nothing has changed?

First answer: the list of online users can be cached for, say, 1 minute.
Nobody will care or even notice. Only when someone logs in do you expire
the entry in memcached explicitly. No other change is so important that
it cannot wait a single minute.

Make that 1 page view per second and you save 59 database requests per
minute: 60 requests become a single cache refill.


Second answer: the list of recent posts can be cached for, let's say, 3 minutes.
The entry in memcached is only expired explicitly, when somebody posts a new
thread or reply, a title is changed, etc.

I believe *every* application has things like this that you can cache.
I know of a website that reduced its load dramatically by using
memcached. It's quite a big website (300 million page views per month,
mostly requests from German-speaking countries).
But some people are reluctant to use memcached. One person asked me,
"what if storing the data in memcached is more work than fetching it
from the database every time?" I don't know what to say to that except:
try it out. I know from that company's example that it pays off.
Did you know Facebook uses memcached very, very extensively?
If you're not sure: analyse your website usage. What kind of data
is fetched, and how often? Build a test case, add memcached for that
data, and see which is faster.
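Such a test case can be tiny. A sketch, again in Python for brevity: `fetch_from_db` is a stand-in (the `sleep` simulates query latency; substitute your real DBI call), and the measurement loop compares the uncached and cached paths directly:

```python
import time

def fetch_from_db():
    time.sleep(0.005)      # stand-in for a real query; use your DBI call here
    return {"threads": 42}

cache = {}

def fetch_with_cache():
    if "key" not in cache:
        cache["key"] = fetch_from_db()  # only the first call hits the "DB"
    return cache["key"]

def measure(fn, n=200):
    start = time.perf_counter()
    for _ in range(n):
        fn()
    return time.perf_counter() - start

db_time = measure(fetch_from_db)
cached_time = measure(fetch_with_cache)
print(f"db: {db_time:.3f}s  cached: {cached_time:.3f}s")
```

Run it against your actual queries and your actual memcached instance; the numbers, not an argument, settle the "is it more work?" question.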


Another thing you could do is separate your database schema.
Some tables do not connect to others. For example, my portal software
is modular, so you can activate/deactivate certain modules.
The easiest thing was to just create one (DBIx::Class) schema per
module. Of course, what connects all of these is the user schema, and
because I can no longer join to the user tables I might need an extra
request here and there. But I can separate all these schemas and put
each on its own database server. With every request, you only need a
part of all the schemas.
Typically the highest load is on the database, so splitting the db across
several servers like this might be an option.
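The trade-off (one connection per module schema, no cross-schema joins, an extra lookup instead) can be sketched with in-memory SQLite databases. The table layout here is invented for illustration; the point is that each module talks to its own connection, which could just as well point at a different server:

```python
import sqlite3

# One connection per module schema; in production each could live on
# its own database server.
forum_db = sqlite3.connect(":memory:")
users_db = sqlite3.connect(":memory:")

users_db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
users_db.execute("INSERT INTO users VALUES (1, 'tina')")

forum_db.execute(
    "CREATE TABLE threads (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT)")
forum_db.execute(
    "INSERT INTO threads VALUES (1, 1, 'Ways to scale a mod_perl site')")

# No cross-database join possible: fetch the thread first, then resolve
# the author with a second, separate request against the user schema.
author_id, title = forum_db.execute(
    "SELECT author_id, title FROM threads WHERE id = 1").fetchone()
(name,) = users_db.execute(
    "SELECT name FROM users WHERE id = ?", (author_id,)).fetchone()
```

The second query is the "request more here and there"; in exchange, the forum tables and the user tables no longer have to share a server.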

And last but not least: for searching the database, use a search engine.
KinoSearch works quite well, and there are other search engines for Perl too.
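The reason a search engine beats SQL `LIKE '%term%'` scans is the inverted index: one lookup per term instead of a full table scan. A toy version, with an invented three-document corpus (a real engine like KinoSearch adds tokenisation, stemming, and ranking on top of this core structure):

```python
from collections import defaultdict

# Toy inverted index: term -> set of document ids containing that term.
docs = {
    1: "ways to scale a mod_perl site",
    2: "memcached reduces database load",
    3: "scale out with several database servers",
}

index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        index[term].add(doc_id)

def search(term):
    # One dictionary lookup, regardless of how many documents exist.
    return sorted(index.get(term, set()))
```

Building the index costs time at write, but every query afterwards avoids touching the database rows at all.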

regards,
tina
