cocoon-dev mailing list archives

From Sylvain Wallez <sylvain.wal...@anyware-tech.com>
Subject Re: Adding Resource Monitor to Generators
Date Mon, 10 Dec 2001 15:43:50 GMT


Gerhard Froehlich wrote:
> 
> Gerhard,
> >From: Sylvain Wallez [mailto:sylvain.wallez@anyware-tech.com]
> >
> >Gerhard Froehlich wrote:
> >>
> >> Hi,
> >> I started to implement the Resource Monitor into
> >> the caching process.
> >>
> >> One such place is the Generators (FileGenerator, etc.), so
> >> I have to look up the Monitor component.
> >> Where shall I place the contextualize method in the Generator
> >> derivation tree:
> >>
> >> ComposerGenerator
> >>   AbstractGenerator
> >>     AbstractXMLProducer
> >>
> >> ?
> >>
> >> I ask 'cause I don't want to rain on somebody's "Generator" party ;)
> >>
> >> Cheers
> >> Gerhard
> >>
> >Gerhard,
> >
> >I'm happy you've started working on Monitors: reducing filesystem lookups
> >is a must to increase performance. But I have a few questions about the
> >way to introduce them into the engine.
> >
> >I haven't gone deep into Monitors, but is it good to use an active monitor?
> >From what I understand, an active monitor periodically scans (every 10 secs
> >in cocoon.xconf) all its resources.
> 
> That's correct.
> 
> >This means that every 10 secs, Cocoon will scan *each and every* file
> >monitored since engine startup, even those that are infrequently
> >used. I'm afraid this will be worse than what we have today on large
> >sites... but tell me if I'm wrong!
> 
> Ok I will quote Berin as answer:
> "...That way during extreme load conditions the number of times we call
> the "lastModified" method doesn't change. Instead of 1/request
> (with 200 simultaneous users requesting 4 pages a second that comes to
> 800 calls a second) it is once per period of time.  Even at one second,
> you have called "lastModified" 1/800th of the time using the afforementioned
> example. It defaults to once per minute which is 1/24000th of the time (that
> is 2400000% decrease in calls..."
> 
> IMHO that sounds reasonable.

Sure this is reasonable, and it clearly shows the need to avoid
systematic calls to the filesystem.

But here's another use case: suppose you have a portal site where 10%
of the pages make up 90% of the requests, but where 100% of the pages are
monitored, because they have been visited at least once since server
startup. I'm not good at statistics, but let's suppose this results in
only 15% of all pages being requested within a monitor scan period.

This means that at each refresh period, ActiveMonitor will scan 100% of
the pages when only 15% really need to be scanned: 85% of the
File.lastModified() calls are useless! This is what I wanted to point out.

> >It seems to me that the main benefit of ActiveMonitor is for resources
> >that are systematically checked at each and every request : IMO, this
> >should be limited to configuration files and sitemaps.
> 
> Yes, and the Generators are also checked systematically. On every request
> the CachingStreamPipeline validates TimeStampCacheValidity. TimeStampCacheValidity
> is set e.g. by the FileGenerator, which calls getLastModified() on every
> request. TimeStampCacheValidity signals whether a Source (e.g. sample.xml)
> has changed or not.
> 
> >For less-frequently used resources, wouldn't it be a better solution to
> >only call getLastModified() when the resource is actually used and the
> >time since the last call to getLastModified() is greater than the
> >refresh period ? This would be a kind of buffering in front of the
> >filesystem. Also, can't this be integrated directly in Source ?
> 
> Hmm I don't get you here ;).

Ok, maybe some code will be clearer ;)

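import java.io.File;

// "..." stands for whatever the class extends or implements (left open here)
public class FileResource /* ... */ {
  private File file;
  private long cachedLastModified = 0;
  private long nextCheckTime = 0;
  private long refreshPeriod = 10000; // milliseconds, configurable

  public FileResource(File file) {
    this.file = file;
    refresh();
  }

  // Hits the filesystem only if the cached value is older than refreshPeriod
  public long getLastModified() {
    if (System.currentTimeMillis() > nextCheckTime) {
      refresh();
    }
    return cachedLastModified;
  }

  // Re-reads the timestamp and schedules the next allowed check
  private void refresh() {
    nextCheckTime = System.currentTimeMillis() + refreshPeriod;
    cachedLastModified = file.lastModified();
  }
}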

This ensures the LastModified information is never older than the refresh
period, and that (this is what I wanted to explain) a refresh occurs
only when the information is actually requested.

If a FileResource is used once a week, File.lastModified() will be
called only once a week even if refreshPeriod is 1 hour, while in the
same conditions ActiveMonitor will call it 24*7 = 168 times !
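
To make that comparison concrete, here is a minimal sketch of the same
lazy-refresh idea with an injectable clock, so that a week can be simulated
without waiting for one. Everything in it (LazyTimestamp, the two
LongSupplier parameters, the demo methods) is a hypothetical name made up
for illustration, not Cocoon or Avalon API, and it assumes a modern Java
with lambdas:

```java
import java.util.function.LongSupplier;

// Hypothetical illustration class, not part of Cocoon or Avalon.
class LazyTimestamp {
    static final long HOUR = 3_600_000L; // one hour in milliseconds

    private final LongSupplier source;   // stands in for File.lastModified()
    private final LongSupplier clock;    // stands in for System.currentTimeMillis()
    private final long refreshPeriod;
    private long cached;
    private long nextCheckTime;
    private int sourceCalls = 0;         // how many times the "filesystem" was hit

    LazyTimestamp(LongSupplier source, LongSupplier clock, long refreshPeriod) {
        this.source = source;
        this.clock = clock;
        this.refreshPeriod = refreshPeriod;
        refresh();
    }

    long getLastModified() {
        if (clock.getAsLong() > nextCheckTime) {
            refresh();
        }
        return cached;
    }

    int getSourceCalls() {
        return sourceCalls;
    }

    private void refresh() {
        nextCheckTime = clock.getAsLong() + refreshPeriod;
        cached = source.getAsLong();
        sourceCalls++;
    }

    // One week passes, then the resource is read twice in a row.
    // Returns the number of source calls made after construction.
    static int lazyCallsForOneWeek() {
        long[] now = {0L};
        LazyTimestamp t = new LazyTimestamp(() -> 42L, () -> now[0], HOUR);
        int afterConstruction = t.getSourceCalls();
        now[0] = 7 * 24 * HOUR;
        t.getLastModified(); // stale, so this one refreshes
        t.getLastModified(); // still fresh, served from the cached value
        return t.getSourceCalls() - afterConstruction;
    }

    // A periodic scanner with the same period checks once per hour, used or not.
    static long activeScansForOneWeek() {
        return (7 * 24 * HOUR) / HOUR;
    }

    public static void main(String[] args) {
        System.out.println("lazy calls:   " + lazyCallsForOneWeek());
        System.out.println("active scans: " + activeScansForOneWeek());
    }
}
```

With a one-hour refresh period, the reads at the end of the week cost a
single source call after construction, while a scanner ticking once per
hour would have made 168.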

Note also that the above algorithm can really easily be integrated into
AbstractSource.

Thoughts ?

> >Last point : your changes in ProgramGenerator make the assumption that
> >sources are files. This won't be true in unexpanded war files and will
> >very likely break the engine ;)
> 
> But how do the FileGenerator and its relatives work when they call
> getLastModified() on the Source in this case?

Using URLConnection.getLastModified(), whose capabilities depend heavily on
the protocol handler.
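
As a rough sketch of what that looks like with the plain JDK (the class
and method names below are made up for illustration, and this is not the
actual Source code in Cocoon): getLastModified() returns 0 when the
handler cannot tell, so callers have to treat 0 as "unknown".

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper, named for illustration only.
class UrlLastModified {

    // Asks the protocol handler for the last-modified time.
    // A result of 0 means the handler does not know.
    static long lastModified(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        try {
            return conn.getLastModified();
        } finally {
            // Some handlers open a stream while fetching headers; release it.
            try (InputStream in = conn.getInputStream()) {
                // nothing to read, we only want the stream closed
            } catch (IOException ignored) {
                // the timestamp was already obtained (or is unknown)
            }
        }
    }

    // Tries the helper on a temporary file reached through a file: URL.
    static long demoOnTempFile() {
        try {
            File tmp = File.createTempFile("sample", ".xml");
            tmp.deleteOnExit();
            return lastModified(tmp.toURI().toURL());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("last modified via URL: " + demoOnTempFile());
    }
}
```

A file: URL gives a usable timestamp here; other protocols may well
return 0, which is why code that assumes a File behind every source is
fragile.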

> Cheers
> Gerhard

Sylvain.
-- 
Sylvain Wallez
Anyware Technologies - http://www.anyware-tech.com


