cocoon-dev mailing list archives

From "Gerhard Froehlich" <g-froehl...@gmx.de>
Subject RE: Adding Resource Monitor to Generators
Date Mon, 10 Dec 2001 15:57:52 GMT
>From: Sylvain Wallez [mailto:sylvain.wallez@anyware-tech.com]
>
>Gerhard Froehlich wrote:
>>
>> Gerhard,
>> >From: Sylvain Wallez [mailto:sylvain.wallez@anyware-tech.com]
>> >
>> >Gerhard Froehlich wrote:
>> >>
>> >> Hi,
>> >> I started to implement the Resource Monitor into
>> >> the caching process.
>> >>
>> >> One place is the Generators (FileGenerator, etc.), so
>> >> I have to look up the Monitor component.
>> >> Where shall I place the contextualize method in the Generator
>> >> derivation tree:
>> >>
>> >> ComposerGenerator
>> >>   AbstractGenerator
>> >>     AbstractXMLProducer
>> >>
>> >> ?
>> >>
>> >> I ask, 'cause I don't want to rain on somebody's "Generator" party ;)
>> >>
>> >> Cheers
>> >> Gerhard
>> >>
>> >Gerhard,
>> >
>> >I'm happy you've started working on Monitors: reducing filesystem lookups
>> >is a must-do to increase performance. But I have a few questions about
>> >the way to introduce them into the engine.
>> >
>> >I haven't gone deep into Monitors, but is it good to use an active
>> >monitor? From what I understand, an active monitor periodically scans
>> >all its resources (every 10 secs in cocoon.xconf).
>>
>> That's correct.
>>
>> >This means that every 10 secs, Cocoon will scan *each and every* file
>> >monitored since the engine startup, even those that are infrequently
>> >used. I'm afraid this will be worse than what we have today on large
>> >sites... but tell me if I'm wrong!
>>
>> Ok I will quote Berin as answer:
>> "...That way during extreme load conditions the number of times we call
>> the "lastModified" method doesn't change. Instead of 1/request
>> (with 200 simultaneous users requesting 4 pages a second that comes to
>> 800 calls a second) it is once per period of time.  Even at one second,
>> you have called "lastModified" 1/800th of the time using the aforementioned
>> example. It defaults to once per minute which is 1/24000th of the time (that
>> is 2400000% decrease in calls..."
>>
>> IMHO that sounds reasonable.
>
>Sure this is reasonable, and it clearly shows the need to avoid
>systematic calls to the filesystem.
>
>But here's another use case: suppose you have a portal site where 10%
>of the pages make up 90% of the requests, but where 100% of the pages are
>monitored, because they have been visited at least once since server
>startup. I'm not good at statistics, but let's suppose this results in
>only 15% of all pages being requested in a monitor scan period.
>
>This means that at each refresh period, ActiveMonitor will scan 100% of
>the pages when only 15% really need to be scanned: 85% of the
>File.getLastModified() calls are useless! This is what I wanted to point out.

I think you're concentrating too much on the number of actual files. But the
problem we are talking about is high system load caused by many requests,
maybe 1000 calls per second. From this viewpoint it doesn't matter if you have
one page or a thousand: that is 1000 File.getLastModified() calls per second.
With the files monitored by a component like the Resource Monitor, it would be
500 calls per 8 hours, assuming you have 500 files in the Resource Monitor.
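
To put rough numbers on this, here is a small sketch of the arithmetic. It
assumes the figures from this message (1000 requests per second, 500 monitored
files) and treats the 8 hours as the monitor's scan period, which is only a
reading of the numbers above and not a configured value. The class and method
names are made up for illustration.

```java
class CallCountEstimate {
    // Lookups when every request checks the file: one call per request.
    static long perRequestCalls(long requestsPerSecond, long seconds) {
        return requestsPerSecond * seconds;
    }

    // Lookups when a monitor scans every registered file once per period.
    static long monitorCalls(long monitoredFiles, long seconds, long periodSeconds) {
        return monitoredFiles * (seconds / periodSeconds);
    }

    public static void main(String[] args) {
        long window = 8 * 60 * 60; // 8 hours in seconds (assumed scan period)
        System.out.println(perRequestCalls(1000, window));     // 28800000
        System.out.println(monitorCalls(500, window, window)); // 500
    }
}
```

With the 10-second period mentioned for cocoon.xconf, monitorCalls(500, window, 10)
gives 1440000 for the same 8 hours, still below the per-request figure.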

>> >It seems to me that the main benefit of ActiveMonitor is for resources
>> >that are systematically checked at each and every request: IMO, this
>> >should be limited to configuration files and sitemaps.
>>
>> Yes, and the Generators are also checked systematically. On every request
>> the CachingStreamPipeline validates TimeStampCacheValidity. TimeStampCacheValidity
>> is set e.g. by the FileGenerator, which calls getLastModified() on every
>> request. TimeStampCacheValidity signals whether a Source (e.g. sample.xml)
>> has changed or not.
>>
>> >For less frequently used resources, wouldn't it be a better solution to
>> >only call getLastModified() when the resource is actually used and the
>> >time since the last call to getLastModified() is greater than the
>> >refresh period? This would be a kind of buffering in front of the
>> >filesystem. Also, can't this be integrated directly into Source?
>>
>> Hmm I don't get you here ;).
>
>Ok, maybe some code will be more clear ;)
>
>public class FileResource ... {
>  private File file;
>  private long cachedLastModified = 0;
>  private long nextCheckTime = 0;
>  private long refreshPeriod = 10000; // configurable
>
>  public FileResource(File file) {
>    this.file = file;
>    refresh();
>  }
>
>  public long getLastModified() {
>    if (System.currentTimeMillis() > nextCheckTime) {
>      refresh();
>    }
>    return cachedLastModified;
>  }
>
>  private void refresh() {
>    nextCheckTime = System.currentTimeMillis() + refreshPeriod;
>    cachedLastModified = file.lastModified();
>  }
>}
>
>This ensures the LastModified information isn't older than the refresh
>period, and that (this is what I wanted to explain) a refresh occurs
>only when the information is actually requested.
>
>If a FileResource is used once a week, File.lastModified() will be
>called only once a week even if refreshPeriod is 1 hour, while in the
>same conditions ActiveMonitor will call it 24*7 = 168 times !

See my comments above. When you have 1000 requests a second, then you call
getLastModified() 1000 times at nextCheckTime.
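
For what it's worth, the two counts in play here (calls to the wrapper versus
calls that reach the filesystem) can be separated in a small sketch. It follows
the FileResource idea quoted above, but the class name, the counters, the
injected clock and source, and the synchronized keyword are all assumptions
added for illustration. It only models a single thread; without synchronization,
several concurrent requests could pass the time check at once.

```java
import java.util.function.LongSupplier;

class LazyTimestamp {
    private final LongSupplier source; // stands in for File.lastModified()
    private final LongSupplier clock;  // stands in for System.currentTimeMillis()
    private final long refreshPeriod;
    private long cached;
    private long nextCheckTime;
    int wrapperCalls = 0; // how often getLastModified() was asked
    int sourceCalls = 0;  // how often the underlying lookup actually ran

    LazyTimestamp(LongSupplier source, LongSupplier clock, long refreshPeriod) {
        this.source = source;
        this.clock = clock;
        this.refreshPeriod = refreshPeriod;
        refresh(); // initial lookup, as in the quoted constructor
    }

    synchronized long getLastModified() {
        wrapperCalls++;
        if (clock.getAsLong() > nextCheckTime) {
            refresh();
        }
        return cached;
    }

    private void refresh() {
        nextCheckTime = clock.getAsLong() + refreshPeriod;
        cached = source.getAsLong();
        sourceCalls++;
    }

    public static void main(String[] args) {
        long[] now = {0L};
        LazyTimestamp t = new LazyTimestamp(() -> 42L, () -> now[0], 10000L);
        now[0] = 10001L; // just past the refresh period
        for (int i = 0; i < 1000; i++) {
            t.getLastModified();
        }
        System.out.println(t.wrapperCalls + " wrapper calls, " + t.sourceCalls + " lookups");
    }
}
```

In this single-threaded run the wrapper is asked 1000 times, while the
underlying lookup runs twice: once in the constructor and once after the
period expires.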

>Note also that the above algorithm can really easily be integrated into
>AbstractSource.
>
>Thoughts ?
>
>> >Last point : your changes in ProgramGenerator make the assumption that
>> >sources are files. This won't be true in unexpanded war files and will
>> >very likely break the engine ;)
>>
>> But how do the FileGenerator and its relatives work when they call
>> getLastModified() on the Source in this case?
>
>Using URLConnection.getLastModified(), whose capabilities depend heavily on
>the protocol handler.

Aha I see
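
For reference, a minimal sketch of what that looks like with the plain JDK API.
URLConnection.getLastModified() returns 0 when the handler cannot tell, so
callers have to treat 0 as "unknown". The class name and the temp file are
only for illustration; this is not the Cocoon Source code.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

class UrlLastModified {
    // Returns the timestamp reported by the protocol handler, or 0 if unknown.
    static long lastModified(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        return connection.getLastModified();
    }

    public static void main(String[] args) throws IOException {
        // A file: URL is used here because its handler can read the timestamp.
        File tmp = File.createTempFile("sample", ".xml");
        tmp.deleteOnExit();
        System.out.println(lastModified(tmp.toURI().toURL()));
    }
}
```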

  Gerhard


"Eagles may soar, but weasels don't get
sucked into jet engines.
(Todd C. Somers)"




