perl-modperl mailing list archives

From "Alex Beamish" <tal...@gmail.com>
Subject Re: Forking to an interactive program under mod_perl
Date Wed, 13 Dec 2006 16:25:23 GMT
On 12/13/06, Robert Landrum <rlandrum@aol.net> wrote:
>
> Alex Beamish wrote:
> > I'll deal with multiple documents with some combination of stale timers
> > and LRU slots, but that's not really what I see as the most complicated
> > or difficult part of this problem. For this particular application, my
> > inactivity timer will probably be 10-15 minutes, and I'll expect to have
> > 6-8 documents open at any given time, so it shouldn't be a big drain on
> > memory. And I will probably be able to set something up that signals
> > that a document has been expired as well .. (this is just me thinking
> > out loud) ..
> >
> > Thanks for your feedback .. I think named pipes is my next focus.
> >
>
> Sockets will be the way to go on this, rather than pipes.  I don't want
> to say pipes are a dead technology, but by using sockets, you can move
> your gs application off your web servers (if load ever gets that high)
> without having to rewrite any code, something that isn't possible with a
> pipe (not without netcat, anyway).


I'll think about sockets, thanks.
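
If I do go that route, I imagine the mod_perl side ends up as a thin client
along these lines -- the host name, port and wire format here are just
placeholders to show the shape of it, not anything we've actually built:

    use strict;
    use warnings;
    use IO::Socket::INET;

    sub fetch_page_image {
        my ( $doc_id, $page ) = @_;

        # Placeholder host/port -- could be localhost today, a dedicated
        # Ghostscript box tomorrow, with no change to the handler.
        my $sock = IO::Socket::INET->new(
            PeerAddr => 'gs-backend.example.com',
            PeerPort => 9100,
            Proto    => 'tcp',
            Timeout  => 30,
        ) or die "Can't reach page-image daemon: $!";

        # One request per line; the daemon answers with a length header
        # followed by the raw PNG bytes.
        print $sock "RENDER $doc_id $page\n";
        my $header = <$sock>;
        my ($len) = $header =~ /^OK (\d+)/ or die "Bad response: $header";

        read( $sock, my $png, $len ) == $len or die "Short read from daemon";
        close $sock;
        return $png;
    }

Which, as you point out, would let the Ghostscript side move off the web
servers later without touching any of this.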

> There's probably a good reason for it, but why not just pre-generate all
> of your page images?  Even when new documents are added, a cron could be
> set up to come along (once a minute even) and convert those PDFs to
> images.  Are these PDFs dynamically generated?


(The drawbacks of providing too little information .. or too much.)

The current solution is to generate high-resolution page images for the
documents in question, and then resize them on the fly using mod_perl.
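
Stripped down, the resize handler is doing something like the following --
written against the mod_perl 2 API here just for concreteness; the package
name, directory layout and query parameters are illustrative, not lifted
from our code, and there's no input checking:

    package My::PageImage;

    use strict;
    use warnings;
    use Apache2::RequestRec ();
    use Apache2::RequestIO  ();
    use Apache2::Const -compile => qw(OK NOT_FOUND);
    use Image::Magick;

    sub handler {
        my $r = shift;

        # e.g. /page-image?doc=1234&page=17&width=800
        my %args  = map { split /=/, $_, 2 } split /&/, ( $r->args || '' );
        my $file  = "/data/pages/$args{doc}/$args{page}.png";  # high-res master
        my $width = $args{width} || 800;

        return Apache2::Const::NOT_FOUND unless -e $file;

        my $img = Image::Magick->new;
        $img->Read($file);
        $img->Resize( geometry => $width . 'x' );   # scale down, keep aspect

        $r->content_type('image/png');
        $r->print( $img->ImageToBlob( magick => 'png' ) );

        return Apache2::Const::OK;
    }

    1;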

This solution works well locally, but the problem is that we also have a
satellite office, and the VPN between the offices is not that great -- we
get between 50K and 100K in bandwidth. When we're dealing with tens of
thousands of pages, that's a fair bit of data to pass back and forth and to
store. I currently have a daemon process that takes care of asynchronously
rsyncing the page images to the satellite office.
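
The daemon itself is not very exciting -- at its core it just shells out to
rsync, roughly like this (host name and paths invented for the example):

    # Push the high-res page images to the satellite office; -z helps a
    # little over the slow VPN link, --partial lets interrupted transfers
    # resume rather than start over.
    system( 'rsync', '-az', '--partial',
            '/data/pages/', 'satellite.example.com:/data/pages/' ) == 0
        or warn "rsync to satellite office failed: $?";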

A better solution would be to generate the page images as needed, by using
mod_perl again, but this time running Ghostscript on the document.
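
The obvious way to do that is a one-shot Ghostscript run per page, along the
lines of the sketch below (paths and resolution are examples only), but that
pays the full startup and document-open cost on every single page request,
which is what I'd like to avoid:

    use strict;
    use warnings;

    sub render_page_once {
        my ( $pdf, $page, $out ) = @_;

        my @cmd = (
            'gs', '-q', '-dBATCH', '-dNOPAUSE',
            '-sDEVICE=png16m', '-r150',
            "-dFirstPage=$page", "-dLastPage=$page",
            "-sOutputFile=$out",
            $pdf,
        );
        system(@cmd) == 0 or die "gs failed: $?";
    }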

The problem is that Ghostscript is an interactive program, so we need two-way
communication with it. I already have technology that very efficiently
spawns programs using a double fork and directs stdout and stderr into files
(I use that to do the document rsyncing), but that's no good for this
application -- I need to start a Ghostscript session for a document, and
keep it running so that as each page image request comes in, I can just
forward it to the appropriate session.
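
For reference, the existing launcher boils down to a double fork with the
output streams redirected -- simplified here, and the sub name is just for
the example:

    use strict;
    use warnings;
    use POSIX qw(setsid _exit);

    sub spawn_detached {
        my ( $logfile, @cmd ) = @_;

        defined( my $pid = fork() ) or die "first fork failed: $!";
        if ( $pid == 0 ) {                      # child
            setsid();                           # new session, no controlling tty
            defined( my $pid2 = fork() ) or _exit(1);
            _exit(0) if $pid2;                  # child exits, grandchild detaches

            # The grandchild is reparented to init, so the web server never
            # has a zombie to reap; stdout/stderr land in a log file.
            open STDOUT, '>>', $logfile   or _exit(1);
            open STDERR, '>&', \*STDOUT   or _exit(1);
            exec @cmd or _exit(1);
        }
        waitpid( $pid, 0 );                     # reap the short-lived child
    }

It's strictly fire-and-forget: once the grandchild is exec'd there's no
channel back to it, which is fine for the rsync jobs but no use for a
long-lived Ghostscript session.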

And I don't want to have to start up Ghostscript fresh for each session -- I
can afford some memory overhead as long as I can make a request and get a
response from a live Ghostscript session. Thanks to the discussion that my
original post has engendered, I think the answer may lie in using
pseudo-ttys, and I'm going to re-visit some of my original research to see
if that's the case.
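
What I have in mind is something like the sketch below, using the Expect
module (which sits on top of IO::Pty). To be clear, the actual PostScript to
send at the GS> prompt to re-render a single page is the part I still have
to work out -- the command here is only a stand-in, and the session
bookkeeping is simplified:

    use strict;
    use warnings;
    use Expect;

    my %session;    # one live Ghostscript per open document

    sub gs_session_for {
        my ($doc_id) = @_;

        return $session{$doc_id} if $session{$doc_id};

        # Start gs with no input file so it sits at its interactive prompt.
        my $exp = Expect->spawn(
            'gs', '-dNOPAUSE', '-sDEVICE=png16m', '-r150',
            '-sOutputFile=/tmp/' . $doc_id . '_%03d.png'
        ) or die "Couldn't spawn gs: $!";

        $exp->expect( 30, 'GS>' ) or die "gs never gave us a prompt";
        return $session{$doc_id} = $exp;
    }

    sub request_page {
        my ( $doc_id, $ps_command ) = @_;

        my $exp = gs_session_for($doc_id);
        $exp->send("$ps_command\n");    # stand-in for the real incantation
        $exp->expect( 30, 'GS>' )       # block until gs is idle again
            or die "gs session wedged";
    }

One wrinkle with doing this directly under mod_perl: a hash like %session
only lives inside one Apache child, so either requests for a given document
have to keep landing on the same child, or the sessions belong in a separate
daemon -- which is where the socket suggestion comes back in.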

I didn't want to describe the existing system in too much detail in case all
the detail was unnecessary and irrelevant. So now you have all of the detail
after all -- I hope it answers your questions.

-- 
Alex Beamish
Toronto, Ontario
aka talexb
