Subject: Re: mod_perl for multi-process file processing?
From: Perrin Harkins
To: Cosimo Streppone
Cc: Alan Raetz, mod_perl list
Date: Tue, 3 Feb 2015 11:38:23 -0500

Cache::FastMmap is a great module for sharing read/write data, but it can't compete with the speed of loading it all into memory before forking, as Alan said he plans to do.
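
A minimal sketch of the load-then-fork approach, assuming a Unix-like system where fork() shares the parent's memory pages copy-on-write. The names load_lookup_data, process_case, and run_workers are hypothetical, and the small generated hash only stands in for the real data files:

```perl
use strict;
use warnings;
use POSIX ();

# Hypothetical stand-in for the slow startup load described in the thread;
# a real loader would build the hashes from files.
sub load_lookup_data {
    return { map { ("key$_" => $_ * 2) } 1 .. 1000 };
}

# Hypothetical per-case work: a read-only lookup in the preloaded hash.
sub process_case {
    my ($data, $case) = @_;
    return $data->{"key$case"};
}

# Fork $workers children after the data is loaded. Each child handles every
# $workers-th case and reports a partial sum to the parent over a pipe.
sub run_workers {
    my ($data, $cases, $workers) = @_;
    my (@readers, @pids);

    for my $slot (0 .. $workers - 1) {
        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: reads $data directly, no reload and no serialization.
            close $reader;
            my $sum = 0;
            for (my $i = $slot; $i < @$cases; $i += $workers) {
                $sum += process_case($data, $cases->[$i]);
            }
            print {$writer} "$sum\n";
            close $writer;          # flushes the result before exiting
            POSIX::_exit(0);        # skip END blocks inherited from the parent
        }

        # Parent: keep only the read end of this child's pipe.
        close $writer;
        push @readers, $reader;
        push @pids,    $pid;
    }

    # Collect one partial result per child, then reap the children.
    my $total = 0;
    for my $reader (@readers) {
        my $line = <$reader>;
        die "worker produced no result" unless defined $line;
        chomp $line;
        $total += $line;
        close $reader;
    }
    waitpid($_, 0) for @pids;
    return $total;
}

my $data  = load_lookup_data();     # the slow step, done once in the parent
my $total = run_workers($data, [1 .. 1000], 4);
print "total: $total\n";
```

The children never reload or deserialize anything, which is where the speed advantage comes from. One caveat: Perl can write to a value's internals even on a read (reference counts, cached string/number conversions), so some pages may still be copied per child.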

- Perrin

On Tue, Feb 3, 2015 at 2:05 AM, Cosimo Streppone <cosimo@streppone.it> wrote:
> Alan Raetz wrote:
>
>> So I have a perl application that upon startup loads about ten perl
>> hashes (some of them complex) from files. This takes up a few GB of
>> memory and about 5 minutes. It then iterates through some cases and
>> reads from (never writes) these perl hashes. To process all our cases,
>> it takes about 3 hours (millions of cases). We would like to speed up
>> this process. I am thinking this is an ideal application of mod_perl
>> because it would allow multiple processes but share memory.
>
> Sure you could use modperl for this.
> I would also consider at least these alternatives:
>
> - use Cache::FastMmap, https://metacpan.org/pod/Cache::FastMmap
>   Load up your data with a loader script, and forget about it.
>   Cache::FastMmap also works with modperl.
>
> - use a network server, like memcached or redis to store your
>   read-only data, and use a lightweight network protocol (on localhost)
>   to get the data.
>
> In both cases, reading from multiple processes will not be a problem.
> The cheapest solution for the consumer part (the "cases" above)
> would be to use a command like "parallel" to fire up as many copies
> of your consumer script as you can afford.
>
> Hope this helps,
>
> --
> Cosimo