perl-embperl mailing list archives

From Ed Grimm <edgr...@dsblade00.wat.us.ray.com>
Subject Re: What's going on with Embperl???
Date Fri, 19 Sep 2003 23:20:52 GMT
On Thu, 18 Sep 2003, Ruben I Safir wrote:
> On 2003.09.17 20:59 Donovan Allen wrote:
>> Ruben Safir Secretary NYLXS wrote:
>>
>>>Yes - In my experience I had a JSP programmer test his code for a scheduling
>>>function against my code in Embperl and modules.  After the first hit,
>>>I was much faster than he was.  I've done similar things with PHP coders
>>>and even coded the PHP myself.
>>>
>>>After the first hit, and when Embperl has its logging level lowered,
>>>it was outstandingly fast, especially with large data grabs, and pages
>>>which were repeated.  OTOH, my HTTPD processes grow large... and I'm thrilled
>>>to death about that.
>>>
>> Like leaking memory large or just larger?  Unfortunately, the latter, as
>> you well know, is going to happen to some degree with any library cache
>> technology.
>
> Just large.  I've never had a memory leak in any Perl.  I've had conversations
> with the Perl developers and it seems to be all but impossible.  The cause of most
> memory "leaks" in Perl is code where you keep creating objects and never
> scope them out of the program.  Misunderstanding of scope is a huge problem with
> mod_perl, because the damn program never ends.
>
> But the http processes grow, and then stop growing at a point.

Assuming, of course, that you're not subject to exceedingly large data
returns into a web page, on an OS whose malloc never returns freed
memory to the system.  (That is, this can bite one on Solaris, but not
on Linux.)

>> I also have come across many CPAN modules that leak memory, which is
>> really great in mod_perl or other daemon scripts.
>
> I am not aware of any way for Perl to leak memory.  But if you show me the modules,
> I'd be happy to pass them to Mike.
>
> Memory leakage is when a pointer to an allocated memory location is removed from
> scope prior to manual deallocation of the memory.  Or it can be when memory is
> allocated and data is stored past the allocation.  In GNU/Linux, this would cause
> a segmentation fault in the program.  Of course, people have smashed stacks, but
> these exploits almost always involve standard self-allocating C routines such as
> scanf, etc.
>
> Now Perl has no pointers.  It has a garbage collection system.  When programmers do not
> use scope correctly, you can get a condition (and this happens all too frequently) where
> an increasing number of data objects are created in the global namespace, either through
> passed references which are attached to global variables, or other stupidities of this
> sort.

Perl has references, which are effectively pointers.  It has a garbage
collection system, which can be fooled.  The simplest example I'm aware
of that I know works in 5.6.1 is:

sub leakit () {
    my %foo;		# 1
    $foo{bar} = \%foo;	# 2
    return 0;
}			# 3,4

1. A new hash is created, and a reference to it is created under the
lexically scoped name %foo; its reference counter increases to one.
2. %foo is populated with one record, and that record contains a
reference to %foo.  %foo's reference counter increases to two.
3. Subroutine ends, %foo goes out of scope, and its reference counter is
reduced by one.
4. Since the hash that was formerly known as %foo has a reference
counter of one, it is not freed.  Congratulations, you've leaked memory.

It's possible that 5.8.x has a fix for this method of leaking memory; I
know that someone was working on garbage collection loop detection.
Last I remembered hearing about it, there were still issues, but that
was back when 5.6.1 was relatively new.

The best way to avoid this type of memory leak is to simply not have
circularly referenced data structures.  Most people do this without even
trying.
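
Where a cycle really is needed, a weak reference sidesteps the leak.
Here's a minimal sketch of the same hash using Scalar::Util::weaken
(core in recent perls, available from CPAN otherwise):

use Scalar::Util qw(weaken);

sub noleak () {
    my %foo;
    $foo{bar} = \%foo;  # same cycle as above
    weaken($foo{bar});  # this reference no longer counts toward
                        # %foo's reference counter
    return 0;
}                       # %foo goes out of scope, the counter drops
                        # to zero, and the hash is freed

If the weakened reference is ever the only one left, it simply becomes
undef instead of keeping the hash alive.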


Another way for a perl module to leak is to utilize compiled object code
that leaks; there are several perl modules that do this, and last I knew,
mod_perl also did this (it had a leak in its reload code: it would not
free memory associated with old versions of compiled subroutines.  This
would only happen when apache received a HUP signal, and only if it was
set to reload its perl modules in response to such an event).


A third way I've encountered, also only demonstrated in 5.6.1, is
through the use of eval, but the instance in which I encountered it was
convoluted enough that isolating the minimal requirement was not
feasible.  I was able to verify that if I avoided the eval, I didn't
have the memory leak.  In the one instance where I couldn't avoid the
eval, if I reduced it to its simplest necessary form, I didn't have the
leak.

> USE STRICT in your modules (but don't use it in Embperl pages, though).

A good idea, but so far I haven't had it help me catch uncleaned
variables.
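
For what it's worth, a minimal sketch of what strict does catch (the
$hits variable here is hypothetical): it flags undeclared globals at
compile time, but it has no opinion about a properly declared lexical
that ends up in a reference cycle:

use strict;

sub count_hits {
    $hits++;    # compile-time error: Global symbol "$hits" requires
                # explicit package name; you're forced to declare
                # my $hits or our $hits and decide its scope
}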

> Make sure everything is scoped to a namespace, function or block.
>
>> No, I mean using faster ways with standard perl that use less "magic"
>> and are closer to just a c library call.  Most perl programs and
>> programmers are fraught with using the easier-to-use functions and
>> then abandon the language for a project where performance was bad.
>> In most cases like this where I took a look at what they were doing,
>> I was able to get the performance much much closer to C equivalent
>> speeds.
>
> While I've seen many discussions on optimized hashing and pseudo-arrays
> etc, I would argue that in most cases, if the objects are constructed
> adequately, they will still outpace other platforms in mod_perl, as the
> objects are created only once.  Then there is only the issue of doing
> the hashing.  It's a minor hit compared to the standard PHP data object
> and the Java engine.
>
> If you understand what you're doing and why, it is probable that only
> direct C modules will outpace Perl code in Embperl, or just in
> mod_perl, and yet you will still have highly maintainable code which
> is easy to bugtrack and extend.

Note that this assumes you're choosing your best option, as well.
Each of Perl's many ways of doing things has a time when it is ideal;
many of them also have times when they are ill-advised.  For example,
perl has an option which allows you to read an entire file in a single
<FILE> operation and store it in a scalar.  This works well if you have
enough memory and you're going to simply write it out again.
Performing substitutions on it, however, is a bad idea.
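
For reference, the usual slurp idiom is to localize $/, the input
record separator, so the change is confined to one scope; a minimal
sketch (the slurp name is just illustrative):

sub slurp {
    my ($path) = @_;
    open my $fh, '<', $path or die "open $path: $!";
    local $/;                   # undef the record separator, so the
                                # next read returns the whole file
    my $contents = <$fh>;
    close $fh;
    return $contents;
}

Reading line by line with while (<$fh>) keeps memory proportional to
the longest line instead, which is usually the better fit when you plan
to run substitutions over the data.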

Ed

---------------------------------------------------------------------
To unsubscribe, e-mail: embperl-unsubscribe@perl.apache.org
For additional commands, e-mail: embperl-help@perl.apache.org

