corinthia-dev mailing list archives

From jan i <j...@apache.org>
Subject Re: Checking malloc success and adding perror()
Date Wed, 25 Feb 2015 09:02:38 GMT
On 25 February 2015 at 01:18, Gabriela Gibson <gabriela.gibson@gmail.com>
wrote:

> I tried this on my lappie that has 2GB RAM, 3GB swap on 32-bit Ubuntu
> Trusty:
>
> count = 3056
> malloc returned NULL
>
> No slowing down took place and repeated runs always stopped at 3056 (same
> result before and after closing my bloated Chrome), making me think
> that the OS maybe has some kind of resource limit for processes set.
>

I think both you and Peter are forgetting to look at the basics.

When infra makes a VM (which I did a lot), we never define page/swap space.
This limits the kernel's ability to expand memory, and as a consequence
malloc returns NULL.

It is the same on Windows: if you remove the page file (or limit it, as
Microsoft suggests, to the size of the physical RAM), malloc will return NULL.
The algorithm on Windows is a bit different. During startup all device
drivers and services are loaded into non-pageable memory and the rest is
available; on a naked Windows 7 that is about 60% and on Windows 8.1 about
40%.

But you also forgot one thing: we had several purposes for using xmalloc.
One was to catch errors (if any). The second, which is just as important,
is that a central function allows us to have our own memory manager if we
want to, and maybe an intelligent cleanup scheme.
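
A minimal sketch of what such a central function could look like, assuming
C99 and the "report and abort" stop-gap discussed in this thread. The names
xmalloc_fail_fn and xmalloc_set_handler are hypothetical and only illustrate
how an application-supplied callback could hook in; nothing here is the
actual DocFormats code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical hook type: called once when an allocation fails. */
typedef void (*xmalloc_fail_fn)(size_t requested);

static xmalloc_fail_fn xmalloc_on_fail = NULL;

/* Install a failure handler; returns the previously installed one. */
xmalloc_fail_fn xmalloc_set_handler(xmalloc_fail_fn fn)
{
    xmalloc_fail_fn old = xmalloc_on_fail;
    xmalloc_on_fail = fn;
    return old;
}

/* Central allocation point: never returns NULL to the caller. */
void *xmalloc(size_t size)
{
    assert(size > 0);               /* treat zero-sized requests as a bug */
    void *p = malloc(size);
    if (p == NULL) {
        if (xmalloc_on_fail != NULL)
            xmalloc_on_fail(size);  /* let the application react first */
        fprintf(stderr, "xmalloc: out of memory (%zu bytes)\n", size);
        abort();                    /* stop-gap: no graceful recovery */
    }
    return p;
}
```

Because every allocation goes through one function, swapping in a different
memory manager later would only mean changing the body of xmalloc.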

If nobody disagrees to make xmalloc() then let us get it made, and then
later (much later) discuss

>
> G
>
> On Tue, Feb 24, 2015 at 5:35 PM, Peter Kelly <pmkelly@apache.org> wrote:
>
> > Here’s a fun fact: On Linux and OS X, malloc never actually returns NULL.
> >
> > Out of curiosity, I thought I’d try the following program on Linux:
> >
> > #include <stdio.h>
> > #include <stdlib.h>
> >
> > int main()
> > {
> >     int count = 0;
> >     while (1) {
> >         void *p = malloc(1024*1024);
> >         if (p == NULL) {
> >             printf("malloc returned NULL\n");
> >             return 1;
> >         }
> >         else {
> >             count++;
> >             printf("count = %d\n",count);
> >         }
> >     }
> > }
> >
> > On a 64-bit Ubuntu VM with 2 GB of memory it allowed me to “allocate” a
> > total of 569 GB of memory. When allocation eventually failed, malloc
> > didn’t return anything - the kernel simply terminated the process.
> >
> > Then onto OS X. My laptop has 16 GB of memory, but it was quite happy to
> > hand out more than 150 GB. It showed no sign of reaching a limit (though
> > allocations were getting slower), so I stopped it.
> >
> > What the program actually has are *pages*, not actual physical memory.
> > The process ends up with large amounts of address space available, but
> > memory is only ever *actually* allocated when we try to write to it.
> >
> > If I modify the code to actually write to all of the allocated memory, on
> > the Linux VM I got to 3.3 GB, noticeably slowing down around the 2 GB
> > mark when it started hitting the page file. After 3.3 GB, the kernel once
> > again killed the process. On my Mac, it kept going for a lot longer but I
> > could see my available disk space dropping, clearly due to it using the
> > page file.
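
The modified program is not shown in the message. A minimal sketch of the
idea, assuming a memset over each block is what forces the kernel to back the
pages with real memory, might look like this. The function name
touch_allocations and the max_chunks cap are assumptions added so the
experiment stays bounded; the original presumably looped until the process
was killed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocate up to max_chunks blocks of chunk_size bytes and write to every
 * byte, so the kernel has to commit real pages for them. Returns how many
 * blocks were obtained; everything is freed before returning. */
int touch_allocations(size_t chunk_size, int max_chunks)
{
    assert(chunk_size > 0 && max_chunks > 0);
    void **blocks = calloc((size_t)max_chunks, sizeof *blocks);
    if (blocks == NULL)
        return 0;

    int count = 0;
    while (count < max_chunks) {
        void *p = malloc(chunk_size);
        if (p == NULL)
            break;                      /* the case the thread is about */
        memset(p, 0xAB, chunk_size);    /* the write is what commits memory */
        blocks[count++] = p;
    }

    for (int i = 0; i < count; i++)
        free(blocks[i]);
    free(blocks);
    return count;
}
```

Calling touch_allocations(1024 * 1024, 8) touches 8 MB; raising the cap far
beyond physical RAM plus swap is what would reproduce the behaviour described
above.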
> >
> > Windows did the right thing, eventually returning NULL after 1.8 GB
> > (surprisingly, this was on a VM with 4 GB total allocated; I would have
> > expected a bit more).
> >
> > So as far as Linux and OS X are concerned, any attempts we make to check
> > for and deal with a NULL return value from malloc are entirely pointless,
> > as the process will be killed before it’s even able to get to the error
> > handling code.
> >
> > —
> > Dr Peter M. Kelly
> > pmkelly@apache.org
> >
> > PGP key: http://www.kellypmk.net/pgp-key
> > (fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)
> >
> > > On 24 Feb 2015, at 11:04 pm, Dennis E. Hamilton
> > > <dennis.hamilton@acm.org> wrote:
> > >
> > > I find this an interesting discussion, although I think it would be
> > > useful to separate it, and the architectural issues being raised, from
> > > the proposal that Gabriela has made.
> > >
> > > The current situation is that there is no (or not enough) checking for
> > > failed mallocs.  There is a proposal to deal with that in the short term
> > > by simply replacing those calls with code that does check and uses a
> > > mindless error-reporting method for all of them.  That work is now going
> > > ahead.
> > >
> > > It is clear that this is not a production-quality fix.  It replaces a
> > > NULL-pointer usage crasher with something that at least explains what
> > > the crash is at a place closer to where the NULL pointer arises.  And
> > > there is nothing like a graceful failure being provided.
> > >
> > > This is a stop-gap, a useful one for what it accomplishes.  Providing
> > > a production-quality result for software that end-users will be
> > > employing is going to be quite a different matter, and we are probably
> > > looking at it at too much of a micro-level.
> > >
> > > It would be good to stand back and look at exactly what behavior we do
> > > want.  What do we want software that uses the libraries being offered
> > > to be able to do in the event that a resource-exhaustion situation is
> > > detected in underlying code, and what do we want to assure about the
> > > resulting state that accompanies the reporting of such a situation?
> > > (I.e., is there anything recoverable by the software that uses the
> > > library, are there likely memory leaks, etc.)
> > >
> > > Another question is whether or not we are willing to make a release
> > > that employs the stop-gap, and how we make that known to potential
> > > adopters of the code.  Will we declare it alpha- or beta-level quality,
> > > or what?
> > >
> > > - Dennis
> > >
> > > -----Original Message-----
> > > From: Edward Zimmermann [mailto:Edward.Zimmermann@cib.de]
> > > Sent: Tuesday, February 24, 2015 04:08
> > > To: dev@corinthia.incubator.apache.org
> > > Subject: RE: Checking malloc success and adding perror()
> > >
> > > Answers mixed in...
> > >
> > >> From: jan i [mailto:jani@apache.org]
> > >> Sent: Monday, 23 February 2015 20:25
> > >> To: dev@corinthia.incubator.apache.org
> > >> Subject: Re: Checking malloc success and adding perror()
> > >>
> > >> On 23 February 2015 at 12:47, Edward Zimmermann
> > >> <Edward.Zimmermann@cib.de>
> > >> wrote:
> > >>
> > >>> Hi,
> > >>>
> > >>> Been sort-of out of the discussion-- was on vacation last week-- so
> > >>> excuse me, in advance, if I bring up a point already made.
> > >>>
> > >> hi again, nice to hear from you again.
> > >>
> > >>
> > >>>
> > >>> First of all.. Corinthia is supposed to be C++? If so we don't want
> > >>> to use malloc. If it's plain C, of course, malloc is probably our
> > >>> first choice for memory allocation.
> > >>>
> > >> No, DocFormats (the library part) is strictly C99. The application on
> > >> top can be in other languages.
> > >>
> > >
> > > Sure. SWIG....
> > >
> > >>
> > >>>
> > >>> With the issue of x= malloc(y). This gets more complicated. Linux,
> > >>> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-captun.html
> > >>>
> > >> we could, but I do not think we want to interfere with kernel
> > >> parameters.
> > >>
> > >
> > > I was not suggesting that we muck with the kernel params... but we need
> > > to accept the behavior.
> > > Android, for example, sets, I think, the value to 1 by default. That
> > > means that malloc will return a pointer even when there is absolutely no
> > > RAM available-- and if the pointer space is depleted (recall most
> > > Androids are 32-bit kernels) it immediately unleashes the OOM killer.
> > >
> > >
> > >
> > >>
> > >>>
> > >>> Can we configure the OOM Killer to be a little nicer? Yes but really
> > >>> only process by process.. And when a process gets killed it's done
> > >>> so in a very silent way.. so if someone is "not in the know" to spot
> > >>> the clues.. it can get quite mysterious why some programs stop
> > >>> working...
> > >>>
> > >>> But let us even pretend that we get a NULL... What is the watermark
> > >>> to have gotten it? Can we recover? Any buffers we can quickly dispose
> > >>> of? Recovery is not that easy!
> > >>>
> > >> You are some steps ahead, right now we replace malloc with xmalloc to
> > >> have a central place for error handling.
> > >>
> > >
> > > I'm arguing that a central place is the wrong place. Android, for
> > > example, won't get there.
> > >
> > > What is wrong with having different allocate functions for the
> > > allocation of different kinds of objects?
> > > What is wrong with having a bit of checking business logic in these
> > > functions?
> > > Business logic? A function, for example, that wants to create a scratch
> > > buffer might be smart and only create a buffer suited to the amount of
> > > free RAM around. Another object creator may want to have a pool.. and
> > > another still may just call malloc.
> > > When calling malloc, is code such as
> > >
> > > if ((Ptr = (t *)malloc(size)) == NULL) {
> > >  /* do error handling? */
> > > }
> > >
> > > really polluted?
> > > Don't we want to be able to distinguish between recoverable and
> > > non-recoverable errors?
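
As an illustration of the "business logic in the allocator" idea, here is a
minimal sketch of a scratch-buffer helper that treats a failed allocation as
recoverable by retrying with a smaller size. The name alloc_scratch and the
halving policy are assumptions made for this example; the message only says
that such a function "might be smart" about how much it asks for.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: try to get a scratch buffer of `preferred` bytes,
 * halving the request on failure but never going below `minimum`.
 * On success stores the size obtained in *got and returns the buffer.
 * Returns NULL (and sets *got to 0) only if even `minimum` bytes cannot
 * be allocated, which the caller can treat as non-recoverable. */
void *alloc_scratch(size_t preferred, size_t minimum, size_t *got)
{
    assert(got != NULL && minimum > 0 && preferred >= minimum);
    size_t size = preferred;
    for (;;) {
        void *p = malloc(size);
        if (p != NULL) {
            *got = size;
            return p;
        }
        if (size == minimum)
            break;              /* recoverable options are exhausted */
        size /= 2;
        if (size < minimum)
            size = minimum;
    }
    *got = 0;
    return NULL;
}
```

On systems that overcommit, malloc may still succeed here and the process
may be killed later on first write, so a helper like this only helps where
malloc really does return NULL.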
> > >
> > >
> > >
> > >>
> > >>>
> > >>> Should we pretend that we can get a NULL? Of course. It's good
> > >>> programming practice. Should we wrap malloc with an xmalloc for such
> > >>> testing? No. On systems where malloc might return a NULL we should
> > >>> have, for different objects, alternative strategies for dealing with
> > >>> an allocation failure. A routine, for example, that wants to create a
> > >>> scratch buffer of x length but could work, albeit slower, with less
> > >>> we might make smaller. Etc. etc. etc.
> > >>>
> > >> of course we need to check for that, try on an Android or iOS system to
> > >> load a huge document, and you will most surely reach the limit.
> > >
> > > Sure. Lacking virtual memory--- but having a virtual memory arch....
> > > That is why we use mmap.
> > >
> > >>
> > >> in Windows malloc only works well when you allocate in chunks of 4k
> > >> (NTFS size). But both in Windows and Linux, calling malloc and free
> > >> will cause a context switch.
> > >
> > > The Microsoft Low Fragmentation Heap allocator works really well with
> > > sub 4k objects--- it is limited to chunks under 16k.
> > >
> > > http://illmatics.com/Understanding_the_LFH.pdf
> > >
> > >>
> > >> iOS and Android work with preemptive context switching so here it is
> > >> even more expensive.
> > >>
> > > ??? Android is Linux. When we run native C code it calls libc
> > > (Bionic, a BSD-derived variant). For malloc they use Doug Lea's malloc.
> > >
> > >
> > >
> > >>
> > >>>
> > >>> I'd suggest we keep to malloc and IF NEEDED-- and only if and when
> > >>> needed--- we use a drop-in replacement (and chances are that we'll
> > >>> NEVER need it much less want one).
> > >>>
> > >> Right now we want to replace the malloc calls with xmalloc, so we do
> > >> not need tons of "if NULL" distributed in the code. Replacing malloc
> > >> is a second discussion.
> > >>
> > >
> > > What is wrong with if NULL?
> > >
> > >>
> > >>>
> > >>> Part of the problem is that we might have different "best" approaches
> > >>> for different operating systems. iOS, Android, Linux, BSD, ... making
> > >>> "best" not really the "best" goal..
> > >>>
> > >> Well with my kernel knowledge, all of them benefit from us allocating
> > >> a chunk during startup, eating more if needed, and freeing it all when
> > >> we are finished. Please remember the typical use will be open a
> > >> document, do something, save a document and stop the application.
> > >>
> > >
> > > This is an old discussion that has raged for decades. Designing
> > > software that does not free memory would, I think, be a mistake,
> > > especially when speed is not the ultimate issue.
> > >
> > >
> > >>
> > >>>
> > >>> Perror?  No.  Calling directly a function that is intended to write
> > >>> to a console is, in general, a bad thing.
> > >>>
> > >> Why do you see that as bad? I thought it wrote to stderr, which can be
> > >> redirected, but anyhow what do you suggest instead?
> > >
> > > Because it is not nice writing to a console. Sure you can redirect
> > > stderr but how do you know whether a 3rd party lib that gets used
> > > alongside the code might also think about redirecting stderr to yet
> > > another place, or even closing it.. I've seen things go really wrong
> > > all too often. It's just not best practice-- I'd even call it "bad
> > > practice" in a shared library. Libraries should return error codes and
> > > perhaps have a message sub-system-- or an interface to one. Stderr is
> > > perhaps a tad less ugly than stdout.. BUT we really really should NOT
> > > be using either..
> > >
> > >
> > >
> > >>
> > >> rgds
> > >> jan i.
> > >>
> > >>>
> > >>>
> > >>>
> > >>> -----Original Message-----
> > >>> From: Peter Kelly [mailto:pmkelly@apache.org]
> > >>> Sent: Thursday, 19 February 2015 13:41
> > >>> To: dev@corinthia.incubator.apache.org
> > >>> Subject: Re: Checking malloc success and adding perror()
> > >>>
> > >>>> On 19 Feb 2015, at 7:06 pm, Dennis E. Hamilton
> > >>>> <dennis.hamilton@acm.org>
> > >>> wrote:
> > >>>>
> > >>>> +1 about a cheap check and common abort procedure for starters.
> > >>>>
> > >>>> I think figuring out what to do about cleanup and exception
> > >>>> unwinding, and even what exception handling to use (if any) is a
> > >>>> further platform-development issue that could be masked with simple
> > >>>> still-inlineable code, but needs much more architectural thought.
> > >>>
> > >>> I’m fine with us using wrapper functions for these which do the
> > >>> checks - though please let’s use xmalloc, xcalloc, xrealloc, and
> > >>> xstrdup instead of DFPlatform* (it’s nothing to do with platform
> > >>> abstraction, and these names are easier to type). (As a side note we
> > >>> can probably cut down on prefix usage a lot as long as we don’t
> > >>> export symbols; this was just to avoid name clashes with other
> > >>> libraries.)
> > >>>
> > >>> In my previous mail I really just wanted to point out that by itself,
> > >>> this doesn’t really solve anything - the issue is in reality far
> > >>> more complicated than a simple NULL pointer check.
> > >>>
> > >>> I can think of two ways we could deal with the issue of graceful
> > >>> handling:
> > >>>
> > >>> 1) Allow the application to supply a callback, as Jan suggested
> > >>>
> > >>> 2) Adopt a “memory pool” type strategy where we create a memory
> > >>> pool object at the start of conversion which tracks all allocations
> > >>> that occur between the beginning and end of a top-level API call like
> > >>> DFGet, and put setjmp/longjmp-style exception handling in these API
> > >>> calls.
> > >>>
> > >>> The second approach is in fact already used to a limited extent with
> > >>> the DOM API. Every document maintains its own memory pool for storing
> > >>> Node objects (and the text values of nodes)… this is freed when the
> > >>> document’s retainCount drops to zero. I did this because it was much
> > >>> faster than traversing through the tree and releasing nodes
> > >>> individually (at least in comparison to having nodes as Objective-C
> > >>> objects - the ObjC runtime was undoubtedly part of that overhead).
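
The second approach could be sketched roughly as follows. This is an
illustrative assumption, not the DOM API's actual pool: the types Pool and
PoolBlock and the functions pool_alloc, pool_free_all and convert_document
are hypothetical names, with convert_document standing in for a top-level
call such as DFGet.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdint.h>
#include <stdlib.h>

/* Header placed before each block; the extra members only force an
 * alignment suitable for any object stored after the header. */
typedef union PoolBlock {
    union PoolBlock *next;
    long double align_ld;
    long long align_ll;
} PoolBlock;

typedef struct {
    PoolBlock *head;      /* most recent allocation */
    jmp_buf on_failure;   /* where to resume when an allocation fails */
} Pool;

/* Allocate from the pool; on failure jump back to the API entry point. */
void *pool_alloc(Pool *pool, size_t size)
{
    assert(pool != NULL);
    if (size > SIZE_MAX - sizeof(PoolBlock))
        longjmp(pool->on_failure, 1);   /* request would overflow */
    PoolBlock *b = malloc(sizeof *b + size);
    if (b == NULL)
        longjmp(pool->on_failure, 1);
    b->next = pool->head;
    pool->head = b;
    return b + 1;                       /* user data follows the header */
}

/* Release every block the pool has handed out. */
void pool_free_all(Pool *pool)
{
    while (pool->head != NULL) {
        PoolBlock *next = pool->head->next;
        free(pool->head);
        pool->head = next;
    }
}

/* Hypothetical top-level API call: returns 0 on success, -1 if memory
 * ran out part-way through. Either way nothing is leaked. The pool lives
 * on the heap so its contents stay well defined after longjmp. */
int convert_document(size_t node_count, size_t node_size)
{
    Pool *pool = malloc(sizeof *pool);
    if (pool == NULL)
        return -1;
    pool->head = NULL;
    if (setjmp(pool->on_failure) != 0) {
        pool_free_all(pool);            /* unwind: drop partial work */
        free(pool);
        return -1;
    }
    for (size_t i = 0; i < node_count; i++)
        (void)pool_alloc(pool, node_size);
    pool_free_all(pool);
    free(pool);
    return 0;
}
```

The point of the design is that code deep inside the conversion never checks
for NULL itself; a failure anywhere lands back in the API call, which frees
the whole pool and reports an error code to the application.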
> > >>>
> > >>> —
> > >>> Dr Peter M. Kelly
> > >>> pmkelly@apache.org
> > >>>
> > >>> PGP key: http://www.kellypmk.net/pgp-key
> > >>> (fingerprint 5435 6718 59F0 DD1F BFA0 5E46 2523 BAA1 44AE 2966)
> > >>>
> > >>>
> > >
> >
> >
>
>
> --
> Visit my Coding Diary: http://gabriela-gibson.blogspot.com/
>
