Subject: Re: ATS freebsd memory usage
From: John Plevyak
To: users@trafficserver.apache.org
Cc: "ming.zym@gmail.com"
Date: Mon, 13 Feb 2012 09:01:53 -0800

AFAICT, the RAM cache is working as designed and it doesn't grow "unbounded". There may be issues with interactions between the OS and glibc, but if there are, I would like to understand them. Here is the last message on the bug:

> the memory waste in TS is mostly because the Ramcache:
> 1, it is counted by really used memory.

Yes, that is how it is designed. What else would you like it to do? Perhaps count the OS memory allocation overhead as well? If so, please confirm.

> 2, it will hold the whole block of memory from free to OS, the old glibc memory management issue.

Not clear what this means. Yes, it holds the block of memory given by the OS (when malloc is called), or from the IOBufferBlock freelists. What else might it do?

> 3, ramcache use the cache reader buffer, and the buffer is allocated from anywhere in the whole memory address space

Yes, memory is allocated across the address space. The OS does fragment the virtual memory space to some extent when large blocks of memory are allocated (mmap(0)). That is normal. What are you suggesting is wrong here?

> all those make TS waste much more memory than Ramcache is configured. it will eat all your memory in the end.

You haven't pointed out any problems. If there is a problem with the glibc memory allocator, we can move to tcmalloc or any other allocator. A suggested solution would be nice here, along with a pointer (e.g. a URL) to a discussion of the OS/glibc problem you are referring to.

> what we want to do:
> 1, limit ramcache memory by allocator

Not clear what this means. Do you want to restrict the number of, say, 32K objects which can be stored in the RAM cache? That is likely to interact badly with the LRU nature of the algorithm.

> 2, bound ram lru list and freelist

The LRU list is bounded by the number of bytes in the objects it contains. Are you suggesting that a flurry of very small objects causes the number of entries to jump, and that this memory is then not freed back to the OS? What use pattern would cause this? Is this a real problem?
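To make "bounded by bytes" concrete, here is a rough sketch of the idea. This is illustrative only -- the names (Entry, ByteBoundedLRU, etc.) are made up, it is not the actual RamCacheCLFUS code:

// Sketch only: an LRU limited by total object bytes, not by entry count.
#include <cstddef>
#include <cstdint>
#include <list>
#include <vector>

struct Entry {
  std::vector<char> data;   // stands in for the refcounted cache buffer
  size_t size() const { return data.size(); }
};

struct ByteBoundedLRU {
  std::list<Entry> lru;     // front = most recently used, back = coldest
  int64_t bytes = 0;        // total bytes of all cached objects
  int64_t max_bytes;        // configured RAM cache size

  explicit ByteBoundedLRU(int64_t limit) : max_bytes(limit) {}

  void put(Entry e) {
    bytes += e.size();
    lru.push_front(std::move(e));
    // Evict from the cold end until we are back under the byte limit.
    // Note the bound is on object bytes, not on the number of entries.
    while (bytes > max_bytes && !lru.empty()) {
      bytes -= lru.back().size();
      lru.pop_back();
    }
  }
};

So a burst of tiny objects can inflate the entry count, but the byte total stays within the configured limit.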
> 3, freelist will be split by size

Which freelist are we referring to here? The entry freelist? The iobuffer freelists are already split by size. What reason could there be to split the entry freelist by size? In RamCacheCLFUS::victimize() the data field containing the smart pointer to the memory is cleared, at which point any interest in the size of the data associated with the entry should have ended.

> 4, split ramcache memory and cache memory

The ability to share memory between the RAM cache and the iobuffers used to deliver content is what permits zero-copy RAM cache hits. What would you gain by throwing that away?
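For illustration, the zero-copy property amounts to the RAM cache and the reader holding references to the same buffer instead of copying it. A minimal sketch, with std::shared_ptr standing in for the refcounting ATS does with its own smart pointers (not the actual cache code):

// Sketch: a cache hit hands out another reference to the buffer already
// held by the ram cache, so no bytes are copied on the hit path.
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

using Buffer = std::vector<char>;

struct RamCacheSketch {
  std::unordered_map<std::string, std::shared_ptr<Buffer>> map;

  // On a miss, the buffer the cache keeps is the same one the reader got
  // from the origin read -- one allocation, shared by both.
  void put(const std::string &key, std::shared_ptr<Buffer> buf) {
    map[key] = std::move(buf);
  }

  // On a hit, the reader gets a new reference to the cached buffer and
  // streams straight out of it -- zero copies.
  std::shared_ptr<Buffer> get(const std::string &key) {
    auto it = map.find(key);
    if (it == map.end())
      return nullptr;
    return it->second;
  }
};

Splitting "ramcache memory" from "cache memory" would force a copy on every hit to move data from one pool to the other.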
Some clarification would be great. Clearly memory fragments in a large system, but I don't see any documentation of those effects, or of the extent of the problem, attached to this bug.

> the codes will be ready in days

I am curious to see what code this might be.

On Mon, Feb 13, 2012 at 7:41 AM, Leif Hedstrom <zwoop@apache.org> wrote:

> On 2/13/12 2:02 AM, ming.zym@gmail.com wrote:
>> that looks like you are in the issue of TS-1006:
>> https://issues.apache.org/jira/browse/TS-1006
>
> Hmmm, so, I personally never see this unbounded growth myself. What is it that makes it grow out of bounds like this? It can't just be the RAM cache, can it? Is there a bug in how it calculates memory usage vs. what is allocated for it?
>
> There is no surprise that memory isn't freed; the freelist doesn't work like that (obviously). Meaning, if the system at some point needed 1 GB of RAM, and it's now on the freelist, it would never go below 1 GB of RAM usage. The questions are:
>
> 1) Is there a leak in the RAM cache, or some other bug that prevents it from limiting the memory usage as per the records.config settings?
>
> 2) Or is there a "leak" in the freelist, where objects that are put on the freelist are not reused and instead we allocate new ones?
>
> I think we have to try to understand why this is happening, what sort of bug it is, and how it happens. Blindly freeing things from the freelist doesn't seem right to me (it goes against the design and purpose of it). Hence, understanding why the freelist is allowed to grow out of bounds is the first step.
>
> I do agree that it could be useful to have an (optional) feature where we periodically reduce the various freelists that we have, so that we can reclaim some memory from extreme "spikes" in usage. But that can't be the solution to this problem, I don't think?
>
> -- Leif
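As a sketch of the kind of optional, periodic freelist trimming Leif mentions (hypothetical types and fields, not the actual ATS freelist code), something like a low-water mark per freelist would reclaim spike memory without changing the normal reuse path:

// Hypothetical sketch: keep at most low_water blocks cached per freelist
// and return the rest to the OS on a periodic sweep.
#include <cstddef>
#include <cstdlib>
#include <vector>

struct FreelistSketch {
  std::vector<void *> free_blocks;  // blocks waiting to be reused
  size_t block_size;
  size_t low_water;                 // how many blocks to keep across sweeps

  void *alloc() {
    if (!free_blocks.empty()) {
      void *b = free_blocks.back();
      free_blocks.pop_back();
      return b;
    }
    return std::malloc(block_size);
  }

  void release(void *b) { free_blocks.push_back(b); }

  // Called periodically: free everything above the low-water mark, so a
  // one-time spike does not pin memory forever.
  void trim() {
    while (free_blocks.size() > low_water) {
      std::free(free_blocks.back());
      free_blocks.pop_back();
    }
  }
};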