trafficserver-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chou, Peter" <>
Subject RE: Issue #2416 - Log Collation Memory Leak.
Date Tue, 31 Jul 2018 06:56:18 GMT
Hi All,

I think that this is the second memory leak that I mentioned below. It does not have to do
with the send-queue-full message, but only with simply configuring more than one active host.
For example, if you configure three hosts such as “{ host1, host2, host3 }”, then the
following leak will happen --

LogHostList::preproc_and_try_delete(LogBuffer *lb)
  ink_release_assert(lb->m_references == 0);                                 m_references
starts as zero

  nr_host = nr = count();
  ink_atomic_increment(&lb->m_references, nr_host);                          increments
the m_references by 3 _not_ by 1 

  for (LogHost *host = first(); host && nr; host = next(host)) {
    LogHost *lh    = host;
    available_host = lh;

    do {
      ink_atomic_increment(&lb->m_references, 1);                            increments
the m_references by 1 for each host tried
      success     = lh->preproc_and_try_delete(lb);                            LogHost::preproc_and_try_delete()
will consume 1 ref-count
      need_orphan = need_orphan && (success == false);
    } while (lb && (success == false) && (lh = lh->;

  if (lb != nullptr) {
    LogBuffer::destroy(lb);                                                  this only
consumes 1 ref-count 
  return 0;

So since the ink_atomic_increment() and LogBuffer::destroy() are not balanced this will produce
a memory leak. In my test scenario, at the end of 100K transactions, ATS had allocated 5961
and only freed 4348 LogBuffers. Proposed fix is to change ink_atomic_increment(&lb->m_references,
nr_host) to ink_atomic_increment(&lb->m_references, 1) to balance things. Appreciate
any second opinions, and I opened a PR #4041 with both memory leaks.


-----Original Message-----
From: Chou, Peter 
Sent: Thursday, July 26, 2018 1:42 PM
To: dev <>
Subject: RE: Issue #2416 - Log Collation Memory Leak.


1) No problem. It is a two line change so we can carry in our custom 7.1.x if necessary.

2) Yeah this feature has issues. After fixing this memory leak, we realized that there is
ANOTHER leak if you over-run the log collation transfer capacity triggering the "send-queue
full" message. Perhaps this can be mitigated by increasing the send queue size (collation_max_send_buffers).
This second issue can be triggered by increased transaction rate and/or configuring multiple
active hosts, i.e., { host1, host2 } rather than active-standby { { host1 , host2 } }, to
over-load the log sending capacity. I will take your proposal back to discuss with my users.


-----Original Message-----
From: Leif Hedstrom [] 
Sent: Thursday, July 26, 2018 12:54 PM
To: dev <>
Subject: Re: Issue #2416 - Log Collation Memory Leak.

> On Jul 26, 2018, at 3:14 AM, Chou, Peter <> wrote:
> Hi All,
> After some additional study on how the ink_atomic_increment() and LogBuffer::destroy()
work together in the LogFile::, LogHost::, and LogHostList::preproc_and_try_delete() functions,
the following is probably a better patch to address this issue. Appreciate any second opinions
on this before I open a PR.

Nice find.

However, I have two comments on this:

1) I’d really prefer not to respin the 7.1.x release over this. Most people don’t use
this (IMO broken) feature. I hope that’s ok.

2) I’d like to restart the discussions of removing this feature from v9.0.0 again (in fact,
I kinda wish we could nuke its from 8.0.0, but a little late for that).

For #2, I feel that there are a lot better tools out there, like Kafka, Elastic Search (or
Splunk for a commercial solution).

I’ll make a separate thread on asking for the deprecation and removal of this feature :-)


— Leif

View raw message