stratos-dev mailing list archives

From "Shaheedur Haque (shahhaqu)" <shahh...@cisco.com>
Subject RE: File handle leak in Thrift
Date Mon, 23 Nov 2015 18:11:06 GMT
It seems the upstream fix is in Thrift 0.9.3. Now, I think I pasted the wrong dependency in
the email below, but changing the variable "thrift.version" to 0.9.3 simply resulted in a
build failure:

[ERROR] Failed to execute goal on project org.apache.stratos.common: Could not resolve dependencies
for project org.apache.stratos:org.apache.stratos.common:bundle:4.1.0: Could not find artifact
org.wso2.carbon:org.wso2.carbon.databridge.agent.thrift:jar:0.9.3 in central (http://repo1.maven.org/maven2)
-> [Help 1]

I'm not sure (a) whether I got the right variable and (b), if I did, why it did not work. How
else do I get the fix?

From: Shaheedur Haque (shahhaqu)
Sent: 23 November 2015 13:46
To: dev@stratos.incubator.apache.org
Cc: Martin Eppel (meppel); Ali Bidabadi (abidabad)
Subject: File handle leak in Thrift

Hi all,

I believe that Stratos is missing a file handle leak fix in libthrift_0.7.0.wso2v2.jar, as follows:


1. For unknown reasons, we sometimes see Stratos' memory footprint grow from the normal
"1.0something" GB of virtual memory to 10 GB and then 34 GB in a matter of seconds:



top - 21:21:55 up  4:39,  1 user,  load average: 0.01, 0.08, 0.18
Tasks: 135 total,   3 running, 132 sleeping,   0 stopped,   0 zombie
%Cpu(s):  2.5 us,  0.9 sy,  0.0 ni, 95.3 id,  1.1 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem:  16434456 total,  9867416 used,  6567040 free,    95772 buffers
KiB Swap:        0 total,        0 used,        0 free.  6485696 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2913 netiq     20   0 4702012 1.083g  18976 S   2.9  6.9   1:28.04 java
 2741 root      20   0 3395640 600544  14504 S   1.8  3.7   0:34.09 java
25941 root      20   0  186084  37700  26636 S   0.7  0.2   1:48.86 corosync
...



top - 21:23:55 up  4:41,  1 user,  load average: 1.08, 0.55, 0.35
Tasks: 137 total,   3 running, 134 sleeping,   0 stopped,   0 zombie
%Cpu(s): 34.1 us, 10.8 sy,  0.0 ni, 45.9 id,  0.9 wa,  0.0 hi,  8.3 si,  0.0 st
KiB Mem:  16434456 total, 10957936 used,  5476520 free,    96024 buffers
KiB Swap:        0 total,        0 used,        0 free.  6599088 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
 2913 netiq     20   0 10.236g 1.411g  18956 S  91.0  9.0   3:17.37 java
 2741 root      20   0 3395776 621352  14520 S  12.3  3.8   0:48.84 java
25941 root      20   0  186084  37700  26636 S   0.7  0.2   1:49.68 corosync
...



2. The logs fill very rapidly at this point, so after the fact all we can see is that all
10 GB of logs look like this:

TID: [0] [STRATOS] [2015-11-22 21:27:22,795]  WARN {org.apache.thrift.server.TThreadPoolServer}
-  Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:118)
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:35)
        at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
        at org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:106)
        at org.wso2.carbon.databridge.receiver.thrift.internal.ThriftDataReceiver$ServerThread.run(ThriftDataReceiver.java:199)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketException: Too many open files
        at java.net.PlainSocketImpl.socketAccept(Native Method)
        at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
        at java.net.ServerSocket.implAccept(ServerSocket.java:530)
        at java.net.ServerSocket.accept(ServerSocket.java:498)
        at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:113)
        ... 5 more
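
Since "Too many open files" means the process has hit its file descriptor limit, one way to watch the leak develop is to sample the JVM's descriptor count. The following is only a minimal sketch, not something taken from Stratos or Thrift: the class name FdMonitor is hypothetical, it reports on the JVM it runs in (so it would have to run inside the Stratos process, or the same MXBean would have to be read over JMX), and it assumes a Unix JVM that exposes com.sun.management.UnixOperatingSystemMXBean.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, not part of Stratos or Thrift.
public class FdMonitor {

    /**
     * Returns {open, max} file descriptor counts for the current JVM,
     * or null when the Unix-specific MXBean is not available.
     */
    static long[] fdCounts() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] counts = fdCounts();
        if (counts == null) {
            System.out.println("File descriptor counts are not available on this JVM");
        } else {
            System.out.println("open fds: " + counts[0] + " / max: " + counts[1]);
        }
    }
}
```

If the open count climbs steadily towards the maximum while the Thrift receiver is accepting connections, that would be consistent with sockets not being closed.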

Now, a cursory glance at upstream suggests this was probably fixed there in 2015:



https://git-wip-us.apache.org/repos/asf?p=thrift.git;a=commitdiff;h=b1a35da9168cca5a7524ab9814161f024da145df

and given that our 0.7.0 jar file has content dated 2011, it likely does not have the fix.
I also note that upstream has evolved considerably overall. Now, what I am not sure of is
whether we are using an old library for some specific reason, e.g. was it hacked/modified
by wso2? Is the new code not compatible with the Stratos codebase? If I am looking in the
right place, the stratos/components/org.apache.stratos.common/pom.xml seems to be picking
up a specific version:

    <dependency>
        <groupId>org.wso2.carbon</groupId>
        <artifactId>org.wso2.carbon.databridge.agent.thrift</artifactId>
        <version>${wso2carbon.version}</version>
    </dependency>

Do we know why? How do we go about getting the fix? Please advise,
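
To confirm which Thrift jar is actually loaded at runtime, rather than inferring it from the pom, something like the following could be used. This is a rough sketch under assumptions: JarLocator is a hypothetical name, the class must be visible to the class loader that runs this code (which may not hold inside an OSGi container such as Carbon), and the implementation version is only present if the jar's manifest declares it.

```java
import java.security.CodeSource;

// Hypothetical helper, not part of Stratos or Thrift.
public class JarLocator {

    /** Describes where a class was loaded from and its manifest version, if any. */
    static String describe(String className) throws ClassNotFoundException {
        Class<?> cls = Class.forName(className);
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        // JDK core classes have no code source; report that instead of failing.
        String location = (src == null)
                ? "(bootstrap class path)"
                : String.valueOf(src.getLocation());
        Package pkg = cls.getPackage();
        // May be null when the jar manifest has no Implementation-Version entry.
        String version = (pkg == null) ? null : pkg.getImplementationVersion();
        return className + " loaded from " + location
                + ", implementation version " + version;
    }

    public static void main(String[] args) throws Exception {
        // Default to the class that appears in the stack trace above.
        String name = args.length > 0
                ? args[0]
                : "org.apache.thrift.transport.TServerSocket";
        System.out.println(describe(name));
    }
}
```

Run with the Thrift jar on the class path, this prints the jar location for TServerSocket; passing another fully qualified class name as the first argument inspects that class instead.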

Thanks, Shaheed
