commons-user mailing list archives

From Chris Gamache <cgama...@gmail.com>
Subject Re: Help needed: commons-exec CLOSE_WAIT problem
Date Wed, 27 Jan 2016 13:43:32 GMT
I really appreciate the thorough inspection you have given. With your help
I think I've targeted the real problem...

GhostDriver can't exist outside of phantomjs. GhostDriver is an
implementation of WebDriver Wire Protocol which runs on phantomjs's
JavaScript engine. It can't run without phantomjs.

We can rest easy that commons-exec is not holding open the file handle.
That's good. I think the PhantomJSDriver is firing a copy of phantomjs up
using commons-exec, but it's not communicating with the service through
commons-exec. Upon closer inspection, selenium-java is communicating over
tcp/ip with a WebDriver server that phantomjs starts on a random available
port. The client side of that connection is an instance of some class that
implements org.apache.http.client.HttpClient; by default it looks to
be org.apache.http.impl.client.CloseableHttpClient. I think that when
commons-exec terminates phantomjs, thereby terminating the GhostDriver
server, the HttpClient is still holding its end of the connection open.
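
For anyone unfamiliar with the state: CLOSE_WAIT means the peer has closed
its end of a TCP connection while the local process still holds its own
socket open. Below is a minimal sketch of that situation using only the JDK.
The class and method names are made up for illustration, and this is not
Selenium or HttpClient code; it only mimics a server that goes away while
the client keeps its socket.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class CloseWaitDemo {

    /**
     * Returns true if, after the server side has closed the connection,
     * the client sees end-of-stream but its own socket is still open.
     */
    static boolean clientStaysOpenAfterPeerCloses() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the OS for any free port, much like the random
        // WebDriver ports seen in this thread.
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, server.getLocalPort())) {

            // The server accepts and immediately closes its end,
            // standing in for a phantomjs process that was terminated.
            server.accept().close();

            // The client notices the peer is gone (read returns -1) ...
            boolean peerClosed = client.getInputStream().read() == -1;

            // ... but its socket stays open until close() is called locally.
            // While it is in this state, lsof would report CLOSE_WAIT.
            return peerClosed && !client.isClosed();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // try-with-resources closes the client here, which is the step
        // that releases the file descriptor.
    }

    public static void main(String[] args) {
        System.out.println("open after peer closed: "
                + clientStaysOpenAfterPeerCloses());
    }
}
```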

Here's my take on the real problem:

I think the selenium folks didn't consider closing the connection
(org.apache.http.client.HttpClient doesn't have a close method to
implement) a necessary step, and they figured that if whatever service the
HttpClient was attached to shut itself down, the client's end of the
connection would automatically close as well. The standard use for selenium
usually doesn't require it to run for days and days on end. I can't fault
them for missing it.

That being said, they made this difficult to override in consumer classes
by keeping a lot of the guts of the connections hidden behind private
classes and private member data.

So, I'll need to dig into selenium-java, change a few interfaces to make
sure that whatever HttpClient is used will have a close() hook, and then
make sure that whatever is driving the HttpClient has access to the close
hook on shutdown, and fires it. When the HttpClient is closed, the file
descriptor should disappear.
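
A rough sketch of the kind of close() hook I mean, using only the JDK. Every
name in it (ClosableTransport, Driver, FakeTransport) is hypothetical and
none of it is selenium-java's actual API. A real transport would presumably
wrap something like CloseableHttpClient, which does implement
java.io.Closeable.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicBoolean;

public class CloseHookSketch {

    /** Hypothetical abstraction: an HTTP transport that can be closed. */
    interface ClosableTransport extends Closeable {
        boolean isOpen();
    }

    /** Hypothetical stand-in for the driver that owns the transport. */
    static final class Driver {
        private final ClosableTransport transport;

        Driver(ClosableTransport transport) {
            this.transport = transport;
        }

        void quit() throws IOException {
            // Stopping the external process would happen first (omitted).
            // Then release the client-side connection and its descriptor.
            transport.close();
        }
    }

    /** Fake transport for illustration; it only tracks an open flag. */
    static final class FakeTransport implements ClosableTransport {
        private final AtomicBoolean open = new AtomicBoolean(true);

        @Override
        public boolean isOpen() {
            return open.get();
        }

        @Override
        public void close() {
            open.set(false);
        }
    }

    /** Runs quit() against the fake and reports whether it is still open. */
    static boolean transportOpenAfterQuit() {
        FakeTransport transport = new FakeTransport();
        try {
            new Driver(transport).quit();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return transport.isOpen();
    }

    public static void main(String[] args) {
        System.out.println("transport open after quit: "
                + transportOpenAfterQuit());
    }
}
```

The point of the design is only that whoever owns the client also owns the
call to close(), so shutdown does not depend on the remote end going away.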

Thank you again for your help. It was indispensable!

CG


On Tue, Jan 26, 2016 at 4:07 PM, Siegfried Göschl <
siegfried.goeschl@it20one.com> wrote:

> Hi Chris,
>
> I played with your GitHub repo https://github.com/cgamache/openfile
>
> ### 1. First Take
>
> Your test program actually starts a JVM, and the Selenium library starts
> multiple "phantomjs" executables to run the test, as shown below
>
> ```
> application> ps -ef | grep phantom
>   501  1021   460   0  9:26PM ttys000    0:00.71 java -jar openfile.jar
> /usr/local/Cellar/phantomjs/2.1.0/bin/phantomjs
>   501  1073  1062   0  9:26PM ttys000    0:00.08
> /usr/local/Cellar/phantomjs/2.1.0/bin/phantomjs --webdriver=36418
> --webdriver-logfile=/Users/sgoeschl/work/github/cgamache/openfile/target/phantomjsdriver.log
> ```
>
> So commons-exec is used under the hood to run the "phantomjs" binary and
> make sure nothing hangs (output & error thread pumping, watchdog, ..)
>
> When the "WebDriver" has finished its work there
>
> * is no ***phantomjs*** process running
> * is only your program running as shown below
>
> ```
> application> ps -ef | grep phantom
>   501  1021   460   0  9:23PM ttys000    0:00.72 java -jar openfile.jar
> /usr/local/Cellar/phantomjs/2.1.0/bin/phantomjs
>   501  1034   469   0  9:23PM ttys001    0:00.00 grep phantom
> ```
>
> And as described we have a couple of ports in CLOSE_WAIT stemming from
> your program
>
> ```
> application> lsof -p 1021 | grep CLOSE_WAIT
> java    1021 sgoeschl   16u    IPv6 0x78418468470e25c7       0t0     TCP
> localhost:50581->localhost:40560 (CLOSE_WAIT)
> java    1021 sgoeschl   18u    IPv6 0x784184685efd95a7       0t0     TCP
> localhost:50599->localhost:45319 (CLOSE_WAIT)
> java    1021 sgoeschl   19u    IPv6 0x784184685efdc0a7       0t0     TCP
> localhost:50620->localhost:17950 (CLOSE_WAIT)
> java    1021 sgoeschl   20u    IPv6 0x78418468470e3087       0t0     TCP
> localhost:50641->localhost:41916 (CLOSE_WAIT)
> java    1021 sgoeschl   21u    IPv6 0x78418468470e4b67       0t0     TCP
> localhost:50661->localhost:25113 (CLOSE_WAIT)
> ```
>
> As far as commons-exec is concerned everything is fine - the
> ***phantomjs*** processes were nicely executed and no process is left
>
> So you have 5 ports in limbo, which exactly corresponds to the number of
> created ***phantomjs*** instances
>
> ```
> INFO: executable: /usr/local/Cellar/phantomjs/2.1.0/bin/phantomjs
> INFO: executable: /usr/local/Cellar/phantomjs/2.1.0/bin/phantomjs
> INFO: executable: /usr/local/Cellar/phantomjs/2.1.0/bin/phantomjs
> INFO: executable: /usr/local/Cellar/phantomjs/2.1.0/bin/phantomjs
> INFO: executable: /usr/local/Cellar/phantomjs/2.1.0/bin/phantomjs
> ```
>
> This nicely corresponds to your test code
>
> ```
> public static final int REPS = 5;
>
> for (; i < REPS; i++) {
>     WebDriver driver = new PhantomJSDriver(caps);
>     driver.get("http://www.apache.org");
>     driver.quit();
> }
> ```
>
> In other words each ***phantomjs*** process leaves exactly one port behind
> :-)
>
>
> ### 2. Second Take
>
> So where is this port coming from? In the next session, starting the test
> program again leaves 5 ports behind
>
> ```
> application> lsof -p 1329 | grep CLOSE_WAIT
> java    1329 sgoeschl   16u    IPv6 0x784184686037bb07       0t0     TCP
> localhost:51358->localhost:30177 (CLOSE_WAIT)
> java    1329 sgoeschl   18u    IPv6 0x784184685f5c3b47       0t0     TCP
> localhost:51379->localhost:cadkey-tablet (CLOSE_WAIT)
> java    1329 sgoeschl   19u    IPv6 0x784184686037f0c7       0t0     TCP
> localhost:51400->localhost:43648 (CLOSE_WAIT)
> java    1329 sgoeschl   20u    IPv6 0x784184686037c5c7       0t0     TCP
> localhost:51418->localhost:34505 (CLOSE_WAIT)
> java    1329 sgoeschl   21u    IPv6 0x78418468470e1047       0t0     TCP
> localhost:51438->localhost:30905 (CLOSE_WAIT)
> ```
>
> All of the ports are actually found in phantomjsdriver.log (cadkey-tablet
> is actually port 1400)
>
> ```
> [INFO  - 2016-01-26T20:51:19.697Z] GhostDriver - Main - running on port
> 30177
> [INFO  - 2016-01-26T20:51:21.685Z] GhostDriver - Main - running on port
> 1400
> [INFO  - 2016-01-26T20:51:23.427Z] GhostDriver - Main - running on port
> 43648
> [INFO  - 2016-01-26T20:51:25.156Z] GhostDriver - Main - running on port
> 34505
> [INFO  - 2016-01-26T20:51:26.887Z] GhostDriver - Main - running on port
> 30905
> ```
>
> So the ports are actually used/opened for ***GhostDriver*** and left there
> even when ***driver.quit()*** was called
>
>
> ### 3. Conclusion
>
> * AFAIK it is not a commons-exec issue since all ***phantomjs*** processes
> were cleanly terminated
> * Each ***PhantomJSDriver*** invocation leaves a port in ***CLOSE_WAIT***
> behind
> * Those ports are linked to ***GhostDriver - Main - running on port
> 30177***
>
>
> Cheers,
>
> Siegfried Goeschl
>
>
>
> ----- Original Message -----
> From: "Chris Gamache" <cgamache@gmail.com>
> To: "Commons Users List" <user@commons.apache.org>
> Sent: Tuesday, January 26, 2016 04:02:41
> Subject: Re: Help needed: commons-exec CLOSE_WAIT problem
>
> Most definitely. This was tested on OSX, but should work properly on Linux
> also. You'll need to install phantomjs.
>
> You can clone this repo (https://github.com/cgamache/openfile) and build
> with Maven. The com.codeborne:phantomjsdriver artifact has all of selenium
> packaged inside.
>
> To run,
>
> $ java -jar openfile.jar /path/to/phantomjs
>
> To examine,
>
> $ lsof -p 38270 | grep CLOSE_WAIT | wc -l
>
> Jan 25, 2016 9:46:17 PM
> org.openqa.selenium.phantomjs.PhantomJSDriverService <init>
>
> ....... super chatty........
>
> [INFO  - 2016-01-26T02:46:41.169Z] ShutdownReqHand - _handle - About to
> shutdown
>
>
> Sleeping for 30 seconds so you can examine file handles... you should see 5
>
> Then you can use the lsof command at the top of the output, complete with
> process ID, to check the CLOSE_WAIT file handles.
>
> Thanks for taking a look!
>
> CG
>
>
>
> On Mon, Jan 25, 2016 at 5:28 PM, Siegfried Göschl <
> siegfried.goeschl@it20one.com> wrote:
>
> > Hi Chris,
> >
> > there could be couple of reasons for this behaviour - is there a minimal
> > setup to reproduce the problem?
> >
> > Thanks in advance
> >
> > Siegfried Goeschl
> >
> > ----- Original Message -----
> > From: "Chris Gamache" <cgamache@gmail.com>
> > To: user@commons.apache.org
> > Sent: Monday, January 25, 2016 22:34:34
> > Subject: Help needed: commons-exec CLOSE_WAIT problem
> >
> > Hi commons-exec folks,
> >
> > Hoping you can help me figure this out. Selenium Java uses commons-exec
> > 1.3 under the hood to communicate with phantomjs. When it fires up it
> > opens up a pipe that you can see with lsof:
> >
> > # lsof -p 19947
> >
> > ...
> >  java    19947 user   64     PIPE 0xd2cd00ccca85f9d     16384
> > ->0xd2cd00ca9fbbf9d
> >  java    19947 user   66     PIPE 0xd2cd00ca31d445d     16384
> > ->0xd2cd00ca31d4c9d
> >
> > Then we get this in lsof as selenium is driving phantomjs:
> >
> >  java    19947 user   62u    IPv6 0xd2cd00cc904879d       0t0      TCP
> > localhost:49757->localhost:23795 (ESTABLISHED)
> >
> > Then after selenium closes and terminates the executor-- properly, as I
> > observed by stepping through the code as it executes, but maybe someone
> > knows otherwise-- we can see in lsof:
> >
> >  java    19947 user  62u    IPv6 0xd2cd00cc904879d       0t0      TCP
> > localhost:49757->localhost:23795 (CLOSE_WAIT)
> >
> > And each successive web driver instance that gets created causes those to
> > build up and build up and build up until you run out of file handles.
> >
> > ... So are there any considerations for using commons-exec that the
> > selenium folks might not be addressing during process destruction which
> > might manifest themselves in these file handles just hanging out, taking
> > up space? This can't be the correct/uncorrectable behavior.
> >
> > I happen to be using Selenium Java 2.49.1 which is the latest version as
> > of this moment, and Java 8. It seems like this has been broken for quite
> > some time though -- https://github.com/SeleniumHQ/selenium/issues/1080
> >
> > Please advise! Thanks!
> >
> > CG
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: user-unsubscribe@commons.apache.org
> > For additional commands, e-mail: user-help@commons.apache.org
> >
> >
