couchdb-dev mailing list archives

From Chris Anderson <jch...@apache.org>
Subject Re: Query server performance issues ..
Date Tue, 22 Sep 2009 17:28:10 GMT
On Tue, Sep 22, 2009 at 10:11 AM, Debasish Ghosh
<ghosh.debasish@gmail.com> wrote:
>> It may be that we're flushing the socket with no data, and the Scala
>> server is interpreting that as null input. The JS client uses
>> readline() implemented in C, so it shouldn't have access to data until
>> a line break has been sent by CouchDB.
>
> readLine blocks .. right .. and only comes out with the null input.
> The question is how it gets this null string with the new version of
> CouchDB.
> Is there something different that you were doing in earlier versions?
> Just wondering how it still runs with an earlier snapshot of CouchDB
> ..
>

I still think we'd learn a lot with a little shell wrapper, maybe the
unix tee command, so we can log both input and output streams.
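
If tee turns out to be awkward to set up, the same idea can be sketched as a tiny wrapper in Scala. This is only a rough sketch, not anything that ships with CouchDB: the object name QueryServerTee, the pump helper and the two log paths under /tmp are made up for illustration, and it assumes the real query server command is passed as the wrapper's arguments.

```scala
import java.io.{BufferedReader, BufferedWriter, FileWriter, InputStream,
  InputStreamReader, OutputStream, OutputStreamWriter, PrintWriter}

// Hypothetical logging wrapper; names and log paths are assumptions.
object QueryServerTee {

  // Copy `in` to `out` line by line, appending every line to `log`.
  // Returns once `in` hits end of stream (readLine returns null),
  // after recording an <EOF> marker in the log.
  def pump(in: InputStream, out: OutputStream, log: PrintWriter): Unit = {
    val reader = new BufferedReader(new InputStreamReader(in, "UTF-8"))
    val writer = new BufferedWriter(new OutputStreamWriter(out, "UTF-8"))
    var line = reader.readLine()
    while (line != null) {
      log.println(line)
      log.flush()
      writer.write(line)
      writer.write("\n")
      writer.flush() // forward immediately; the protocol is line-oriented
      line = reader.readLine()
    }
    log.println("<EOF>")
    log.flush()
  }

  def main(args: Array[String]): Unit = {
    // args is assumed to be the real query server command line
    val child = new ProcessBuilder(args: _*)
      .redirectError(ProcessBuilder.Redirect.INHERIT) // pass stderr through
      .start()
    val inLog = new PrintWriter(new FileWriter("/tmp/qs-stdin.log", true))
    val outLog = new PrintWriter(new FileWriter("/tmp/qs-stdout.log", true))

    // CouchDB -> child, on a background thread
    val toChild = new Thread(new Runnable {
      def run(): Unit = {
        pump(System.in, child.getOutputStream, inLog)
        child.getOutputStream.close() // propagate end of input to the child
      }
    })
    toChild.setDaemon(true)
    toChild.start()

    // child -> CouchDB, on the main thread
    pump(child.getInputStream, System.out, outLog)
    sys.exit(child.waitFor())
  }
}
```

The idea would be to point the query server entry in the config at the wrapper instead of the real command. An <EOF> line in the stdin log would then show exactly when the pipe from CouchDB was closed.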

In answer to your question about what has changed in the
implementation: a whole lot, but nothing that looks like it would be
causing this.
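
On the null question quoted above: BufferedReader.readLine only returns null at end of stream, so a loop along these lines stops when the pipe is closed instead of spinning. A minimal sketch; ReadLoop and handle are hypothetical names, with handle standing in for the toJson(s) match { ... } dispatch.

```scala
import java.io.BufferedReader

// Hypothetical helper; not part of the Scala driver discussed in this thread.
object ReadLoop {

  // Pass each line to `handle` until end of stream.
  // Returns the number of lines that were handled.
  def run(reader: BufferedReader)(handle: String => Unit): Int = {
    var count = 0
    var line = reader.readLine()
    while (line != null) { // null means the stream is closed: stop, do not retry
      handle(line)
      count += 1
      line = reader.readLine()
    }
    count
  }
}
```

A view server would call something like ReadLoop.run(new BufferedReader(new java.io.InputStreamReader(System.in)))(line => ...) and exit once it returns, since stdin will not be reopened.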

> Thanks.
> - Debasish
>
> On Mon, Sep 21, 2009 at 6:07 PM, Debasish Ghosh
> <ghosh.debasish@gmail.com> wrote:
>>
>> The actual code is something like this ..
>> var s = isr.readLine
>> while (s != null) {
>>     // do stuff
>>     s = isr.readLine
>> }
>> I wrote the other version just to log what I get back. Now this same version works
>> ok with the earlier version of the couchdb server. That's what beats me here ..
>> Thanks.
>> - Debasish
>>
>> On Mon, Sep 21, 2009 at 5:46 PM, Robert Newson <robert.newson@gmail.com> wrote:
>>>
>>> I claim you are ignoring null here because of your comment;
>>>
>>> while (true) {
>>>  s = inputstreamreader.readLine
>>>  if (s == null) // ignore
>>>  else
>>>  toJson(s) match {
>>>   //.. process reset, add_fun etc.
>>>  }
>>> }
>>>
>>> When System.in is closed this loop will spin; readLine() will always
>>> return null. Since System.in is only closed when the JVM is exiting,
>>> it is never correct to ignore it and continue processing.
>>>
>>> The loop I presented is not the same as yours as mine will correctly
>>> exit on process termination.
>>>
>>> readLine() *cannot* return null under any circumstance but the close
>>> of the stream (couchdb cannot pass you null this way). System.in is
>>> never closed unless the process itself is exiting, and it is never
>>> reopened.
>>>
>>> The mishandling of readLine() is probably hiding the real problem. I
>>> would guess you pass invalid JSON to couchdb, or fail to return
>>> anything at all, under some conditions. Couch then kills your view
>>> server (and would then restart it). The view server, rather than
>>> gracefully exiting when this happens, will simply spin, never exiting.
>>>
>>> B.
>>>
>>> On Mon, Sep 21, 2009 at 8:19 AM, Debasish Ghosh
>>> <ghosh.debasish@gmail.com> wrote:
>>> > It's in fact referring to a reader that wraps System.in.
>>> > readLine returns null on end of file, but the earlier version of the
>>> > snapshot handles it and does not close the query server process. While the
>>> > new server seems to get throttled in the while loop. In fact this is one
>>> > difference that I forgot to mention. In the earlier version the query server
>>> > does not close, while in the new version it gets closed and restarted for
>>> > every view operation. Maybe it's getting closed because of the null. I can
>>> > figure that out from the logs. Is this an intentional change in
>>> > implementation ?
>>> > Robert -
>>> > I am not ignoring null. The while loop is very similar to what you mention. I
>>> > switched to the while true version just to log and see if nulls are getting
>>> > returned.
>>> > Thanks.
>>> > - Debasish
>>> >
>>> > On Mon, Sep 21, 2009 at 3:53 AM, Paul Davis <paul.joseph.davis@gmail.com>
>>> > wrote:
>>> >>
>>> >> On Sun, Sep 20, 2009 at 1:34 AM, Debasish Ghosh
>>> >> <ghosh.debasish@gmail.com> wrote:
>>> >> > Chris -
>>> >> > In my query server code, I logged everything that gets exchanged between
>>> >> > the couchdb server process and the query server. The difference that I
>>> >> > noticed with the new changes is that the couchdb server sends a huge number
>>> >> > of null strings to the view server, which chokes the latter. In the snippet
>>> >> > that I wrote before ..
>>> >> >
>>> >> > while (true) {
>>> >> >  s = inputstreamreader.readLine  // this reads from stdin
>>> >> >  if (s == null) // ignore
>>> >> >  else
>>> >> >  toJson(s) match {
>>> >> >    //.. process reset, add_fun etc.
>>> >> >  }
>>> >> > }
>>> >> >
>>> >>
>>> >> Does inputstreamreader.readLine refer to this function:
>>> >>
>>> >>
>>> >> http://java.sun.com/j2se/1.5.0/docs/api/java/io/BufferedReader.html#readLine%28%29
>>> >>
>>> >> If so, and that's returning null, then is it signaling that CouchDB
>>> >> has tried to close the input stream?
>>> >>
>>> >> Paul
>>> >>
>>> >> > I put logs in the true branch of if (s == null) and moments later I
>>> >> > found a log created of size 10 MB where the view server gets null
>>> >> > strings from stdin. This may give some clues towards the problem.
>>> >> >
>>> >> > Hope this helps.
>>> >> > - Debasish
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > On Sun, Sep 20, 2009 at 10:56 AM, Chris Anderson <jchris@apache.org>
>>> >> > wrote:
>>> >> >
>>> >> >> On Sat, Sep 19, 2009 at 10:09 PM, Debasish Ghosh
>>> >> >> <ghosh.debasish@gmail.com> wrote:
>>> >> >> > Yes, actually the reason I brought it up is that the same query
>>> >> >> > server runs fine with the earlier version, while it stumbles with
>>> >> >> > the changes incorporated later. Actually there is a really really
>>> >> >> > big difference in performance which is primarily because of the
>>> >> >> > timeouts. Thanks for deciding to look into it. I will currently
>>> >> >> > stick around with the April snapshot. Please post your findings on
>>> >> >> > this list - I will be happy to upgrade to the latest.
>>> >> >> > Thanks.
>>> >> >> > - Debasish
>>> >> >>
>>> >> >> I think what we'll need is a way to get visibility between the beam
>>> >> >> process and the query server. This could be accomplished with a simple
>>> >> >> log wrapper around the query server, logging both stdin and stdout to
>>> >> >> individual files.
>>> >> >>
>>> >> >> I like the idea of implementing it as a wrapper because then we can
>>> >> >> wrap it around the scala as well as the JS query server (and other
>>> >> >> languages), and get complete transparency into what's going over the
>>> >> >> wire.
>>> >> >>
>>> >> >> This is definitely turning into dev@ territory so I'm moving this
>>> >> >> thread there.
>>> >> >>
>>> >> >> Chris
>>> >> >>
>>> >> >> >
>>> >> >> > On Sun, Sep 20, 2009 at 3:41 AM, Chris Anderson <jchris@apache.org> wrote:
>>> >> >> >
>>> >> >> >> On Sat, Sep 19, 2009 at 11:40 AM, Debasish Ghosh
>>> >> >> >> <ghosh.debasish@gmail.com> wrote:
>>> >> >> >> > Here are some additional behavior changes that I am noticing
>>> >> >> >> > between the 2 versions ..
>>> >> >> >>
>>> >> >> >> The other big change is in couch_os_process, the addition of
>>> >> >> >> couchspawnkillable - maybe this is acting up on your system.
>>> >> >> >>
>>> >> >> >> Partially I'm interested in getting to the bottom of this because it
>>> >> >> >> could be that it's inefficient with the JS query server, but not
>>> >> >> >> causing errors, and we just haven't noticed.
>>> >> >> >>
>>> >> >> >> > In the newer version, I notice lots of null strings being sent
>>> >> >> >> > continuously from the couchdb server to the view server. My view
>>> >> >> >> > server loop looks like the following :-
>>> >> >> >> >
>>> >> >> >> > while (true) {
>>> >> >> >> >  s = inputstreamreader.readLine
>>> >> >> >> >  toJson(s) match {
>>> >> >> >> >    //.. process reset, add_fun etc.
>>> >> >> >> >  }
>>> >> >> >> > }
>>> >> >> >> >
>>> >> >> >> > With the new version, I find lots of null strings coming in to
>>> >> >> >> > "s", which makes me include something like the following ..
>>> >> >> >> >
>>> >> >> >> > while (true) {
>>> >> >> >> >  s = inputstreamreader.readLine
>>> >> >> >> >  if (s == null) // ignore
>>> >> >> >> >  else
>>> >> >> >> >  toJson(s) match {
>>> >> >> >> >    //.. process reset, add_fun etc.
>>> >> >> >> >  }
>>> >> >> >> > }
>>> >> >> >> >
>>> >> >> >> > And this null business is really huge. Has there been any change
>>> >> >> >> > in the protocol between the couchdb server and the view server ? I
>>> >> >> >> > suspect that these null exchanges are taking up lots of cycles,
>>> >> >> >> > which results in process timeouts in the new version. I do not get
>>> >> >> >> > this null stuff with the older version. Is there any chance of this
>>> >> >> >> > happening with the changes that have been done in
>>> >> >> >> > couch_query_servers.erl ?
>>> >> >> >> >
>>> >> >> >> > Thanks.
>>> >> >> >> > - Debasish
>>> >> >> >> >
>>> >> >> >> >
>>> >> >> >> > On Sat, Sep 19, 2009 at 11:34 PM, Debasish Ghosh
>>> >> >> >> > <ghosh.debasish@gmail.com> wrote:
>>> >> >> >> >
>>> >> >> >> >> Actually my ["reset"] is not expensive at all .. it just has an
>>> >> >> >> >> array.clear kind of call.
>>> >> >> >> >> Just another observation: when I run in debug mode I find that
>>> >> >> >> >> there are quite a few cases of OS Process Error
>>> >> >> >> >> {os_process_error, "OS process timed out."} being recorded in
>>> >> >> >> >> couch.log. I do not get this when I am running the earlier
>>> >> >> >> >> version. However no unnatural things appear in couchdb.stderr.
>>> >> >> >> >> My current setting of os_process_timeout is 20000 .. I guess
>>> >> >> >> >> that's 20 secs ..
>>> >> >> >> >>
>>> >> >> >> >> Thanks.
>>> >> >> >> >> - Debasish
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >> On Sat, Sep 19, 2009 at 10:27 PM, Chris Anderson
>>> >> >> >> >> <jchris@apache.org> wrote:
>>> >> >> >> >>
>>> >> >> >> >>> On Sat, Sep 19, 2009 at 5:13 AM, Debasish Ghosh
>>> >> >> >> >>> <ghosh.debasish@gmail.com> wrote:
>>> >> >> >> >>> > Hi -
>>> >> >> >> >>> > As I have mentioned previously I have been working on a Scala
>>> >> >> >> >>> > driver for CouchDB, which also includes a Query Server. I was
>>> >> >> >> >>> > working with an April snapshot of 2009/04/23. This worked fine
>>> >> >> >> >>> > for all the views and validations that I have written. Things
>>> >> >> >> >>> > were running fine and I could write map/reduce and validation
>>> >> >> >> >>> > functions in Scala.
>>> >> >> >> >>> > Recently I tried to upgrade to trunk. Suddenly the views and
>>> >> >> >> >>> > validations became very very slow. After some fact finding, I
>>> >> >> >> >>> > tried to poke into *couch_query_servers.erl*, since that seemed
>>> >> >> >> >>> > to be the obvious area to look into. I may be wrong though, but
>>> >> >> >> >>> > it was a blind guess.
>>> >> >> >> >>> > I noticed that previously I was working with *revision 749852*
>>> >> >> >> >>> > of the file, which delivered the goods for me. Then when I
>>> >> >> >> >>> > faced problems with the trunk, I started doing a git reset to
>>> >> >> >> >>> > earlier versions of this file. Now I find that it looks like
>>> >> >> >> >>> > the performance problem starts from *revision 780165* of this
>>> >> >> >> >>> > file. Have a look at
>>> >> >> >> >>> > http://svn.apache.org/viewvc/couchdb/trunk/src/couchdb/couch_query_servers.erl?r1=780165&r2=749852&diff_format=h
>>> >> >> >> >>> > for the difference. Looks like there have been some major
>>> >> >> >> >>> > changes. I am just wondering if this change has anything to do
>>> >> >> >> >>> > with the performance issue.
>>> >> >> >> >>> >
>>> >> >> >> >>>
>>> >> >> >> >>> A quick scan of that diff suggests that the only real behavior
>>> >> >> >> >>> change that should affect you is the ["reset"] call for recycled
>>> >> >> >> >>> processes. Maybe reset is expensive in your implementation?
>>> >> >> >> >>>
>>> >> >> >> >>> BTW, have you tried running:
>>> >> >> >> >>>
>>> >> >> >> >>> spec test/query_server_spec.rb -f specdoc --color
>>> >> >> >> >>>
>>> >> >> >> >>> It should be simple to extend that test suite to test your scala
>>> >> >> >> >>> server. If there are patches we can make to make it easier to
>>> >> >> >> >>> integrate outside projects with the query server test suite, I'm
>>> >> >> >> >>> happy to help there as well.
>>> >> >> >> >>>
>>> >> >> >> >>> > Any help, pointer will be appreciated.
>>> >> >> >> >>> >
>>> >> >> >> >>> > Thanks.
>>> >> >> >> >>> > - Debasish
>>> >> >> >> >>> >
>>> >> >> >> >>>
>>> >> >> >> >>>
>>> >> >> >> >>>
>>> >> >> >> >>> --
>>> >> >> >> >>> Chris Anderson
>>> >> >> >> >>> http://jchrisa.net
>>> >> >> >> >>> http://couch.io
>>> >> >> >> >>>
>>> >> >> >> >>
>>> >> >> >> >>
>>> >> >> >> >
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> --
>>> >> >> >> Chris Anderson
>>> >> >> >> http://jchrisa.net
>>> >> >> >> http://couch.io
>>> >> >> >>
>>> >> >> >
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> --
>>> >> >> Chris Anderson
>>> >> >> http://jchrisa.net
>>> >> >> http://couch.io
>>> >> >>
>>> >> >
>>> >
>>> >
>>
>



-- 
Chris Anderson
http://jchrisa.net
http://couch.io
