couchdb-dev mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Query server performance issues ..
Date Sun, 20 Sep 2009 22:23:20 GMT
On Sun, Sep 20, 2009 at 1:34 AM, Debasish Ghosh
<ghosh.debasish@gmail.com> wrote:
> Chris -
> In my query server code, I logged everything that gets exchanged between the
> couchdb server process and the query server. The difference that I noticed
> with the new changes is that the couchdb server sends a huge number of null
> strings to the view server which chokes the latter. In the snippet that I
> wrote before ..
>
> while (true) {
>   s = inputstreamreader.readLine  // this reads from stdin
>   if (s == null) // ignore
>   else
>   toJson(s) match {
>     //.. process reset, add_fun etc.
>   }
> }
>

Does inputstreamreader.readLine refer to this function:

http://java.sun.com/j2se/1.5.0/docs/api/java/io/BufferedReader.html#readLine%28%29

If so, and that's returning null, then is it signaling that CouchDB
has tried to close the input stream?
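
For reference, here is a minimal sketch of a read loop that treats a null from readLine as end of stream instead of skipping it. It assumes the reader is a java.io.BufferedReader, as in the link above. ReadLoopSketch, serve and handle are hypothetical names, and handle only stands in for the real reset/add_fun dispatch.

```scala
import java.io.{BufferedReader, StringReader}

object ReadLoopSketch {
  // Hypothetical stand-in for the query server's command dispatch
  // (reset, add_fun, ...). It only echoes the line it was given.
  def handle(line: String): String = "handled: " + line

  // Reads lines until readLine returns null, which BufferedReader uses
  // to signal end of stream, and collects one reply per line.
  def serve(in: BufferedReader): List[String] = {
    val replies = List.newBuilder[String]
    var line = in.readLine()
    while (line != null) {
      replies += handle(line)
      line = in.readLine()
    }
    replies.result()
  }

  def main(args: Array[String]): Unit = {
    // A StringReader simulates stdin here so the sketch runs on its own.
    val input = "[\"reset\"]\n[\"add_fun\", \"...\"]\n"
    val in = new BufferedReader(new StringReader(input))
    serve(in).foreach(println)
    // Once the stream is exhausted, further calls return null right away.
    assert(in.readLine() == null)
  }
}
```

If the stream really has been closed, readLine keeps returning null without blocking, so a loop that ignores null and continues would spin. That might explain a log that grows by megabytes in moments.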

Paul

> I put logs in the true branch of if (s == null) and moments later I found a
> log created of size 10 MB where the view server gets null strings from
> stdin. This may give some clues towards the problem.
>
> Hope this helps.
> - Debasish
>
>
>
>
>
> On Sun, Sep 20, 2009 at 10:56 AM, Chris Anderson <jchris@apache.org> wrote:
>
>> On Sat, Sep 19, 2009 at 10:09 PM, Debasish Ghosh
>> <ghosh.debasish@gmail.com> wrote:
>> > Yes, actually the reason I brought it up is that the same query server
>> > runs fine with the earlier version, while it stumbles with the changes
>> > incorporated later. Actually there is a really big difference in
>> > performance, which is primarily because of the timeouts. Thanks for
>> > deciding to look into it. I will currently stick with the April
>> > snapshot. Please post your findings on this list - I will be happy to
>> > upgrade to the latest.
>> > Thanks.
>> > - Debasish
>>
>> I think what we'll need is a way to get visibility between the beam
>> process and the query server. This could be accomplished with a simple
>> log wrapper around the query server, logging both stdin and stdout to
>> individual files.
>>
>> I like the idea of implementing it as a wrapper because then we can
>> wrap it around the Scala as well as the JS query server (and other
>> languages), and get complete transparency into what's going over the
>> wire.
>>
>> This is definitely turning into dev@ territory so I'm moving this thread
>> there.
>>
>> Chris
>>
>> >
>> > On Sun, Sep 20, 2009 at 3:41 AM, Chris Anderson <jchris@apache.org> wrote:
>> >
>> >> On Sat, Sep 19, 2009 at 11:40 AM, Debasish Ghosh
>> >> <ghosh.debasish@gmail.com> wrote:
>> >> > Here are some additional behavior changes that I am noticing between
>> >> > the 2 versions ..
>> >>
>> >> The other big change is in couch_os_process, the addition of
>> >> couchspawnkillable - maybe this is acting up on your system.
>> >>
>> >> Partially I'm interested in getting to the bottom of this because it
>> >> could be that it's inefficient with the JS query server, but not
>> >> causing errors, and we just haven't noticed.
>> >>
>> >> > In the newer version, I notice lots of null strings being sent
>> >> > continuously from the couchdb server to the view server. My view
>> >> > server loop looks like the following :-
>> >> >
>> >> > while (true) {
>> >> >  s = inputstreamreader.readLine
>> >> >  toJson(s) match {
>> >> >    //.. process reset, add_fun etc.
>> >> >  }
>> >> > }
>> >> >
>> >> > With the new version, I find lots of null strings coming in to "s",
>> >> > which makes me include something like the following ..
>> >> >
>> >> > while (true) {
>> >> >  s = inputstreamreader.readLine
>> >> >  if (s == null) // ignore
>> >> >  else
>> >> >  toJson(s) match {
>> >> >    //.. process reset, add_fun etc.
>> >> >  }
>> >> > }
>> >> >
>> >> > And this null business is really huge. Has there been any change in
>> >> > the protocol between the couchdb server and the view server? I suspect
>> >> > that these null exchanges are taking up lots of cycles, which results
>> >> > in process timeouts in the new version. I do not get this null stuff
>> >> > with the older version. Is there any chance of this happening with the
>> >> > changes that have been done in couch_query_servers.erl?
>> >> >
>> >> > Thanks.
>> >> > - Debasish
>> >> >
>> >> >
>> >> > On Sat, Sep 19, 2009 at 11:34 PM, Debasish Ghosh
>> >> > <ghosh.debasish@gmail.com> wrote:
>> >> >
>> >> >> actually my ["reset"] is not expensive at all .. it just has an
>> >> >> array.clear kind of call.
>> >> >> Just another observation: when I run in debug mode I find that there
>> >> >> are quite a few cases of OS Process Error
>> >> >> {os_process_error, "OS process timed out."} being recorded in
>> >> >> couch.log. I do not get this when I am running the earlier version.
>> >> >> However, nothing unusual appears in couchdb.stderr. My current
>> >> >> setting of os_process_timeout is 20000 .. I guess that's 20 secs ..
>> >> >>
>> >> >> Thanks.
>> >> >> - Debasish
>> >> >>
>> >> >>
>> >> >> On Sat, Sep 19, 2009 at 10:27 PM, Chris Anderson <jchris@apache.org>
>> >> >> wrote:
>> >> >>
>> >> >>> On Sat, Sep 19, 2009 at 5:13 AM, Debasish Ghosh
>> >> >>> <ghosh.debasish@gmail.com> wrote:
>> >> >>> > Hi -
>> >> >>> > As I have mentioned previously I have been working on a Scala
>> >> >>> > driver for CouchDB, which also includes a Query Server. I was
>> >> >>> > working with an April snapshot of 2009/04/23. This worked fine for
>> >> >>> > all the views and validations that I have written. Things were
>> >> >>> > running fine and I could write map/reduce and validation functions
>> >> >>> > in Scala.
>> >> >>> > Recently I tried to upgrade to trunk. Suddenly the views and
>> >> >>> > validations became very very slow. After some fact finding, I tried
>> >> >>> > to poke into *couch_query_servers.erl*, since that seemed to be the
>> >> >>> > obvious area to look into. I may be wrong though, but it was a blind
>> >> >>> > guess.
>> >> >>> > I noticed that previously I was working with *revision 749852* of
>> >> >>> > the file, which delivered the goods for me. Then when I faced
>> >> >>> > problems with the trunk, I started doing a git reset to earlier
>> >> >>> > versions of this file. Now I find that it looks like the performance
>> >> >>> > problem starts from *revision 780165* of this file. Have a look at
>> >> >>> > http://svn.apache.org/viewvc/couchdb/trunk/src/couchdb/couch_query_servers.erl?r1=780165&r2=749852&diff_format=h
>> >> >>> > for the difference. Looks like there have been some major changes. I
>> >> >>> > am just wondering if this change has anything to do with the
>> >> >>> > performance issue.
>> >> >>> >
>> >> >>>
>> >> >>> A quick scan of that diff suggests that the only real behavior
>> >> >>> change that should affect you is the ["reset"] call for recycled
>> >> >>> processes. Maybe reset is expensive in your implementation?
>> >> >>>
>> >> >>> BTW, have you tried running:
>> >> >>>
>> >> >>> spec test/query_server_spec.rb -f specdoc --color
>> >> >>>
>> >> >>> It should be simple to extend that test suite to test your Scala
>> >> >>> server. If there are patches we can make to make it easier to
>> >> >>> integrate outside projects with the query server test suite, I'm
>> >> >>> happy to help there as well.
>> >> >>>
>> >> >>> > Any help or pointers will be appreciated.
>> >> >>> >
>> >> >>> > Thanks.
>> >> >>> > - Debasish
>> >> >>> >
>> >> >>>
>> >> >>>
>> >> >>>
>> >> >>> --
>> >> >>> Chris Anderson
>> >> >>> http://jchrisa.net
>> >> >>> http://couch.io
>> >> >>>
>> >> >>
>> >> >>
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Chris Anderson
>> >> http://jchrisa.net
>> >> http://couch.io
>> >>
>> >
>>
>>
>>
>> --
>> Chris Anderson
>> http://jchrisa.net
>> http://couch.io
>>
>
