couchdb-user mailing list archives

From Paul Davis <paul.joseph.da...@gmail.com>
Subject Re: Query server performance issues ..
Date Sun, 20 Sep 2009 07:54:24 GMT
Tom,

Yup. That's git bisect in action. Though I don't see the behavior with
the JS view server on trunk.

Paul

On Sun, Sep 20, 2009 at 3:48 AM, Tom Sante <tom.sante@gmail.com> wrote:
> On Sun, Sep 20, 10:39, Debasish Ghosh wrote:
>> Yes, actually the reason I brought it up is that the same query server runs
>> fine with the earlier version, while it stumbles with the changes
>> incorporated later. Actually there is a really really big difference in
>> performance which is primarily because of the timeouts. Thanks for deciding
>> to look into it. I will currently stick around with the April
>> snapshot. Please post your findings on this list - I will be happy to upgrade
>> to the latest.
>> Thanks.
>> - Debasish
>>
>> On Sun, Sep 20, 2009 at 3:41 AM, Chris Anderson <jchris@apache.org> wrote:
>>
>> > On Sat, Sep 19, 2009 at 11:40 AM, Debasish Ghosh
>> > <ghosh.debasish@gmail.com> wrote:
>> > > Here are some additional behavior changes that I am noticing between the
>> > > 2 versions ..
>> >
>> > The other big change is in couch_os_process, the addition of
>> > couchspawnkillable - maybe this is acting up on your system.
>> >
>> > Partially I'm interested in getting to the bottom of this because it
>> > could be that it's inefficient with the JS query server, but not
>> > causing errors, and we just haven't noticed.
>> >
>> > > In the newer version, I notice lots of null strings being sent continuously
>> > > from the couchdb server to the view server. My view server loop looks like
>> > > the following :-
>> > >
>> > > while (true) {
>> > >  s = inputstreamreader.readLine
>> > >  toJson(s) match {
>> > >    //.. process reset, add_fun etc.
>> > >  }
>> > > }
>> > >
>> > > With the new version, I find lots of null strings coming in to "s", which
>> > > makes me include something like the following ..
>> > >
>> > > while (true) {
>> > >  s = inputstreamreader.readLine
>> > >  if (s == null) // ignore
>> > >  else
>> > >  toJson(s) match {
>> > >    //.. process reset, add_fun etc.
>> > >  }
>> > > }
>> > >
>> > > And this null business is really huge. Has there been any change in the
>> > > protocol between the couchdb server and the view server ? I suspect that
>> > > these null exchanges are taking up lots of cycles which result in process
>> > > time out in the new version. I do not get this null stuff with the older
>> > > version. Is there any chance of such happening with the changes that have
>> > > been done in couch_query_servers.erl ?
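A note on the null reads described above: if `inputstreamreader` is a `java.io.BufferedReader` (an assumption; the thread does not say), `readLine` returns null only at end of stream, so a run of nulls would suggest that the pipe to the view server has been closed rather than that empty messages are being sent. A minimal sketch of a loop that stops at that point instead of spinning, where `run` and `handle` are hypothetical names and `handle` stands in for the `toJson(s) match { ... }` dispatch:

```scala
import java.io.{BufferedReader, StringReader}
import scala.collection.mutable.ListBuffer

object ViewServerLoop {
  // Reads one line at a time and passes it to `handle` (hypothetical stand-in
  // for the toJson(s) match { ... } dispatch in the loop quoted above).
  // Stops when readLine returns null, which BufferedReader does at end of stream.
  def run(in: BufferedReader)(handle: String => Unit): Unit = {
    var line = in.readLine()
    while (line != null) {
      handle(line)
      line = in.readLine()
    }
  }
}
```

Whether trunk really closes the pipe earlier than the April snapshot is not established in this thread; the sketch only shows how to stop reading once end of stream is reported.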
>> > >
>> > > Thanks.
>> > > - Debasish
>> > >
>> > >
>> > > On Sat, Sep 19, 2009 at 11:34 PM, Debasish Ghosh
>> > > <ghosh.debasish@gmail.com> wrote:
>> > >
>> > >> actually my ["reset"] is not expensive at all .. it just has an array.clear
>> > >> kind of call.
>> > >> Just another observation: when I run in debug mode I find that there are
>> > >> quite a few cases of OS Process Error
>> > >> {os_process_error, "OS process timed out."} being recorded in couch.log.
>> > >> I do not get this when I am running the earlier version. However no
>> > >> unnatural things appear in couchdb.stderr. My current setting of
>> > >> os_process_timeout is 20000 .. I guess that's 20 secs ..
>> > >>
>> > >> Thanks.
>> > >> - Debasish
>> > >>
>> > >>
>> > >> On Sat, Sep 19, 2009 at 10:27 PM, Chris Anderson <jchris@apache.org> wrote:
>> > >>
>> > >>> On Sat, Sep 19, 2009 at 5:13 AM, Debasish Ghosh
>> > >>> <ghosh.debasish@gmail.com> wrote:
>> > >>> > Hi -
>> > >>> > As I have mentioned previously I have been working on a Scala driver for
>> > >>> > CouchDB, which also includes a Query Server. I was working with an April
>> > >>> > snapshot of 2009/04/23. This worked fine for all the views and validations
>> > >>> > that I have written. Things were running fine and I could write map/reduce
>> > >>> > and validation functions in Scala.
>> > >>> > Recently I tried to upgrade to trunk. Suddenly the views and validations
>> > >>> > became very very slow. After some fact finding, I tried to poke into
>> > >>> > *couch_query_servers.erl*, since that seemed to be the obvious area to look
>> > >>> > into. I may be wrong though, but it was a blind guess.
>> > >>> > I noticed that previously I was working with *revision 749852* of the file,
>> > >>> > which delivered the goods for me. Then when I faced problems with the trunk,
>> > >>> > I started doing a git reset to earlier versions of this file. Now I find
>> > >>> > that it looks like the performance problem starts from *revision 780165* of
>> > >>> > this file. Have a look at
>> > >>> > http://svn.apache.org/viewvc/couchdb/trunk/src/couchdb/couch_query_servers.erl?r1=780165&r2=749852&diff_format=h
>> > >>> > for the difference. Looks like there have been some major changes. I am just
>> > >>> > wondering if this change has anything to do with the performance issue.
>> > >>> >
>> > >>>
>> > >>> A quick scan of that diff suggests that the only real behavior change
>> > >>> that should affect you is the ["reset"] call for recycled processes.
>> > >>> Maybe reset is expensive in your implementation?
>> > >>>
>> > >>> BTW, have you tried running:
>> > >>>
>> > >>> spec test/query_server_spec.rb -f specdoc --color
>> > >>>
>> > >>> It should be simple to extend that test suite to test your scala
>> > >>> server. If there are patches we can make to make it easier to
>> > >>> integrate outside projects with the query server test suite, I'm happy
>> > >>> to help there as well.
>> > >>>
>> > >>> > Any help, pointer will be appreciated.
>> > >>> >
>> > >>> > Thanks.
>> > >>> > - Debasish
>> > >>> >
>> > >>>
>> > >>>
>> > >>>
>> > >>>
>> > >>
>> > >>
>> > >
>> >
>> >
>> >
>> >
>
> Just my 2 cents:
> To further isolate which change causes this issue, you could do
> incremental upgrades of your trunk starting from where it was fine,
> and each time compile and test until you come across the first
> rev. that "breaks" and shows the view server issues
> (like regression testing).
> This could be easier than manually reviewing each code change since April
> that could cause this.
>
> Tom
>
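
The incremental approach Tom describes can be shortened by halving the revision range at each step, which is what git bisect automates. A minimal sketch of that search, assuming the revisions are ordered oldest to newest, the last one is known to be bad, and every revision after the first bad one is also bad. `firstBad` and `isBad` are hypothetical names; `isBad` stands in for "build this revision and run the view tests":

```scala
object Bisect {
  // Returns the first revision for which `isBad` holds.
  // Assumes revs is ordered oldest to newest, revs.last is bad,
  // and badness is monotone (once bad, all later revisions are bad).
  def firstBad[A](revs: IndexedSeq[A])(isBad: A => Boolean): A = {
    require(revs.nonEmpty, "need at least one revision")
    var lo = 0               // every revision before lo is known good
    var hi = revs.length - 1 // revs(hi) is known (or assumed) bad
    while (lo < hi) {
      val mid = lo + (hi - lo) / 2
      if (isBad(revs(mid))) hi = mid // first bad one is at mid or earlier
      else lo = mid + 1              // first bad one is after mid
    }
    revs(lo)
  }
}
```

For n revisions between the good snapshot and trunk this needs about log2(n) build-and-test cycles instead of up to n.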
