couchdb-user mailing list archives

From Cory Zue <>
Subject Re: Slow filtered _changes feed
Date Wed, 20 Oct 2010 16:26:27 GMT
Thanks for the suggestions, all.  I took a look at memcached - turns
out it's pretty good at this type of thing.  :)  Looks like it works
fine for my needs.
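[Editor's note: the memcached approach Cory settled on can be sketched as caching prepared sync payloads keyed by user and last-known sequence. This is a hypothetical illustration, not code from the thread; a plain dict stands in for a real memcached client, and `fetch_filtered_changes` stands in for the expensive filtered _changes call.]

```python
# Sketch of caching prepared sync payloads (illustrative names; a dict
# stands in for memcached, fetch_filtered_changes for the couch call).
cache = {}

def fetch_filtered_changes(user, since):
    # Stand-in for the slow filtered GET /db/_changes request.
    return {"results": [f"doc-for-{user}"], "last_seq": since + 1}

def get_sync_payload(user, since):
    key = f"changes:{user}:{since}"
    if key not in cache:
        # Cache miss: pay the filtered _changes cost once per (user, since).
        cache[key] = fetch_filtered_changes(user, since)
    return cache[key]
```

A retried phone sync with the same token then hits the cache instead of re-scanning the feed.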


On Wed, Oct 20, 2010 at 11:16 AM, Simon Metson
<> wrote:
> Hi,
>        That will improve things, but it's still potentially got to skip
> through a lot of records to return limit number of records. Say you have a
> very prolific user and one who is less active. The prolific guy has 1000
> messages for every one the laid back guy gets. If you're filtering by user
> you're going to have to go through ~10000 records to return the less active
> user 10 documents (assuming I'm understanding _changes right...). The
> problem is not getting the 10 documents out but skipping the 10000 documents
> that don't match the filter.
> If the records are split across databases (one per user) you'll only hit the
> relevant changes to your user. Of course that might not be possible for your
> use case....
> Cheers
> Simon
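[Editor's note: Simon's point is that a filtered _changes feed must still scan every record, matching or not. A small simulation with his illustrative numbers (1000 prolific docs per quiet doc) shows the skip cost; the feed and function here are made up for the sketch.]

```python
# Illustrative feed: 10 quiet-user docs interleaved among 10,000 prolific ones.
feed = (["prolific"] * 1000 + ["quiet"]) * 10

def filtered_changes(feed, user, limit):
    # A filter can only reject records after looking at them, so every
    # non-matching record is still scanned.
    matched, scanned = [], 0
    for doc_user in feed:
        scanned += 1
        if doc_user == user:
            matched.append(doc_user)
            if len(matched) == limit:
                break
    return matched, scanned

docs, scanned = filtered_changes(feed, "quiet", 10)
# Returning the quiet user's 10 docs forces a scan of ~10,000 records,
# which is why one-database-per-user avoids the problem entirely.
```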
> On 20 Oct 2010, at 14:31, wrote:
>> Hello,
>> Perhaps you can use the limit option together with the since option to
>> retrieve the changes feed. That way, whether the application is doing a
>> first-time initialization or resuming from the last known sync, your code
>> will always be the same:
>> 1/ retrieve N change entries starting from sequence S (S = 0 the very first
>> time)
>> 2/ did you get N changes in the CouchDB response? If so, repeat 1, setting S
>> to the last seq number of the CouchDB response.
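[Editor's note: the two steps above can be sketched as a small pagination loop. `get_changes` here is a stand-in for `GET /db/_changes?since=S&limit=N`, faked with a 25-entry feed so the sketch is self-contained; it is not a real CouchDB client.]

```python
# Stand-in for GET /db/_changes?since=S&limit=N against a 25-entry feed.
def get_changes(since, limit, total=25):
    results = [{"seq": s} for s in range(since + 1, min(since + limit, total) + 1)]
    last_seq = results[-1]["seq"] if results else since
    return {"results": results, "last_seq": last_seq}

def sync_all(limit=10):
    seq, seen = 0, []                       # S = 0 the very first time
    while True:
        resp = get_changes(seq, limit)      # step 1: fetch N changes since S
        seen.extend(resp["results"])
        if len(resp["results"]) < limit:    # fewer than N back: caught up
            return seen
        seq = resp["last_seq"]              # step 2: repeat from last_seq
```

The same loop serves both first-time initialization and incremental sync; only the starting S differs.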
>> Regards,
>> Mickael
>> ----- Original Mail -----
>> From: "Cory Zue" <>
>> To: "user" <>
>> Sent: Wednesday, 20 October 2010 14:06:59 GMT +01:00 Amsterdam / Berlin /
>> Berne / Rome / Stockholm / Vienna
>> Subject: Re: Slow filtered _changes feed
>> Thanks Simon,
>> On Wed, Oct 20, 2010 at 6:43 AM, Simon Metson
>> <> wrote:
>>> Hi,
>>> One thought: do you query the last N changes or the whole feed? If I
>>> apply a simple filter to my test database it is slow to get the full
>>> result, but it's relatively fast to skip to the more recent changes
>>> (past the ones I've already consumed) and go from there.
>> When the phones sync they provide a token containing information about
>> the last known sync, which we use to skip ahead in the changes feed.
>> However, on a first-time initialization the phone processes the
>> entire feed, and if this doesn't succeed then subsequent attempts likely
>> won't either.
>>> Also, the _changes feed streams the documents down to the client - can
>>> your
>>> client/server deal with a streaming response?
>> It doesn't yet, although we could theoretically add this.
>> My latest plan is basically to save a copy of all the relevant
>> information in a couch doc after every attempted sync.  In that case
>> the first operation would still time out, but if the client retried, all
>> the relevant doc ids could be retrieved from that document and only a
>> small update would have to be applied.  Since this is only likely to be a
>> problem when a long time has passed between syncs, I think it could work
>> ok (definitely not ideal, though).  Does this seem sane?
>>> Cheers
>>> Simon
>>> On 20 Oct 2010, at 03:44, Cory Zue wrote:
>>>> Howdy,
>>>> I'm bringing up a problem I chatted about with a few folks on IRC
>>>> today but was unable to solve.  My app is using the _changes feed to
>>>> detect what updates need to go to particular clients (in this case
>>>> cell phones) based on some filtered information the phones send up in
>>>> the sync request.  The flow looks something like:
>>>> Phone ---HTTP POST---> Server
>>>> Server ---filtered _changes---> CouchDB
>>>> [Server prepares couch results for phone]
>>>> Server ---Data Payload---> Phone
>>>> All of the above represents a single HTTP POST and response between
>>>> the phone and server.
>>>> The problem I am seeing is that hitting the _changes feed from the
>>>> server is prohibitively slow, and these requests are timing out before
>>>> the server can send data back down to the phone.
>>>> I was led to believe on IRC that changing my filter from javascript to
>>>> erlang would drastically increase performance, but I'm not observing
>>>> this at all.  In fact the erlang version seems slightly slower.
>>>> I set up an erlang view server following these instructions:
>>>> Am I missing something?  Is my erlang so bad as to negate the
>>>> performance gain from switching over?  Was I lied to?  Is my
>>>> whole approach dumb and do I need to implement filtered caching inside
>>>> my server and outside of couch?
>>>> Any thoughts or feedback would be much appreciated.
>>>> thanks,
>>>> Cory
