incubator-couchdb-user mailing list archives

From Randall Leeds <randall.le...@gmail.com>
Subject Re: Slow filtered _changes feed
Date Wed, 20 Oct 2010 20:14:11 GMT
Another thing: if you know that your phone does not need old
changes for initialization, you could query the db (GET /dbname), read
its update_seq, and start your first sync from that sequence number.
You may need old data though. I don't know.
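
A minimal sketch of that idea in Python, assuming a CouchDB reachable over
plain HTTP. GET /dbname returns a JSON object whose update_seq field is the
current sequence number. The helper names (http_get_json, current_seq,
changes_url) are made up for illustration and are not part of any CouchDB
client library.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_get_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)


def current_seq(db_url, get_json=http_get_json):
    """Return update_seq from the database info document (GET /dbname)."""
    return get_json(db_url)["update_seq"]


def changes_url(db_url, since, filter_name=None):
    """Build a _changes URL that only reports changes after `since`."""
    params = {"since": since}
    if filter_name:
        # Hypothetical "designdoc/filtername" value; use your own filter.
        params["filter"] = filter_name
    return f"{db_url}/_changes?{urlencode(params)}"
```

A first sync would then call current_seq("http://localhost:5984/phones")
(host and database name are placeholders) and hand the result to
changes_url, so the filter never has to walk the old part of the feed.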

On Wed, Oct 20, 2010 at 09:26, Cory Zue <czue@dimagi.com> wrote:
> Thanks for the suggestions, all.  I took a look at memcached - turns
> out it's pretty good at this type of thing.  :)  Looks like it works
> fine for my needs.
>
> Cory
>
> On Wed, Oct 20, 2010 at 11:16 AM, Simon Metson
> <simonmetson@googlemail.com> wrote:
>> Hi,
>>        That will improve things, but it potentially still has to skip
>> through a lot of records to return the requested limit of records. Say you
>> have a very prolific user and one who is less active. The prolific guy has
>> 1000 messages for every one the laid back guy gets. If you're filtering by
>> user you're going to have to go through ~10000 records to return the less
>> active user 10 documents (assuming I'm understanding _changes right...). The
>> problem is not getting the 10 documents out but skipping the 10000 documents
>> that don't match the filter.
>>
>> If the records are split across databases (one per user) you'll only hit the
>> relevant changes to your user. Of course that might not be possible for your
>> use case....
>> Cheers
>> Simon
>>
>>
>> On 20 Oct 2010, at 14:31, mickael.bailly@free.fr wrote:
>>
>>> Hello,
>>>
>>> Perhaps you can use the limit option together with the since option to
>>> retrieve the changes feed. That way, whether the application is doing a
>>> first-time initialization or starting from the last known sync, your code
>>> will always be the same:
>>> 1/ retrieve N changes starting from sequence S (S = 0 the very first
>>> time)
>>> 2/ are there N changes in the couchdb response? If so, repeat 1, setting S
>>> to the last seq number of the couchdb response.
>>>
>>> Regards,
>>>
>>> Mickael
>>>
>>> ----- Original Message -----
>>> From: "Cory Zue" <czue@dimagi.com>
>>> To: "user" <user@couchdb.apache.org>
>>> Sent: Wednesday 20 October 2010 14:06:59 GMT +01:00 Amsterdam / Berlin /
>>> Bern / Rome / Stockholm / Vienna
>>> Subject: Re: Slow filtered _changes feed
>>>
>>> Thanks Simon,
>>>
>>> On Wed, Oct 20, 2010 at 6:43 AM, Simon Metson
>>> <simonmetson@googlemail.com> wrote:
>>>>
>>>> Hi,
>>>>       One thought: do you query the last N changes or the whole feed? If I
>>>> apply a simple filter to my test database it is slow to get the full result,
>>>> but it's relatively fast to skip to the more recent changes (past the ones
>>>> I've already consumed) and go from there.
>>>
>>> When the phones sync they provide a token containing information about
>>> the last known sync, which we use to skip ahead in the changes feed.
>>> However, on a first time initialization of the phone it processes the
>>> entire feed, and if this doesn't succeed then subsequent ones likely
>>> won't either.
>>>
>>>> Also, the _changes feed streams the documents down to the client - can
>>>> your client/server deal with a streaming response?
>>>
>>> It doesn't yet, although we could theoretically add this.
>>>
>>> My latest plan is to basically save a copy of all the relevant
>>> information in a couch doc after every attempted sync.  In this case
>>> the first operation would still time out, but if the client retried all
>>> the relevant doc ids could be retrieved from that document and only a
>>> small update would have to be applied.  Since this is only likely a
>>> problem when it is a long time between syncs I think it could work ok
>>> (definitely not ideal, though).  Does this seem sane?
>>>
>>>> Cheers
>>>> Simon
>>>>
>>>> On 20 Oct 2010, at 03:44, Cory Zue wrote:
>>>>
>>>>> Howdy,
>>>>>
>>>>> I'm bringing up a problem I chatted about with a few folks on IRC
>>>>> today but was unable to solve.  My app is using the _changes feed to
>>>>> detect what updates need to go to particular clients (in this case
>>>>> cell phones) based on some filtered information the phones send up in
>>>>> the sync request.  The flow looks something like:
>>>>>
>>>>> Phone ---HTTP POST---> Server
>>>>> Server ---filtered _changes---> CouchDB
>>>>> [Server prepares couch results for phone]
>>>>> Server ---Data Payload---> Phone
>>>>>
>>>>> All of the above represents a single HTTP POST and response between
>>>>> the phone and server.
>>>>>
>>>>> The problem I am seeing is that hitting the _changes feed from the
>>>>> server is prohibitively slow, and these requests are timing out before
>>>>> the server can send data back down to the phone.
>>>>>
>>>>> I was led to believe on IRC that changing my filter from javascript to
>>>>> erlang would drastically increase performance, but I'm not observing
>>>>> this at all.  In fact the erlang version seems slightly slower.
>>>>>
>>>>> I set up an erlang view server following these instructions:
>>>>> http://wiki.apache.org/couchdb/EnableErlangViews
>>>>>
>>>>> Am I missing something?  Is my erlang so bad as to negate the
>>>>> increased performance gain from switching over?  Was I lied to?  Is my
>>>>> whole approach dumb and do I need to implement filtered caching inside
>>>>> my server and outside of couch?
>>>>>
>>>>> Any thoughts or feedback would be much appreciated.
>>>>>
>>>>> thanks,
>>>>> Cory
>>>>
>>>>
>>
>>
>
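
The limit/since loop Mickael describes in the quoted thread could look
roughly like this in Python. It is only a sketch: it assumes the usual
_changes response shape (a "results" list plus a "last_seq" field), and
fetch_changes and its parameters are hypothetical names, not an existing
API. As Simon points out, with a filter each page may still scan many
non-matching rows on the CouchDB side; paging just keeps every single
request short.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_get_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url) as resp:
        return json.load(resp)


def fetch_changes(db_url, since=0, limit=100, filter_name=None,
                  get_json=http_get_json):
    """Collect changes after `since`, one page of `limit` rows at a time.

    Returns (rows, last_seq); last_seq can be stored as the token for
    the next sync.
    """
    rows = []
    while True:
        params = {"since": since, "limit": limit}
        if filter_name:
            params["filter"] = filter_name
        page = get_json(f"{db_url}/_changes?{urlencode(params)}")
        rows.extend(page["results"])
        # Step 2: continue from the last seq number of this response.
        since = page["last_seq"]
        # A short page means there is nothing left to fetch.
        if len(page["results"]) < limit:
            return rows, since
```

The same function covers both cases: since=0 on first-time initialization,
or since set to the value saved from the previous sync.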
