incubator-couchdb-user mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: Iterating all documents
Date Thu, 07 Jul 2011 10:28:28 GMT
I suggest it for two reasons:

1) It's faster (it's much closer to on-disk order)
2) It's resumable (just remember the last 'seq' value you saw).
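
A minimal sketch of that resumable loop, assuming the standard `since`, `limit` and `include_docs` parameters of `_changes`. The names `iter_changes` and `fetch_json`, and the injectable `fetch` argument (there so the loop can be exercised without a server), are hypothetical and not from this thread. The `seq` value is passed through untouched, so it does not matter whether the server reports it as an integer or as an opaque string.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_json(url):
    """GET a URL and decode the JSON response body."""
    with urlopen(url) as resp:
        return json.load(resp)


def iter_changes(db_url, since=0, limit=100, fetch=fetch_json):
    """Yield (seq, doc) pairs from _changes in batches, starting after `since`."""
    while True:
        query = urlencode({"since": since, "limit": limit, "include_docs": "true"})
        batch = fetch(f"{db_url}/_changes?{query}")
        if not batch["results"]:
            return  # caught up: nothing newer than `since`
        for row in batch["results"]:
            if not row.get("deleted"):  # skip tombstones of deleted docs
                yield row["seq"], row["doc"]
        # Resume point for the next request.
        since = batch["last_seq"]
```

To resume after a restart, store the last `seq` you finished with and pass it back in, e.g. `iter_changes("http://server:5984/dbname", since=saved_seq)`.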

An update made while you iterate through _all_docs will simply be missed,
so you'll have to loop until there are no docs of the old style left.
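
For comparison, a rough sketch of that _all_docs loop, using the `limit` / `startkey` / `skip=1` paging quoted further down. The function names, the `is_old_style` and `upgrade` callbacks and the injectable `fetch` argument are all hypothetical. Note that `startkey` has to be JSON-encoded, and that `_all_docs` also returns design documents, so `is_old_style` should ignore those.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_json(url):
    """GET a URL and decode the JSON response body."""
    with urlopen(url) as resp:
        return json.load(resp)


def iter_all_docs(db_url, limit=100, fetch=fetch_json):
    """Yield each doc from _all_docs, paging with startkey + skip=1."""
    params = {"limit": limit, "include_docs": "true"}
    while True:
        rows = fetch(f"{db_url}/_all_docs?{urlencode(params)}")["rows"]
        if not rows:
            return
        for row in rows:
            yield row["doc"]
        # Next page starts at the last id seen; skip=1 drops that row itself.
        params["startkey"] = json.dumps(rows[-1]["id"])
        params["skip"] = 1


def upgrade_until_clean(db_url, is_old_style, upgrade, fetch=fetch_json):
    """Run full passes until one finds no old-style docs; return the pass count."""
    passes = 0
    while True:
        passes += 1
        found = 0
        for doc in iter_all_docs(db_url, fetch=fetch):
            if is_old_style(doc):
                upgrade(doc)  # caller saves the upgraded doc back to the db
                found += 1
        if found == 0:
            return passes
```

The extra pass at the end is the cost of this approach: you only know you are done once a whole pass comes back clean.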

B.

On 7 July 2011 11:23, Matt Goodall <matt.goodall@gmail.com> wrote:
> On 7 July 2011 11:17, Robert Newson <rnewson@apache.org> wrote:
>
>> It would be better to read through _changes than _all_docs, surely? :)
>>
>
> Yep, I nearly always use the _changes feed for upgrades.
>
> As long as you make each document upgrade idempotent then it doesn't matter
> if it fails (sometimes schema-less docs are a little more schema-less than
> you expect ;-)) or you need to restart it.
>
> - Matt
>
>
>>
>> B.
>>
>> On 7 July 2011 10:03, Dan Sheedy <sheedydan@gmail.com> wrote:
>> > oh.
>> >
>> > Sorry for the thread hijack. Good haircut btw, no regrets.
>> >
>> > On Thu, Jul 7, 2011 at 5:28 PM, Max Ogden <max@maxogden.com> wrote:
>> >
>> >> you should reconsider the haircut
>> >>
>> >> On Thu, Jul 7, 2011 at 12:25 AM, Dan Sheedy <sheedydan@gmail.com> wrote:
>> >>
>> >> > im heading off. might be late dinner. I'm getting my haircut. should be
>> >> > home 8ish.
>> >> >
>> >> > On Thu, Jul 7, 2011 at 10:53 AM, Patrick Barnes <mrtrick@gmail.com> wrote:
>> >> >
>> >> > > Alternatively, how about you page through _all_docs?
>> >> > >
>> >> > > 1. Query http://server:5984/dbname/_all_docs?limit=100
>> >> > >
>> >> > > 2. Process that set. Store the id of the last document.
>> >> > >
>> >> > > 3. Query
>> >> > > http://server:5984/dbname/_all_docs?limit=100&startkey=(last_id)&skip=1
>> >> > > to get the next set.
>> >> > >
>> >> > > 4. Repeat 2 and 3 until the returned set is empty.
>> >> > >
>> >> > > If your batch processing has to be able to resume after being terminated,
>> >> > > just store the last_id in a file between each set.
>> >> > >
>> >> > > If your documents come from a specific view, you can do that too; the
>> >> > > only difference would be that 'startkey' needs to be the last record's
>> >> > > view key, and you may also need a 'startkey_docid=(last_id)' parameter
>> >> > > if the keys are not unique.
>> >> > >
>> >> > > -Patrick
>> >> > >
>> >> > >
>> >> > > On 7/07/2011 9:41 AM, Matthias Eck wrote:
>> >> > >
>> >> > >> Hello,
>> >> > >>
>> >> > >> I need to add a new field to all documents in my database.
>> >> > >>
>> >> > >> To have better control I wanted to do this in batches and defined 2
>> >> > >> views:
>> >> > >> documents_without_newfield
>> >> > >> documents_with_newfield
>> >> > >>
>> >> > >> My idea was to just take the first 100 returned by the
>> >> > >> documents_without_newfield view, calculate the new field for all of
>> >> > >> them, save them and take the next 100 etc.
>> >> > >>
>> >> > >> As it turns out the views do not seem to be updated immediately, which
>> >> > >> means that the view documents_without_newfield returns about 95
>> >> > >> documents that actually already had the new field calculated in the
>> >> > >> previous step.
>> >> > >>
>> >> > >> Can I force the view to update immediately so I can iterate through
>> >> > >> all the documents?
>> >> > >>
>> >> > >> Thanks,
>> >> > >> Matthias
>> >> > >>
>> >> > >>
>> >> >
>> >>
>> >
>>
>
