couchdb-user mailing list archives

From Norman Rosner <normanros...@googlemail.com>
Subject Re: Chaining of views/MapReduce
Date Fri, 19 Feb 2010 12:25:30 GMT
Mario!

I read that thread as well. But you still need to be able to make HTTP calls inside of the
map or reduce function. Since my JavaScript skills are limited ;) I'm not able to write something
like that (and I don't have the time to learn it in depth right now). I found
a JS HTTP wrapper in the test.js file from the CouchDB test suite, but I haven't
been able to use it (<- limited JS skills ;) )

In my opinion the map function is read-only with respect to the document it handles, which totally makes
sense. But you should be able to create new documents, and it should be OK to write those documents
back to the original db. (http://bit.ly/dvnnJH)
But like I said, you need to make HTTP requests inside of the map/reduce function. And in
my situation you have to keep track of the revision of the document, so you need to come up
with some variable that's accessible across all the map calls...
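
A minimal sketch of the revision bookkeeping described above, assuming it is done by an external client process rather than inside the map function (the thread itself concludes that map functions cannot write documents). The helper name `buildUpsertBody` and the document id `unique-tags` are hypothetical; the only CouchDB rule relied on is that an update to an existing document must carry that document's current `_rev`.

```javascript
// Hypothetical helper: prepare the JSON body for a PUT to /{db}/{docid}.
// `existing` is the document as last fetched, or null if it does not exist yet.
function buildUpsertBody(docId, fields, existing) {
  var body = { _id: docId };
  Object.keys(fields).forEach(function (key) {
    body[key] = fields[key];
  });
  // An update without the current _rev is rejected with 409 Conflict,
  // so carry the revision over whenever a stored document is known.
  if (existing && existing._rev) {
    body._rev = existing._rev;
  }
  return body;
}

// First write: no stored document yet, so no _rev is included.
var created = buildUpsertBody("unique-tags", { tags: ["couch", "view"] }, null);

// Later write: reuse the _rev from the most recent GET or PUT response.
// The revision string "1-abc" is made up for illustration.
var updated = buildUpsertBody(
  "unique-tags",
  { tags: ["couch", "map", "view"] },
  { _id: "unique-tags", _rev: "1-abc", tags: ["couch", "view"] }
);
```

Keeping the last seen `_rev` in the client is one way to get the "shared variable" without needing any state inside the map function.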

Cheers,

norman

On 19.02.2010, at 11:31, Mario Scheliga wrote:

> Hi Norman,
> 
> I found a post by P. Davis that describes one possibility:
> 
> http://mail-archives.apache.org/mod_mbox/couchdb-user/200910.mbox/%3Ce2111bbb0910280610q37de08ack316c628a4a48cbcc@mail.gmail.com%3E
> 
> Chaining could be done by some server-side logic, where the result is stored in a separate
> db.
> what do you think?
> 
> greetz
> mario
> 
> On 18.02.2010, at 11:20, Norman Rosner wrote:
> 
>> Hi Mario,
>> 
>> you're probably right about the map phase being read-only and all that. Hovercraft
>> doesn't look right to me though. I found another thread on the mailing list
>> (http://markmail.org/thread/ne2ghnwpojxkhalj) that handles a similar topic, I guess.
>> 
>> In Hadoop you can chain jobs (map/reduce phases). So you can take the output of the
>> first job as the input of the second job and so on. But Hadoop is based on a distributed filesystem,
>> so the results are merged together into one location after a job is done, so you don't have
>> to think about the thousand servers ;)
>> 
>> I guess I will write a workaround in Java that pipes in all the rows of the view,
>> extracts my stuff, caches it somehow and writes it back as a new document to the database. After
>> that I could check the rows of my view against the newly created doc.
>> 
>> Cheers,
>> norman
>> On 18.02.2010, at 09:01, Mario Scheliga wrote:
>> 
>>> Hi Norman,
>>> 
>>> I think it's obvious that this won't be possible with CouchDB itself, but I think
>>> hovercraft by jchris can do that for you.
>>> 
>>> http://github.com/jchris/hovercraft
>>> 
>>> Otherwise you have to implement the second check after the re-reduce on the client
>>> side,
>>> because a map function only reads data and creates new data in the map output.
>>> Writing docs is left to other processes (PUT docs or update).
>>> 
>>> Could you explain to me how you do this in Hadoop?
>>> 
>>> greetz
>>> mario
>>> 
>>> On 17.02.2010, at 23:29, Norman Rosner wrote:
>>> 
>>>> 
>>>> On 17.02.2010, at 23:15, Mario Scheliga wrote:
>>>> 
>>>>> Hi Norman,
>>>>> 
>>>>> Updating a document from a map function is not possible and seems to be
>>>>> the wrong way.
>>>>> Think of the map function as processing each doc separately (sandboxed), so that you
>>>>> are able to
>>>>> spread the execution over thousands of servers ;-)
>>>> 
>>>> True that! But suppose I'm just creating/updating one document per CouchDB instance;
>>>> that should be OK, right? Because after that, I can easily get all the result documents and
>>>> merge them together. I would do it in a similar way in Hadoop. And as far as I read in the
>>>> loooong archives of this list, I'm not the only one who wants to do such things.
>>>> 
>>>> cheers,
>>>> norman
>>>> 
>>>>> 
>>>>> greez
>>>>> mario
>>>>> 
>>>>> On 17.02.2010, at 21:04, Norman Rosner wrote:
>>>>> 
>>>>>> Hi folks,
>>>>>> 
>>>>>> First, I have to admit that I'm kinda new to JavaScript and of
>>>>>> course to CouchDB. Second, I'm just reusing the subject, so I hope it also pops up if anybody searches
>>>>>> for it.
>>>>>> 
>>>>>> As far as I have read, chaining of views is not possible yet, but it has been mentioned
>>>>>> a couple of times on the mailing list. So here's what I want to do:
>>>>>> 
>>>>>> 1. Create a list of unique labels/tags/whatever across all of the
>>>>>>    documents (e.g. all nouns that are in the documents)
>>>>>> 2. Extract all labels/tags/nouns of each document and check them
>>>>>>    against the previously calculated result in some kind of way
>>>>>> 
>>>>>> For the second point I created a view which works, except for the checking
>>>>>> against the result from point 1. Now I'm trying to solve point 1.
>>>>>> And here my questions begin: How can I create/update a document from
>>>>>> inside of a map function? As I think of it, I'll have to make an HTTP GET to load the document
>>>>>> in each iteration. I found some HTTP stuff in the test.js in the test folder, but I'm not
>>>>>> quite sure how to use it, or whether it's the right way of thinking. Is there any way of using
>>>>>> global variables throughout 'couchapps' (e.g. through the lib folder and the like)?
>>>>>> 
>>>>>> Any help from you CouchDB kings would be greatly appreciated!
>>>>>> 
>>>>>> Cheers,
>>>>>> 
>>>>>> norman
>>>>> 
>>>>> 
>>>>> --
>>>>> Sourcegarden GmbH HR: B-104357
>>>>> Tax number: 37/167/21214 VAT ID: DE814784953
>>>>> Managing directors: Mario Scheliga, Rene Otto
>>>>> Bank: Deutsche Bank, bank code (BLZ): 10070024, account (KTO): 0810929
>>>>> Schoenhauser Allee 51, 10437 Berlin
>>>>> 
>>>> 
>>> 
>>> 
>> 
> 
> 
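
The two steps from the original question at the bottom of the thread can be sketched as two chained map phases driven from outside the database. This is only an in-memory illustration: `runMap`, `groupCount` and `makeSecondMap` are hypothetical helpers, not CouchDB APIs, and `emit` is passed as an argument here, whereas a real CouchDB map function calls a global `emit`. The concrete check in stage 2 (keep the tags that occur in only one document) and the `tags` field are assumptions, since the thread leaves both open.

```javascript
// Tiny stand-in for a view run; NOT CouchDB's actual view engine.
function runMap(docs, mapFn) {
  var rows = [];
  docs.forEach(function (doc) {
    mapFn(doc, function emit(key, value) {
      rows.push({ id: doc._id, key: key, value: value });
    });
  });
  return rows;
}

// Stand-in for a summing reduce queried per key (like group=true).
function groupCount(rows) {
  var counts = Object.create(null); // no inherited keys such as "constructor"
  rows.forEach(function (row) {
    counts[row.key] = (counts[row.key] || 0) + row.value;
  });
  return counts;
}

// Stage 1: one row per tag occurrence; the grouped keys are the unique tags.
function tagMap(doc, emit) {
  (doc.tags || []).forEach(function (tag) {
    emit(tag, 1);
  });
}

// Stage 2: a map that closes over the stage 1 result instead of fetching it.
function makeSecondMap(counts) {
  return function (doc, emit) {
    (doc.tags || []).forEach(function (tag) {
      if (counts[tag] === 1) {
        emit(doc._id, tag); // assumed check: tag appears in this document only
      }
    });
  };
}

var docs = [
  { _id: "a", tags: ["couch", "view"] },
  { _id: "b", tags: ["couch", "hadoop"] }
];

var tagCounts = groupCount(runMap(docs, tagMap));
var uniqueRows = runMap(docs, makeSecondMap(tagCounts));
```

In a real setup the stage 1 result would come from querying the first view and, as suggested in the thread, be stored in a separate database before the second pass runs.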

