incubator-couchdb-user mailing list archives

From Mario Scheliga <ma...@sourcegarden.de>
Subject Re: Chaining of views/MapReduce
Date Fri, 19 Feb 2010 12:42:31 GMT
Ah, okay.
Why do you have to keep track of the revision?
It's possible to create/update documents without a revision check.
Would that help?
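
To make the client-side idea concrete, here is a minimal sketch, not a tested implementation. It assumes a view that returns one row per label with a string key (for example a grouped reduce), and it avoids carrying the revision around by looking it up right before the write. The database URL, view path, document id and the "type"/"labels" fields are all made up for illustration; only the plain CouchDB GET/PUT document calls are real.

```python
import json
import urllib.error
import urllib.request


def unique_labels(view_rows):
    """Collect the distinct (string) keys from view rows like {"key": ..., "value": ...}."""
    return sorted({row["key"] for row in view_rows})


def build_summary_doc(labels, current_rev=None):
    """Build the result document; include _rev only when updating an existing doc."""
    doc = {"type": "label_summary", "labels": labels}  # hypothetical field names
    if current_rev is not None:
        doc["_rev"] = current_rev
    return doc


def fetch_json(url):
    """GET a URL and decode the JSON response body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def lookup_rev(doc_url):
    """Return the current _rev of the doc, or None if it does not exist yet."""
    try:
        return fetch_json(doc_url)["_rev"]
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def put_json(url, body):
    """PUT a JSON body and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def chain_once(db_url, view_path, summary_id):
    """Read the view, reduce it to unique labels, write them back as one doc."""
    rows = fetch_json(f"{db_url}/{view_path}")["rows"]
    doc_url = f"{db_url}/{summary_id}"
    doc = build_summary_doc(unique_labels(rows), lookup_rev(doc_url))
    return put_json(doc_url, doc)
```

You would call it with something like chain_once("http://localhost:5984/mydb", "_design/labels/_view/all?group=true", "label_summary"), where all three values are placeholders. If another writer changes the doc between the lookup and the PUT, CouchDB answers with a conflict and the call has to be retried.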

greetz
mario

On 19.02.2010, at 13:25, Norman Rosner wrote:

> Mario!
>
> I read that thread as well. But you still need to be able to make
> HTTP calls inside the map or reduce function. Since my JavaScript
> skills are limited ;) I'm not able to write something like that (and
> I don't have the time to learn it in depth right now). I found a JS
> HTTP wrapper in the test.js file from the CouchDB test suite, but I
> haven't been able to use it (<- limited JS skills ;) ).
>
> In my opinion the map function is read-only with respect to the
> document it handles, which totally makes sense. But you should be
> able to create new documents, and it should be OK to write those
> documents back to the original db. (http://bit.ly/dvnnJH)
> But like I said, you need to make HTTP requests inside the
> map/reduce function. And in my situation you have to keep track of
> the revision of the document, so you need to come up with some
> variable that's accessible from all the map calls...
>
> Cheers,
>
> norman
>
> On 19.02.2010, at 11:31, Mario Scheliga wrote:
>
>> Hi Norman,
>>
>> I found a post by P. Davis that describes one possibility:
>>
>> http://mail-archives.apache.org/mod_mbox/couchdb-user/200910.mbox/%3Ce2111bbb0910280610q37de08ack316c628a4a48cbcc@mail.gmail.com%3E
>>
>> Chaining could be done by some server-side logic, where the result
>> is stored in a separate db.
>> What do you think?
>>
>> greetz
>> mario
>>
>> On 18.02.2010, at 11:20, Norman Rosner wrote:
>>
>>> Hi Mario,
>>>
>>> you're probably right about the map phase being read-only and all
>>> that. Hovercraft doesn't look right to me, though. I found another
>>> thread on the mailing list
>>> (http://markmail.org/thread/ne2ghnwpojxkhalj) that handles a
>>> similar topic, I guess.
>>>
>>> In Hadoop you can chain jobs (map/reduce phases). So you can take  
>>> the output of the first job as the input of the second job and so  
>>> on. But Hadoop is based on a distributed filesystem, so the  
>>> results are merged together into one location after a job is done,  
>>> so you don't have to think about the thousand servers ;)
>>>
>>> I guess I'll write a workaround in Java to pipe in all the
>>> rows of the view, extract my stuff, cache it somehow and write it
>>> back as a new document to the database. After that I could check
>>> the rows of my view against the newly created doc.
>>>
>>> Cheers,
>>> norman
>>> On 18.02.2010, at 09:01, Mario Scheliga wrote:
>>>
>>>> Hi Norman,
>>>>
>>>> I think it's obvious that this won't be possible with CouchDB
>>>> itself, but I think hovercraft by jchris can do that for you.
>>>>
>>>> http://github.com/jchris/hovercraft
>>>>
>>>> Otherwise you have to implement the second check after the
>>>> re-reduce on the client side, because the map function only reads
>>>> data and creates new entries (in the map output only).
>>>> Writing docs is left to other processes (putting or updating docs).
>>>>
>>>> Could you explain to me how you do this in Hadoop?
>>>>
>>>> greetz
>>>> mario
>>>>
>>>> On 17.02.2010, at 23:29, Norman Rosner wrote:
>>>>
>>>>>
>>>>> On 17.02.2010, at 23:15, Mario Scheliga wrote:
>>>>>
>>>>>> Hi Norman,
>>>>>>
>>>>>> updating a document from a map function is not possible and
>>>>>> seems to be the wrong approach.
>>>>>> Think of the map function as processing each doc separately
>>>>>> (sandboxed), so that you are able to spread the execution over
>>>>>> thousands of servers ;-)
>>>>>
>>>>> True that! But suppose I'm just creating/updating one document
>>>>> per CouchDB instance; that should be OK, right? Because after
>>>>> that, I can easily get all the result documents and merge them
>>>>> together. I would do it in a similar way in Hadoop. And as far
>>>>> as I read in the loooong archives of this list, I'm not the only
>>>>> one who wants to do such things.
>>>>>
>>>>> cheers,
>>>>> norman
>>>>>
>>>>>>
>>>>>> greez
>>>>>> mario
>>>>>>
>>>>>> On 17.02.2010, at 21:04, Norman Rosner wrote:
>>>>>>
>>>>>>> Hi folks,
>>>>>>>
>>>>>>> first, I'll have to admit that I'm kinda new to JavaScript and
>>>>>>> of course to CouchDB. Second, I'm just reusing the subject so I
>>>>>>> hope it also pops up if anybody searches for it.
>>>>>>>
>>>>>>> As I read, chaining of views is not possible yet, but it's been
>>>>>>> mentioned a couple of times on the mailing list. So here's what
>>>>>>> I want to do:
>>>>>>>
>>>>>>> 1. Create a list of unique labels/tags/whatever across all of
>>>>>>> the documents (e.g. all nouns that are in the documents)
>>>>>>> 2. Extract all labels/tags/nouns of each document and check
>>>>>>> them against the previously calculated result in some way
>>>>>>>
>>>>>>> For the second point I created a view which works, except for
>>>>>>> the checking against the result from point 1. Now I'm trying
>>>>>>> to solve point 1.
>>>>>>> And here my questions begin: How can I create/update a
>>>>>>> document from inside a map function? As I think of it, I'll
>>>>>>> have to make an HTTP GET to load the document in each
>>>>>>> iteration. I found some HTTP stuff in test.js in the test
>>>>>>> folder, but I'm not quite sure how to use it, or whether it's
>>>>>>> the right way of thinking. Is there any way of using global
>>>>>>> variables throughout 'couchapps' (e.g. through the lib folder
>>>>>>> and the like)?
>>>>>>>
>>>>>>> Any help from you CouchDB kings would be greatly appreciated!
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> norman
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sourcegarden GmbH HR: B-104357
>>>>>> Steuernummer: 37/167/21214 USt-ID: DE814784953
>>>>>> Geschaeftsfuehrer: Mario Scheliga, Rene Otto
>>>>>> Bank: Deutsche Bank, BLZ: 10070024, KTO: 0810929
>>>>>> Schoenhauser Allee 51, 10437 Berlin
>>>>>>
>>>>>
>>>>
>>>
>>
>



