couchdb-user mailing list archives

From Mario Scheliga <ma...@sourcegarden.de>
Subject Re: Chaining of views/MapReduce
Date Thu, 18 Feb 2010 10:34:43 GMT
Hi Norman,

Ahh, the light is shining bright now. So the result of Map/Phase-1 is
the input of Map/Phase-2? I guess this would fit into CouchDB, and it's
a great feature, but where would the intermediate results be stored in
the B-tree for caching purposes?
Perhaps there would have to be another kind of query server to solve
this? Because the intermediate results don't have to be cached, right?
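
To make the chaining idea concrete, here is a minimal sketch in plain JavaScript. It is not CouchDB's view API: it only simulates, in memory, feeding the result of phase 1 into the map of phase 2. The documents, the `tags` field, and the function names (`mapTags`, `reduceCounts`, `mapSharedTags`, `runChain`) are all made up for illustration, and nothing here caches the intermediate result.

```javascript
// Hypothetical in-memory documents standing in for database docs.
const docs = [
  { _id: "a", tags: ["couchdb", "view"] },
  { _id: "b", tags: ["view", "hadoop"] },
];

// Phase 1 map: emit one [tag, 1] row per tag of a document.
function mapTags(doc) {
  return doc.tags.map((tag) => [tag, 1]);
}

// Phase 1 reduce: sum the emitted values per tag.
function reduceCounts(rows) {
  const counts = new Map();
  for (const [tag, n] of rows) {
    counts.set(tag, (counts.get(tag) || 0) + n);
  }
  return counts;
}

// Phase 2 map: sees each document plus the phase 1 result and emits
// [docId, tag] for every tag that occurs in more than one document.
function mapSharedTags(doc, counts) {
  return doc.tags
    .filter((tag) => counts.get(tag) > 1)
    .map((tag) => [doc._id, tag]);
}

// Chain the two phases: the output of phase 1 is an input of phase 2.
function runChain(allDocs) {
  const phase1 = reduceCounts(allDocs.flatMap(mapTags));
  return allDocs.flatMap((doc) => mapSharedTags(doc, phase1));
}

console.log(JSON.stringify(runChain(docs)));
```

In this sketch the phase 1 result lives only in a local variable; where a database would keep it between runs is the open question above.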

cheers,
mario

On 18.02.2010, at 11:20, Norman Rosner wrote:

> Hi Mario,
>
> you're probably right about the map phase being read-only and all
> that. Hovercraft doesn't look right to me, though. I found another
> thread on the mailing list
> (http://markmail.org/thread/ne2ghnwpojxkhalj) that covers a similar
> topic, I guess.
>
> In Hadoop you can chain jobs (map/reduce phases). So you can take  
> the output of the first job as the input of the second job and so  
> on. But Hadoop is based on a distributed filesystem, so the results  
> are merged together into one location after a job is done, so you  
> don't have to think about the thousand servers ;)
>
> I guess I'll write a workaround in Java to pipe in all the rows
> of the view, extract my stuff, cache it somehow and write it back as
> a new document to the database. After that I could check the rows of
> the view against the newly created doc.
>
> Cheers,
> norman
> On 18.02.2010, at 09:01, Mario Scheliga wrote:
>
>> Hi Norman,
>>
>> I think it's obvious that this won't be possible with CouchDB itself,
>> but I think hovercraft by jchris can do that for you.
>>
>> http://github.com/jchris/hovercraft
>>
>> Otherwise you have to implement the second check on the client side,
>> after the re-reduce, because map functions only read data and create
>> new data (in the map output only).
>> Writing docs is left to other processes (PUT docs or update).
>>
>> Could you explain to me how you do this in Hadoop?
>>
>> greetz
>> mario
>>
>> On 17.02.2010, at 23:29, Norman Rosner wrote:
>>
>>>
>>> On 17.02.2010, at 23:15, Mario Scheliga wrote:
>>>
>>>> Hi Norman,
>>>>
>>>> updating a document from a map function is not possible and seems
>>>> to be the wrong approach.
>>>> Think of the map function as processing docs separately (sandboxed),
>>>> so that you are able to
>>>> spread the execution over thousands of servers ;-)
>>>
>>> True that! But: suppose I'm just creating/updating one document
>>> per CouchDB instance, that should be ok, right? Because after
>>> that, I can easily get all the result documents and merge them
>>> together. I would do it in a similar way in Hadoop. And as far as
>>> I read in the loooong archives of this list, I'm not the only one  
>>> who wants to do such things.
>>>
>>> cheers,
>>> norman
>>>
>>>>
>>>> greez
>>>> mario
>>>>
>>>> On 17.02.2010, at 21:04, Norman Rosner wrote:
>>>>
>>>>> Hi folks,
>>>>>
>>>>> first, I have to admit that I'm kinda new to JavaScript and
>>>>> of course to CouchDB. Second, I'm just reusing the subject, so I
>>>>> hope it also pops up if anybody searches for it.
>>>>>
>>>>> As far as I've read, chaining of views is not possible yet, but
>>>>> it's been mentioned a couple of times on the mailing list. So
>>>>> here's what I want to do:
>>>>>
>>>>> 1. Create a list of unique labels/tags/whatever across all of
>>>>> the documents (e.g. all nouns that are in the documents)
>>>>> 2. Extract all labels/tags/nouns of each document and check them
>>>>> against the previously calculated result in some way
>>>>>
>>>>> For the second point I created a view which works, except for the
>>>>> check against the result from point 1. Now I'm trying to
>>>>> solve point 1.
>>>>> And here my questions begin: How can I create/update a document
>>>>> from inside a map function? As I see it, I'll have to
>>>>> make an HTTP GET to load the document in each iteration. I found
>>>>> some HTTP stuff in the test.js in the test folder, but I'm not
>>>>> quite sure how to use it or whether it's the right way of thinking.
>>>>> Is there any way of using global variables throughout
>>>>> 'couchapps' (e.g. through the lib folder and the like)?
>>>>>
>>>>> Any help from you CouchDB kings would be greatly appreciated!
>>>>>
>>>>> Cheers,
>>>>>
>>>>> norman
>>>>
>>>>
>>>> --
>>>> Sourcegarden GmbH HR: B-104357
>>>> Steuernummer: 37/167/21214 USt-ID: DE814784953
>>>> Geschaeftsfuehrer: Mario Scheliga, Rene Otto
>>>> Bank: Deutsche Bank, BLZ: 10070024, KTO: 0810929
>>>> Schoenhauser Allee 51, 10437 Berlin
>>>>
>>>
>>
>



