incubator-couchdb-user mailing list archives

From Stanley Iriele <siriele...@gmail.com>
Subject Re: Timeout using Erlang views with large documents
Date Thu, 19 Dec 2013 10:43:59 GMT
Haha, no worries. But just because the browser does it that way doesn't
mean you have to. You can even use an update handler to isolate your
logic instead of simply inserting the document, e.g. injecting a type
field and whatnot.

I've been using CouchDB for a while. There has to be a way for that data
to be itemized in a more usable way. It's OK for you, the database admin,
to not know the entire structure, but you should expect some behavior from
the data you store. Basically it's an argument about implicit schemas vs.
explicit ones. I digress though. When I said joins I meant: take a bunch of
docs, traverse them, merge them together into one giant JSON, and send
it back in chunks. That should work just fine. Moral of the story:
processing shouldn't take that long. Are you sure your map
function isn't doing something crazy?

Can you paste that if it's not private?

Regards
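
A minimal sketch of the "merge small docs and send them back in chunks" idea described above. It is written in plain Python purely for illustration and is not a CouchDB list function; the helper names (merge_docs, iter_chunks), the sample documents, and the chunk size are all assumptions made for the example.

```python
import json
from typing import Dict, Iterable, Iterator


def merge_docs(docs: Iterable[Dict]) -> Dict:
    """Merge small documents into one dict keyed by each document's _id."""
    merged = {}
    for doc in docs:
        merged[doc["_id"]] = doc
    return merged


def iter_chunks(payload: str, size: int) -> Iterator[str]:
    """Yield successive pieces of payload, each at most size characters."""
    for start in range(0, len(payload), size):
        yield payload[start:start + size]


# Hypothetical small documents standing in for pieces of one huge record.
small_docs = [
    {"_id": "visit-1", "type": "visit", "site": "example.org"},
    {"_id": "visit-2", "type": "visit", "site": "example.com"},
]

body = json.dumps(merge_docs(small_docs))
chunks = list(iter_chunks(body, 32))
```

Joining the chunks on the receiving side gives back the full merged JSON, so no single step has to hold or process one enormous document at once.
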
On Dec 19, 2013 2:32 AM, "david martin" <david.martin@lymegreen.co.uk>
wrote:

> On 19/12/13 10:08, Stanley Iriele wrote:
>
>> Ahh yes, you're right. Hmm, it is possible that they place a guard on
>> the function and call it using the default value or something like
>> that. Also, why is your doc so huge? At that point it's easier to
>> manage it in small docs and join them all in some view/list function.
>>
>
> In the scenario I am hypothetically  looking at, you (the database creator
> and administrator)
> are not in control of the structure of the document, it may be of any form
> as long as it is JSON.
>
> I first came upon this analysing the output of Collusion (an analytic
> program that tracks all browsing behaviour /in the browser/).
> This JSON record becomes very large very quickly.
>
> So if the doc is big, it is big. I do not wish to pre-process it. If the
> browser can handle documents of this size surely CouchDB
> can manipulate them in its native language without being throttled
> arbitrarily.
>
> I also feel that the word join is not really what SQL-less is about.
>
> These are only opinions, and I am often wrong so Your Mileage May Vary
> (ymmv).
>
>  On Dec 19, 2013 2:01 AM, "david martin" <david.martin@lymegreen.co.uk>
>> wrote:
>>
>>  On 19/12/13 03:28, Stanley Iriele wrote:
>>>
>>>> Why did you place quotes around your timeout? It's just the value, no
>>>> quotes.
>>>>
>>> The 'value' of the timeout "50000000000000" is merely the form or
>>> representation that is reported in the browser by curl.
>>>
>>> In Futon configuration it is also represented in the same way.
>>>
>>> In the Erlang core of CouchDB this string may be used and/or stored in
>>> ETS in a number of ways:
>>>
>>> an atom, using list_to_atom/1,
>>> a list (Erlang strings are lists),
>>> a binary, using list_to_binary/1,
>>> a 'value', using list_to_integer/1 or list_to_float/1,
>>> a term, using binary_to_term/1.
>>>
>>> So what you see is not necessarily what you get inside the core.
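
A rough Python analogue of the point above, that the quoted form shown by curl is only one representation of the setting. This is a sketch for illustration, not CouchDB's code, and it does not claim to show which conversion CouchDB applies internally; the variable names are assumptions.

```python
import json

# Body returned by the config request quoted in this thread: a JSON string.
raw_body = '"50000000000000"'

as_reported = json.loads(raw_body)      # str, the form curl and Futon display
as_integer = int(as_reported)           # numeric form, as a timeout would need
as_bytes = as_reported.encode("ascii")  # byte form, loosely like an Erlang binary
```

The same setting therefore reads as "50000000000000" over HTTP while a consumer that needs a number works with 50000000000000.
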
>>>
>>> Yes you are right that timeout is an integer value in gen_servers and a
>>> typical one is 5000 standing for 5 seconds.
>>>
>>> Thank you for illustrating this seeming anomaly.
>>>
>>>
>>>
>>> A
>>>
>>>  On Dec 18, 2013 2:42 PM, "Robert Newson" <rnewson@apache.org> wrote:
>>>>
>>>>> "There is something hard coded in there and I will find it eventually
>>>>> and find why it was put there and by whom."
>>>>>
>>>>> This attitude might discourage people from helping you with your
>>>>> efforts.
>>>>>
>>>>> B.
>>>>>
>>>>>
>>>>> On 18 December 2013 22:33, david martin <david.martin@lymegreen.co.uk>
>>>>> wrote:
>>>>>
>>>>>  On 18/12/13 18:05, Robert Newson wrote:
>>>>>>
>>>>>>> I've confirmed that the native view server honors that timeout, can
>>>>>>> you tell me what;
>>>>>>>
>>>>>>> curl localhost:5984/_config/couchdb/os_process_timeout
>>>>>>>
>>>>>> restart CouchDB on 1.2 (latest in Ubuntu) then
>>>>>>
>>>>>> curl david:************@localhost:5984/_config/couchdb/os_process_timeout
>>>>>> "50000000000000"
>>>>>> rerun gives
>>>>>> Error: timeout
>>>>>>
>>>>>> {gen_server,call,
>>>>>>               [<0.200.0>,
>>>>>>                {prompt,[<<"map_doc">>,
>>>>>> {[{<<"_id">>,<<"61c3f496b9e4c8dc29b95270d9000370">>},
>>>>>> {<<"_rev">>,<<"9-e48194151642345e0e3a4a5edfee56e4">>},
>>>>>>                           {<<"test">>,
>>>>>>                            {[{<<"hey">>,
>>>>>>                               {[{<<"_id">>,
>>>>>> <<"61c3f496b9e4c8dc29b95270d9000370">>},........}
>>>>>>
>>>>>> Test JSON here ~16K lines
>>>>>>
>>>>>> https://friendpaste.com/6LkCbdENAe1gOZlD9DWCod
>>>>>>
>>>>>> Code as in couchdb/erlang list in "Using the Erlang view server to
>>>>>> Educate in CouchDB"
>>>>>>
>>>>>> I have looked for this for some time hoping next release would fix it.
>>>>>> There is something hard coded in there and I will find it eventually and
>>>>>> find why it was put there and by whom.
>>>>>>
>>>>>>
>>>>>>
>>>>>>> returns? You might need to bounce couchdb in any case, as it applies
>>>>>>> this timeout setting when it creates the process, and we keep a pool
>>>>>>> of them around, so changes to timeout after that won't be picked up
>>>>>>> until they're rebuilt. Restarting couchdb is the quickest way to
>>>>>>> ensure that.
>>>>>>>
>>>>>>> B.
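
A toy Python model of the behaviour described above: the timeout is read when a process is created, processes are pooled, and a later config change is only seen once the pooled processes are rebuilt. The classes ViewProcess and ProcessPool are hypothetical stand-ins and do not correspond to CouchDB's actual modules.

```python
from typing import Dict, List


class ViewProcess:
    """Toy stand-in for a pooled view server process (hypothetical)."""

    def __init__(self, timeout_ms: int) -> None:
        # The timeout is captured once, at creation time.
        self.timeout_ms = timeout_ms


class ProcessPool:
    """Toy pool that reuses idle processes instead of creating new ones."""

    def __init__(self, config: Dict[str, str]) -> None:
        self.config = config
        self.idle: List[ViewProcess] = []

    def checkout(self) -> ViewProcess:
        if self.idle:
            return self.idle.pop()
        return ViewProcess(int(self.config["os_process_timeout"]))

    def checkin(self, proc: ViewProcess) -> None:
        self.idle.append(proc)

    def restart(self) -> None:
        # Dropping pooled processes forces new ones to read the current config.
        self.idle.clear()


config = {"os_process_timeout": "5000"}
pool = ProcessPool(config)
pool.checkin(pool.checkout())            # create one process, return it to the pool

config["os_process_timeout"] = "60000"   # change the setting afterwards
stale = pool.checkout()                  # reused process still has the old timeout
pool.checkin(stale)

pool.restart()
fresh = pool.checkout()                  # new process picks up the new timeout
```

In this model the reused process keeps 5000 until the pool is cleared, which mirrors why a restart is suggested after changing the setting.
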
>>>>>>>
>>>>>>>
>>>>>>> On 18 December 2013 16:20, david martin <
>>>>>>> david.martin@lymegreen.co.uk>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Futon on Apache CouchDB 1.2 (according to Futon)
>>>>>>>> {"couchdb":"Welcome","version":"1.2.0"} according to ?
>>>>>>>> CouchDB 1.4.0 Ubuntu according to Package name
>>>>>>>>
>>>>>>>> I set os_process_timeout 50000000000000 (effective infinity).
>>>>>>>>
>>>>>>>> I ALWAYS get the VERY unhelpful message which merely prints the
>>>>>>>> document contents.
>>>>>>>>
>>>>>>>> Error: timeout       % yes I know this but cannot do anything about it
>>>>>>>>
>>>>>>>> {gen_server,call,     % it's in a gen_server yes I know this!
>>>>>>>>                [<0.14190.8>,   % this is its PID yes I know this!
>>>>>>>>                 {prompt,[<<"map_doc">>,   % it is a MAP function yes I know this!
>>>>>>>> {[{<<"_id">>,<<"61c3f496b9e4c8dc29b95270d9000370">>},   % it is the document I am processing, Yes I know this!
>>>>>>>> {<<"_rev">>,<<"9-e48194151642345e0e3a4a5edfee56e4">>},
>>>>>>>>                            .....
>>>>>>>>
>>>>>>>> Yes it is a large and complex document (16K lines to make this happen
>>>>>>>> on a fast machine, much less on a Raspberry Pi).
>>>>>>>> Yes it uses an Erlang view function.
>>>>>>>> Yes I DO want it to hog resources until it is finished.
>>>>>>>> Yes I am the administrator.
>>>>>>>> No  I AM NOT INTERFERING WITH ANYTHING ELSE.
>>>>>>>> No I cannot dictate how big or small the document is.
>>>>>>>> Yes this is important to me.
>>>>>>>> I have not pursued this as I was using rcouch, I could not find the
>>>>>>>> source of the timeout message.
>>>>>>>> I did not want to have to rebuild to fix this.
>>>>>>>> I did not want to bother the CouchDB team as I was using a fork of
>>>>>>>> CouchDB.
>>>>>>>> Similar issues have been raised and no answers forthcoming.
>>>>>>>> Mentions of "hidden tweaks", "this is not good for you", "have you got
>>>>>>>> big documents" etc.
>>>>>>>>
>>>>>>>> How do I get this NOT to timeout?
>>>>>>>>
>>>>>>>> On rcouch I would change a value and rebuild a release to fix this (if
>>>>>>>> I could identify the source).
>>>>>>>>
>>>>>>>> If anybody can give a clue I will test their hypothesis and report back
>>>>>>>> to the list.
>>>>>>>>
>>>>>>>> --
>>>>>>>> David Martin
>>>>>>>>
>>>>>>>>
>>>>>> --
>>>>>> David Martin
>>>>>>
>>>>>>
>>> --
>>> David Martin
>>>
>>>
>>>
>
> --
> David Martin
>
>
