couchdb-user mailing list archives

From david martin <david.mar...@lymegreen.co.uk>
Subject Re: Timeout using Erlang views with large documents
Date Thu, 19 Dec 2013 10:32:38 GMT
On 19/12/13 10:08, Stanley Iriele wrote:
> Ahh yes yes you're right...hmm...it is possible that they place a guard on
> the function and call it using the default value or something like
> that...also...why is your doc so huge?... At that point it's easier to
> manage it in small docs and join them all in some view/list function

In the scenario I am hypothetically looking at, you (the database creator and administrator)
are not in control of the structure of the document; it may be of any form as long as it is
JSON.

I first came upon this when analysing the output of Collusion (an analytic program that tracks
all browsing behaviour /in the browser/).
This JSON record becomes very large very quickly.

So if the doc is big, it is big. I do not wish to pre-process it. If the browser can handle
documents of this size, surely CouchDB can manipulate them in its native language without
being throttled arbitrarily.

I also feel that the word join is not really what SQL-less is about.

These are only opinions, and I am often wrong so Your Mileage May Vary (ymmv).
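
On the quoting question further down the thread, here is a minimal sketch (in Python rather
than Erlang, purely for illustration) of the point that the quoted form is only the
representation on the wire. It assumes the _config endpoint returns the setting as a JSON
string, as the curl output quoted below shows. parse_timeout_ms is a hypothetical helper
name, not anything that exists in CouchDB.

```python
import json


def parse_timeout_ms(response_body: str) -> int:
    """Turn the JSON body returned by the _config endpoint into an integer.

    Hypothetical helper for illustration only; CouchDB does its own
    conversion internally in Erlang and this does not claim to mirror it.
    """
    value = json.loads(response_body)  # '"5000"' -> '5000' (still a string)
    if not isinstance(value, str):
        raise ValueError("expected a JSON string, got %r" % (value,))
    return int(value)  # '5000' -> 5000; raises ValueError if not numeric


# The body curl printed in this thread, quotes included.
timeout_ms = parse_timeout_ms('"50000000000000"')
print(timeout_ms)           # the integer, without quotes
print(timeout_ms / 1000.0)  # the same timeout expressed in seconds
```

So the quotes belong to the JSON encoding of the reply, and whatever reads the setting has to
convert it before using it as a number of milliseconds.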

> On Dec 19, 2013 2:01 AM, "david martin" <david.martin@lymegreen.co.uk>
> wrote:
>
>> On 19/12/13 03:28, Stanley Iriele wrote:
>>
>>> Why did you place quotes around your timeout?.... It's just the value...
>>> No quotes....
>>>
>> The 'value' of the timeout "50000000000000" is merely the form or
>> representation that is reported in the browser by curl.
>>
>> In Futon configuration it is also represented in the same way.
>>
>> In the Erlang core of CouchDB this string may be used and/or stored in ETS
>> in a number of ways.
>>
>> An atom using list_to_atom/1,
>> a list, Erlang strings are lists,
>> a binary by using list_to_binary/1,
>> a 'value' using list_to_integer/1 or list_to_float/1,
>> a term using binary_to_term/1.
>>
>> So what you see is not necessarily what you get inside the core.
>>
>> Yes you are right that timeout is an integer value in gen_servers and a
>> typical one is 5000 standing for 5 seconds.
>>
>> Thank you for illustrating this seeming anomaly.
>>
>>
>>
>> A
>>
>>> On Dec 18, 2013 2:42 PM, "Robert Newson" <rnewson@apache.org> wrote:
>>>
>>>> "There is something hard coded in there and I will find it eventually
>>>> and find why it was put there and by whom."
>>>>
>>>> This attitude might discourage people from helping you with your efforts.
>>>>
>>>> B.
>>>>
>>>>
>>>> On 18 December 2013 22:33, david martin <david.martin@lymegreen.co.uk>
>>>> wrote:
>>>>
>>>>> On 18/12/13 18:05, Robert Newson wrote:
>>>>>
>>>>>> I've confirmed that the native view server honors that timeout, can
>>>>>> you tell me what;
>>>>>>
>>>>>> curl localhost:5984/_config/couchdb/os_process_timeout
>>>>>>
>>>>> restart CouchDB on 1.2 (latest in Ubuntu) then
>>>>>
>>>>> curl david:************@localhost:5984/_config/couchdb/os_process_timeout
>>>>> "50000000000000"
>>>>> rerun gives
>>>>> Error: timeout
>>>>>
>>>>> {gen_server,call,
>>>>>               [<0.200.0>,
>>>>>                {prompt,[<<"map_doc">>,
>>>>> {[{<<"_id">>,<<"61c3f496b9e4c8dc29b95270d9000370">>},
>>>>> {<<"_rev">>,<<"9-e48194151642345e0e3a4a5edfee56e4">>},
>>>>>                           {<<"test">>,
>>>>>                            {[{<<"hey">>,
>>>>>                               {[{<<"_id">>,
>>>>> <<"61c3f496b9e4c8dc29b95270d9000370">>},........}
>>>>>
>>>>> Test JSON here ~16K lines
>>>>>
>>>>> https://friendpaste.com/6LkCbdENAe1gOZlD9DWCod
>>>>>>
>>>>> Code as in couchdb/erlang list in "Using the Erlang view server to
>>>>> Educate in CouchDB"
>>>>>
>>>>> I have looked for this for some time hoping next release would fix it.
>>>>> There is something hard coded in there and I will find it eventually and
>>>>> find why it was put there and by whom.
>>>>>
>>>>>
>>>>>
>>>>>> returns? You might need to bounce couchdb in any case, as it applies
>>>>>> this timeout setting when it creates the process, and we keep a pool
>>>>>> of them around, so changes to timeout after that won't be picked up
>>>>>> until they're rebuilt. Restarting couchdb is the quickest way to
>>>>>> ensure that.
>>>>>>
>>>>>> B.
>>>>>>
>>>>>>
>>>>>> On 18 December 2013 16:20, david martin <david.martin@lymegreen.co.uk>
>>>>>> wrote:
>>>>>>
>>>>>>> Futon on Apache CouchDB 1.2 (according to Futon)
>>>>>>> {"couchdb":"Welcome","version":"1.2.0"} according to ?
>>>>>>> CouchDB 1.4.0 Ubuntu according to Package name
>>>>>>>
>>>>>>> I set os_process_timeout 50000000000000 (effective infinity).
>>>>>>>
>>>>>>> I ALWAYS get the VERY unhelpful message which merely prints the
>>>>>>> document contents.
>>>>>>>
>>>>>>> Error: timeout       % yes I know this but cannot do anything about it
>>>>>>>
>>>>>>> {gen_server,call,     % it's in a gen_server yes I know this!
>>>>>>>                [<0.14190.8>,   % this is its PID yes I know this!
>>>>>>>                 {prompt,[<<"map_doc">>,   % it is a MAP function yes I know this!
>>>>>>> {[{<<"_id">>,<<"61c3f496b9e4c8dc29b95270d9000370">>},   % it is the document I am processing, Yes I know this!
>>>>>>> {<<"_rev">>,<<"9-e48194151642345e0e3a4a5edfee56e4">>},
>>>>>>>                            .....
>>>>>>>
>>>>>>> Yes it is a large and complex document (16K lines to make this happen on
>>>>>>> a fast machine, much less on a Raspberry Pi).
>>>>>>> Yes it uses Erlang view function.
>>>>>>> Yes I DO want it to hog resources until it is finished.
>>>>>>> Yes I am the administrator.
>>>>>>> No I AM NOT INTERFERING WITH ANYTHING ELSE.
>>>>>>> No I cannot dictate how big or small the document is.
>>>>>>> Yes this is important to me.
>>>>>>> I have not pursued this as I was using rcouch, I could not find the
>>>>>>> source of the timeout message.
>>>>>>> I did not want to have to rebuild to fix this.
>>>>>>> I did not want to bother the CouchDB team as I was using a fork of
>>>>>>> CouchDB.
>>>>>>> Similar issues have been raised and no answers were forthcoming.
>>>>>>> Mentions of "hidden tweaks", "this is not good for you", "have you got
>>>>>>> big documents" etc.
>>>>>>>
>>>>>>> How do I get this NOT to timeout?
>>>>>>>
>>>>>>> On rcouch I would change a value and rebuild a release to fix this (if
>>>>>>> I could identify the source).
>>>>>>> If anybody can give a clue I will test their hypothesis and report
>>>>>>> back to the list.
>>>>>>>
>>>>>>> --
>>>>>>> David Martin
>>>>>>>
>>>>>>>
>>>>> --
>>>>> David Martin
>>>>>
>>>>>
>> --
>> David Martin
>>
>>


-- 
David Martin

