incubator-couchdb-dev mailing list archives

From Matt Goodall <matt.good...@gmail.com>
Subject Re: _changes line breaks (was Re: _changes resource)
Date Thu, 09 Jul 2009 10:03:57 GMT
2009/7/7 Matt Goodall <matt.goodall@gmail.com>:
> Splitting the discussion of line breaks in the _changes document into
> separate email thread ...
>
> 2009/7/6 Chris Anderson <jchris@apache.org>:
>> On Mon, Jul 6, 2009 at 5:50 AM, Matt Goodall<matt.goodall@gmail.com> wrote:
>
>>> == Line Breaks ==
>>>
>>> If each results item is sent with its ending newline (the "," is sent
>>> with the next item) it would make clients much easier to write
>>> correctly, i.e. buffer bytes until a newline is received, split the
>>> buffer, process the row, repeat. You've still got to remove the ","
>>> from all but the first line but it's in a predictable place. Actually,
>>> I don't believe TCP provides any guarantees that bytes sent are
>>> received in the same chunks so relying on anything other than the
>>> newline is probably flawed.
>>>
>>> It's a trivial change, patch attached.
>>
>> There's a certain elegance to the current system. So far I've been
>> testing in the browser and it works fine. If there's demonstrated
>> problems for a client then we shouldn't hesitate to change it.
>
> Agreed, a comma at the end of the line is much prettier.
>
> The 'changes' tests are very unlikely to highlight any problem because
> there's such a small amount of data being sent (well below the MTU of
> the network device) and a sleep(100) is almost certainly enough to
> allow the data to arrive in the browser. If the tests caused lots of
> data to be sent and the browser was listening for data using an
> onreadystatechange callback we may be lucky enough to hit the problem.
> However, "almost" and "may" are not good words when it comes to
> testing ;-).
>
> Anyway, from experience I believe the only way to prove this is to
> explicitly have the bytes arrive slowly so, when I get a couple of
> minutes, I'll write something simple that will hopefully demonstrate
> the value of the newline terminator.

Attached is a quick and dirty Python script with two versions of
handling a continuous _changes stream:

    * changes_comma_eol works with CouchDB trunk.
    * changes_eol_comma works with a patched CouchDB.

I haven't tested them extensively. I think both are correct, although
I wouldn't be surprised if I've missed some edge cases, especially in
the changes_comma_eol version. I think it's reasonably clear which
version is simpler, and therefore less error prone, for a client to
implement.
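
To illustrate the simpler case, here is a minimal sketch of a
newline-terminated reader. It is not the attached script. The function
name iter_rows_eol_comma and the exact wire format (a '{"results":['
header line, then one row per line, with "," prefixed to every row
after the first) are assumptions inferred from this thread.

```python
import json

# Assumed header line that opens the stream (not confirmed by the thread).
HEADER = b'{"results":['


def iter_rows_eol_comma(chunks):
    """Yield decoded rows from an iterable of byte chunks.

    Assumes every row ends with a newline and every row after the first
    starts with ",". Chunk boundaries may fall anywhere.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A row is complete as soon as its newline has arrived.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            line = line.strip()
            if line.startswith(b","):
                line = line[1:]  # the comma is always in the same place
            if not line or line == HEADER:
                continue
            yield json.loads(line.decode("utf-8"))


# Deliver the stream one byte at a time to mimic slow arrival.
stream = b'{"results":[\n{"seq":1,"id":"a"}\n,{"seq":2,"id":"b"}\n'
one_byte_chunks = [stream[i:i + 1] for i in range(len(stream))]
rows = list(iter_rows_eol_comma(one_byte_chunks))
```

Each row is parsed exactly once, and only after it is known to be
complete, however the bytes happen to be chunked.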

The comma_eol version could certainly be improved a little, but I
wanted to keep the code as simple as possible, and I don't think there's
any way to avoid some unnecessary JSON parsing completely.
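
For comparison, a rough sketch of why the comma-then-newline format
forces speculative parsing. Again this is not the attached script. The
name iter_rows_comma_eol and the wire format (a '{"results":[' header,
then rows separated by ",\n" that is only sent ahead of the next row)
are assumptions inferred from this thread. With no terminator after a
row, the only way to tell that it is complete is to try parsing the
buffer.

```python
import codecs
import json

# Assumed header that opens the stream (not confirmed by the thread).
HEADER = '{"results":['


def iter_rows_comma_eol(chunks):
    """Yield decoded rows from an iterable of byte chunks.

    Assumes rows are JSON objects separated by ",\\n", with nothing sent
    after a row until the next one is ready.
    """
    # Incremental decoding copes with a multi-byte character split
    # across two chunks.
    decoder = codecs.getincrementaldecoder("utf-8")()
    parser = json.JSONDecoder()
    buffer = ""
    for chunk in chunks:
        buffer += decoder.decode(chunk)
        while True:
            # Drop separators and whitespace left over before the next row.
            buffer = buffer.lstrip(",\r\n ")
            if buffer.startswith(HEADER):
                buffer = buffer[len(HEADER):]
                continue
            if not buffer:
                break
            try:
                row, end = parser.raw_decode(buffer)
            except ValueError:
                # Incomplete row: this parse attempt was wasted, so wait
                # for more bytes and try again.
                break
            yield row
            buffer = buffer[end:]


# Deliver the stream one byte at a time to mimic slow arrival.
stream = b'{"results":[\n{"seq":1,"id":"a"},\n{"seq":2,"id":"b"}'
one_byte_chunks = [stream[i:i + 1] for i in range(len(stream))]
rows = list(iter_rows_comma_eol(one_byte_chunks))
```

Every chunk that arrives mid-row triggers a parse that fails, which is
the unnecessary JSON parsing mentioned above.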

Hope that's useful. I'll create a ticket with the patch and the
example so it doesn't get lost.

- Matt
