couchdb-user mailing list archives

From Adam Kocoloski <kocol...@apache.org>
Subject Re: chunked encoding problem ? - error messages from curl as well as lucene
Date Tue, 30 Jun 2009 18:47:50 GMT
Hi Nitin, the specific bug I fixed only affected Unicode characters
outside the Basic Multilingual Plane.  CouchDB would happily accept
those characters in raw UTF-8 format, and would serve them back to the
user escaped as UTF-16 surrogate pairs.  However, CouchDB would not
allow users to upload documents where the characters were already
escaped.  That's been fixed in 0.9.1.
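As an aside, Python's json module exhibits the same escaping behaviour, so a short sketch (using an arbitrary non-BMP character, not data from this thread) can illustrate what the fix addresses:

```python
import json

# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual Plane.
doc = "\U0001D11E"

# Raw UTF-8 on the wire: a four-byte sequence.
raw = doc.encode("utf-8")          # b'\xf0\x9d\x84\x9e'

# JSON-escaped form: a UTF-16 surrogate pair.
escaped = json.dumps(doc)          # '"\\ud834\\udd1e"'

# Both forms decode back to the same character, so a server should
# accept either representation on upload.
assert json.loads(escaped) == doc
```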

It looks like you've got a different problem.  It might be the case  
that we are too permissive in what we accept as raw UTF-8 in the  
upload.  I don't know.  Best,

Adam

On Jun 30, 2009, at 2:18 PM, Nitin Borwankar wrote:

> Hi Damien,
>
> Thanks for that tip.
>
> Turns out I had non-UTF-8 data
>
> adolfo.steiger-gar%E7%E3o:
>
> - not sure how it managed to get into the db.
>
> This is probably confusing the chunk termination.
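A short Python sketch shows why those bytes trip a UTF-8 parser: 0xE7 and 0xE3 are complete characters in Latin-1 (ç, ã) but an invalid byte sequence in UTF-8 (that the data was originally Latin-1 is an assumption; the name above merely suggests it):

```python
from urllib.parse import unquote_to_bytes

# The percent-encoded ID from the database, decoded to raw bytes.
raw = unquote_to_bytes("adolfo.steiger-gar%E7%E3o")

# As Latin-1 the trailing bytes are ordinary accented letters...
assert raw.decode("latin-1") == "adolfo.steiger-garção"

# ...but as UTF-8 the pair 0xE7 0xE3 is malformed: 0xE7 opens a
# three-byte sequence, and 0xE3 is not a valid continuation byte.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as e:
    print("not valid UTF-8:", e.reason)
```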
>
> How did Couch let this data in?  I uploaded via Python httplib - not
> couchdb-python.  Is this a bug - the one that was fixed in 0.9.1?
>
> Nitin
>
> 37% of all statistics are made up on the spot
> -------------------------------------------------------------------------------------
> Nitin Borwankar
> nborwankar@gmail.com
>
>
> On Tue, Jun 30, 2009 at 8:58 AM, Damien Katz <damien@apache.org> wrote:
>
>> This might be the json encoding issue that Adam fixed.
>>
>> The 0.9.x branch, which is soon to be 0.9.1, fixes that issue. Try building
>> and installing from the branch and see if that fixes the problem:
>> svn co http://svn.apache.org/repos/asf/couchdb/branches/0.9.x/
>>
>> -Damien
>>
>>
>>
>> On Jun 30, 2009, at 12:15 AM, Nitin Borwankar wrote:
>>
>>> Oh and when I use Futon and try to browse the docs around where curl
>>> gives an error, when I hit the page containing the records around the
>>> error, Futon just spins and doesn't render the page.
>>>
>>> Data corruption?
>>>
>>> Nitin
>>>
>>>
>>>
>>> On Mon, Jun 29, 2009 at 9:11 PM, Nitin Borwankar <nitin@borwankar.com> wrote:
>>>
>>>
>>>> Hi,
>>>>
>>>> I uploaded about 11K+ docs, total 230MB or so of data, to a 0.9 instance
>>>> on Ubuntu.
>>>> Db name is 'plist'
>>>>
>>>> curl http://localhost:5984/plist gives
>>>>
>>>>
>>>> {"db_name":"plist","doc_count":11036,"doc_del_count":0,"update_seq":11036,"purge_seq":0,"compact_running":false,"disk_size":243325178,"instance_start_time":"1246228896723181"}
>>>>
>>>> suggesting a non-corrupt db
>>>>
>>>> curl http://localhost:5984/plist/_all_docs gives
>>>>
>>>> {"id":"adnanmoh","key":"adnanmoh","value":{"rev":"1-663736558"}},
>>>>
>>>> {"id":"adnen.chockri","key":"adnen.chockri","value":{"rev":"1-1209124545"}},
>>>> curl: (56) Received problem 2 in the chunky parser        <<--------- note curl error
>>>> {"id":"ado.adamu","key":"ado.adamu","value":{"rev":"1-4226951654"}}
>>>>
>>>> suggesting a chunked data transfer error
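To see why a mis-framed body produces that curl error, here is a minimal sketch of HTTP/1.1 chunked framing with made-up data (not the actual CouchDB response): each chunk is a hex size line, CRLF, exactly that many bytes, then CRLF; if the size field doesn't match the bytes actually sent, the parser finds data where it expects CRLF.

```python
def read_chunks(stream: bytes) -> bytes:
    """Parse an HTTP/1.1 chunked body; raise if the framing is broken."""
    pos, out = 0, b""
    while True:
        eol = stream.index(b"\r\n", pos)
        size = int(stream[pos:eol], 16)        # hex chunk-size line
        pos = eol + 2
        if size == 0:
            return out                          # terminating zero-chunk
        out += stream[pos:pos + size]
        pos += size
        if stream[pos:pos + 2] != b"\r\n":      # CRLF must follow the data
            raise IOError("CRLF expected at end of chunk")
        pos += 2

# A well-formed body:
good = b"5\r\nhello\r\n0\r\n\r\n"
assert read_chunks(good) == b"hello"

# Same payload, but the size field over-counts by one byte -- the parser
# now lands inside the data where the CRLF terminator should be:
bad = b"6\r\nhello\r\n0\r\n\r\n"
try:
    read_chunks(bad)
except IOError as e:
    print(e)  # CRLF expected at end of chunk
```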
>>>>
>>>>
>>>> couchdb-lucene error message in couchdb.stderr reads
>>>>
>>>> [...]
>>>>
>>>> [couchdb-lucene] INFO Indexing plist from scratch.
>>>> [couchdb-lucene] ERROR Error updating index.
>>>> java.io.IOException: CRLF expected at end of chunk: 83/101
>>>>   at org.apache.commons.httpclient.ChunkedInputStream.readCRLF(ChunkedInputStream.java:207)
>>>>   at org.apache.commons.httpclient.ChunkedInputStream.nextChunk(ChunkedInputStream.java:219)
>>>>   at org.apache.commons.httpclient.ChunkedInputStream.read(ChunkedInputStream.java:176)
>>>>   at org.apache.commons.httpclient.ChunkedInputStream.read(ChunkedInputStream.java:196)
>>>>   at org.apache.commons.httpclient.ChunkedInputStream.exhaustInputStream(ChunkedInputStream.java:369)
>>>>   at org.apache.commons.httpclient.ChunkedInputStream.close(ChunkedInputStream.java:346)
>>>>   at java.io.FilterInputStream.close(FilterInputStream.java:159)
>>>>   at org.apache.commons.httpclient.AutoCloseInputStream.notifyWatcher(AutoCloseInputStream.java:194)
>>>>   at org.apache.commons.httpclient.AutoCloseInputStream.close(AutoCloseInputStream.java:158)
>>>>   at com.github.rnewson.couchdb.lucene.Database.execute(Database.java:141)
>>>>   at com.github.rnewson.couchdb.lucene.Database.get(Database.java:107)
>>>>   at com.github.rnewson.couchdb.lucene.Database.getAllDocsBySeq(Database.java:82)
>>>>   at com.github.rnewson.couchdb.lucene.Index$Indexer.updateDatabase(Index.java:229)
>>>>   at com.github.rnewson.couchdb.lucene.Index$Indexer.updateIndex(Index.java:178)
>>>>   at com.github.rnewson.couchdb.lucene.Index$Indexer.run(Index.java:90)
>>>>   at java.lang.Thread.run(Thread.java:595)
>>>>
>>>>
>>>> suggesting a chunking problem again.
>>>>
>>>> Who is creating this problem - my data?  CouchDB chunking?
>>>>
>>>> Help?
>>>>
>>>>
>>>>
>>>>
>>>>
>>

