incubator-couchdb-dev mailing list archives

From Damien Katz <dam...@apache.org>
Subject Re: chunked encoding problem ? - error messages from curl as well as lucene
Date Wed, 01 Jul 2009 16:41:37 GMT
There is the purge test case:
https://svn.apache.org/repos/asf/couchdb/trunk/share/www/script/test/purge.js

Purge removes a document completely from the database, whereas delete
puts the document into a "deleted" state. The reason is replication:
to replicate a document deletion you need a record of it. But when you
purge a document, the document metadata is removed too, so it's not
possible to replicate a document purge.
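
The delete-vs-purge distinction can be sketched in a few lines (an
in-memory illustration only; the dict, rev strings, and function names
below are invented for the example, not CouchDB's internals):

```python
# Toy sketch of delete vs. purge semantics. Not CouchDB code.
db = {"doc1": {"_id": "doc1", "_rev": "1-abc", "name": "example"}}

def delete(docid):
    # DELETE keeps a "tombstone" stub so the deletion can replicate.
    db[docid] = {"_id": docid, "_rev": "2-def", "_deleted": True}

def purge(docid):
    # _purge erases every trace, so there is nothing left to replicate.
    del db[docid]

delete("doc1")
assert db["doc1"]["_deleted"] is True   # tombstone remains after DELETE

purge("doc1")
assert "doc1" not in db                 # no record at all after purge
```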

-Damien

On Jul 1, 2009, at 12:25 PM, Zachary Zolton wrote:

> LOL! Yet another URL handler I have never heard of!?
>
> (Not listed in the httpd_global_handlers section of the config,  
> either...)
>
> So, what's the semantic difference between _purge and DELETE of a  
> document?
>
> On Wed, Jul 1, 2009 at 11:21 AM, Damien Katz<damien@apache.org> wrote:
>> Nitin, I would try to purge the bad document, using the _purge API
>> (deleting the document can still cause problems, as we'll keep around
>> a deletion stub with the bad id), then things should be fixed. But
>> you'll have to know the rev id of the document to use it, which might
>> be hard to get via http.
>>
>> Purge:
>>
>> POST /db/_purge
>> {"thedocid": "therevid"}
>>
>> Unless somehow the file got corrupted, this is definitely a CouchDB
>> bug; we shouldn't accept a string we can't later return to the
>> caller. Can you create a bug report? Adding a failing test case would
>> be best, but attaching the bad string will also do.
>>
>> -Damien
>>
>>
>> On Jun 30, 2009, at 2:47 PM, Adam Kocoloski wrote:
>>
>>> Hi Nitin, the specific bug I fixed only affected Unicode characters
>>> outside the Basic Multilingual Plane. CouchDB would happily accept
>>> those characters in raw UTF-8 format, and would serve them back to
>>> the user escaped as UTF-16 surrogate pairs. However, CouchDB would
>>> not allow users to upload documents where the characters were
>>> already escaped. That's been fixed in 0.9.1.
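
For anyone unsure what "escaped as UTF-16 surrogate pairs" looks like in
practice, Python's stdlib json module shows the same behavior (the
particular character is just an arbitrary example outside the BMP):

```python
import json

# U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the Basic Multilingual Plane.
clef = "\U0001D11E"

# Its raw UTF-8 form is four bytes.
assert clef.encode("utf-8") == b"\xf0\x9d\x84\x9e"

# json.dumps escapes it as a UTF-16 surrogate pair, which is the form
# CouchDB served back to clients.
assert json.dumps(clef) == '"\\ud834\\udd1e"'

# The already-escaped form round-trips back to the same character; the
# pre-0.9.1 bug was rejecting documents uploaded in this escaped form.
assert json.loads('"\\ud834\\udd1e"') == clef
```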
>>>
>>> It looks like you've got a different problem.  It might be the  
>>> case that
>>> we are too permissive in what we accept as raw UTF-8 in the  
>>> upload.  I don't
>>> know.  Best,
>>>
>>> Adam
>>>
>>> On Jun 30, 2009, at 2:18 PM, Nitin Borwankar wrote:
>>>
>>>> Hi Damien,
>>>>
>>>> Thanks for that tip.
>>>>
>>>> Turns out I had non-UTF-8 data
>>>>
>>>> adolfo.steiger-gar%E7%E3o:
>>>>
>>>> - not sure how it managed to get into the db.
>>>>
>>>> This is probably confusing the chunk termination.
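>>>>
A quick check with Python's stdlib (assuming the %-escapes in
`adolfo.steiger-gar%E7%E3o` stand for raw bytes) shows why that id is
not valid UTF-8: 0xE7 and 0xE3 are Latin-1 'ç' and 'ã', but on their
own they are malformed UTF-8 sequences:

```python
from urllib.parse import unquote_to_bytes

raw = unquote_to_bytes("adolfo.steiger-gar%E7%E3o")
assert raw == b"adolfo.steiger-gar\xe7\xe3o"

# Not valid UTF-8: 0xE7 opens a 3-byte sequence but is not followed
# by the required continuation bytes.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False
assert not valid_utf8

# It decodes cleanly as Latin-1, which is plausibly where it came from.
assert raw.decode("latin-1") == "adolfo.steiger-gar\u00e7\u00e3o"
```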
>>>>
>>>> How did Couch let this data in? I uploaded via Python httplib - not
>>>> couchdb-python. Is this a bug - the one that is fixed in 0.9.1?
>>>>
>>>> Nitin
>>>>
>>>> 37% of all statistics are made up on the spot
>>>>
>>>> -------------------------------------------------------------------------------------
>>>> Nitin Borwankar
>>>> nborwankar@gmail.com
>>>>
>>>>
>>>> On Tue, Jun 30, 2009 at 8:58 AM, Damien Katz <damien@apache.org>  
>>>> wrote:
>>>>
>>>>> This might be the json encoding issue that Adam fixed.
>>>>>
>>>>> The 0.9.x branch, which is soon to be 0.9.1, fixes that issue. Try
>>>>> building
>>>>> and installing from the branch and see if that fixes the problem:
>>>>> svn co http://svn.apache.org/repos/asf/couchdb/branches/0.9.x/
>>>>>
>>>>> -Damien
>>>>>
>>>>>
>>>>>
>>>>> On Jun 30, 2009, at 12:15 AM, Nitin Borwankar wrote:
>>>>>
>>>>>> Oh and when I use Futon and try to browse the docs around where
>>>>>> curl gives an error, when I hit the page containing the records
>>>>>> around the error, Futon just spins and doesn't render the page.
>>>>>>
>>>>>> Data corruption?
>>>>>>
>>>>>> Nitin
>>>>>>
>>>>>> 37% of all statistics are made up on the spot
>>>>>>
>>>>>>
>>>>>> -------------------------------------------------------------------------------------
>>>>>> Nitin Borwankar
>>>>>> nborwankar@gmail.com
>>>>>>
>>>>>>
>>>>>> On Mon, Jun 29, 2009 at 9:11 PM, Nitin Borwankar
>>>>>> <nitin@borwankar.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I uploaded about 11K+ docs, totalling 230MB or so of data, to a
>>>>>>> 0.9 instance on Ubuntu. Db name is 'plist'.
>>>>>>>
>>>>>>> curl http://localhost:5984/plist gives
>>>>>>>
>>>>>>> {"db_name":"plist","doc_count":11036,"doc_del_count":0,"update_seq":11036,"purge_seq":0,"compact_running":false,"disk_size":243325178,"instance_start_time":"1246228896723181"}
>>>>>>>
>>>>>>> suggesting a non-corrupt db
>>>>>>>
>>>>>>> curl http://localhost:5984/plist/_all_docs gives
>>>>>>>
>>>>>>> {"id":"adnanmoh","key":"adnanmoh","value":{"rev":"1-663736558"}},
>>>>>>> {"id":"adnen.chockri","key":"adnen.chockri","value":{"rev":"1-1209124545"}},
>>>>>>> curl: (56) Received problem 2 in the chunky parser    <<--------- note curl error
>>>>>>> {"id":"ado.adamu","key":"ado.adamu","value":{"rev":"1-4226951654"}}
>>>>>>>
>>>>>>> suggesting a chunked data transfer error
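>>>>>>>
For context on what curl is complaining about: HTTP/1.1 chunked
encoding requires each chunk's data to be followed by CRLF. A toy
decoder (a sketch, not curl's or commons-httpclient's actual parser)
shows how a chunk whose advertised size disagrees with the bytes
actually sent trips exactly this check:

```python
def decode_chunked(data: bytes) -> bytes:
    """Toy decoder for HTTP/1.1 chunked transfer encoding."""
    body, pos = b"", 0
    while True:
        eol = data.index(b"\r\n", pos)
        size = int(data[pos:eol], 16)           # chunk-size line, in hex
        pos = eol + 2
        if size == 0:
            return body                         # terminating zero-chunk
        chunk, pos = data[pos:pos + size], pos + size
        if data[pos:pos + 2] != b"\r\n":        # the check that failed here
            raise ValueError("CRLF expected at end of chunk")
        body += chunk
        pos += 2

# A well-formed stream decodes fine...
assert decode_chunked(b"5\r\nhello\r\n0\r\n\r\n") == b"hello"

# ...but if the advertised size is wrong (e.g. a length computed over
# the wrong byte representation of the data), the bytes after the chunk
# are not CRLF and the parser bails out, as curl and lucene did.
try:
    decode_chunked(b"6\r\nhello\r\n0\r\n\r\n")
    failed = False
except ValueError:
    failed = True
assert failed
```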
>>>>>>>
>>>>>>>
>>>>>>> couchdb-lucene error message in couchdb.stderr reads
>>>>>>>
>>>>>>> [...]
>>>>>>>
>>>>>>> [couchdb-lucene] INFO Indexing plist from scratch.
>>>>>>> [couchdb-lucene] ERROR Error updating index.
>>>>>>> java.io.IOException: CRLF expected at end of chunk: 83/101
>>>>>>>   at org.apache.commons.httpclient.ChunkedInputStream.readCRLF(ChunkedInputStream.java:207)
>>>>>>>   at org.apache.commons.httpclient.ChunkedInputStream.nextChunk(ChunkedInputStream.java:219)
>>>>>>>   at org.apache.commons.httpclient.ChunkedInputStream.read(ChunkedInputStream.java:176)
>>>>>>>   at org.apache.commons.httpclient.ChunkedInputStream.read(ChunkedInputStream.java:196)
>>>>>>>   at org.apache.commons.httpclient.ChunkedInputStream.exhaustInputStream(ChunkedInputStream.java:369)
>>>>>>>   at org.apache.commons.httpclient.ChunkedInputStream.close(ChunkedInputStream.java:346)
>>>>>>>   at java.io.FilterInputStream.close(FilterInputStream.java:159)
>>>>>>>   at org.apache.commons.httpclient.AutoCloseInputStream.notifyWatcher(AutoCloseInputStream.java:194)
>>>>>>>   at org.apache.commons.httpclient.AutoCloseInputStream.close(AutoCloseInputStream.java:158)
>>>>>>>   at com.github.rnewson.couchdb.lucene.Database.execute(Database.java:141)
>>>>>>>   at com.github.rnewson.couchdb.lucene.Database.get(Database.java:107)
>>>>>>>   at com.github.rnewson.couchdb.lucene.Database.getAllDocsBySeq(Database.java:82)
>>>>>>>   at com.github.rnewson.couchdb.lucene.Index$Indexer.updateDatabase(Index.java:229)
>>>>>>>   at com.github.rnewson.couchdb.lucene.Index$Indexer.updateIndex(Index.java:178)
>>>>>>>   at com.github.rnewson.couchdb.lucene.Index$Indexer.run(Index.java:90)
>>>>>>>   at java.lang.Thread.run(Thread.java:595)
>>>>>>>
>>>>>>>
>>>>>>> suggesting a chunking problem again.
>>>>>>>
>>>>>>> Who is creating this problem - my data? CouchDB chunking?
>>>>>>>
>>>>>>> Help?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 37% of all statistics are made up on the spot
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -------------------------------------------------------------------------------------
>>>>>>> Nitin Borwankar
>>>>>>> nborwankar@gmail.com
>>>>>>>
>>>>>>>
>>>>>
>>>
>>
>>

