incubator-couchdb-dev mailing list archives

From: Zachary Zolton <zachary.zol...@gmail.com>
Subject: Re: chunked encoding problem? - error messages from curl as well as lucene
Date: Wed, 01 Jul 2009 16:25:43 GMT
LOL! Yet another URL handler I have never heard of!?

(Not listed in the httpd_global_handlers section of the config, either...)

So, what's the semantic difference between _purge and DELETE of a document?

On Wed, Jul 1, 2009 at 11:21 AM, Damien Katz <damien@apache.org> wrote:
> Nitin, I would try to purge the bad document, using the _purge api (deleting
> the document can still cause problems as we'll keep around a deletion stub
> with the bad id), then things should be fixed. But you'll have to know the
> rev id of the document to use it, which might be hard to get via http.
>
> Purge:
>
> POST /db/_purge
> {"thedocid": ["therevid"]}
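>
> For example (a sketch; the rev is a placeholder you'd first read out of
> _all_docs, and the doc id may need URL-escaping):
>
> curl http://localhost:5984/plist/_all_docs
> curl -X POST http://localhost:5984/plist/_purge \
>      -H 'Content-Type: application/json' \
>      -d '{"adolfo.steiger-gar%E7%E3o": ["<rev>"]}'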
>
> Unless somehow the file got corrupted, this is definitely a CouchDB bug; we
> shouldn't accept a string we can't later return to the caller. Can you
> create a bug report? Adding a failing test case would be best, but
> attaching the bad string will also do.
>
> -Damien
>
>
> On Jun 30, 2009, at 2:47 PM, Adam Kocoloski wrote:
>
>> Hi Nitin, the specific bug I fixed only affected Unicode characters
>> outside the Basic Multilingual Plane. CouchDB would happily accept those
>> characters in raw UTF-8 format, and would serve them back to the user
>> escaped as UTF-16 surrogate pairs. However, CouchDB would not allow users
>> to upload documents where the characters were already escaped. That's been
>> fixed in 0.9.1.
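>>
>> For illustration (an assumed example; U+1D11E lies outside the BMP):
>>
>>   raw UTF-8 bytes in the upload:   f0 9d 84 9e
>>   escaped form served back:        "\ud834\udd1e"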
>>
>> It looks like you've got a different problem.  It might be the case that
>> we are too permissive in what we accept as raw UTF-8 in the upload.  I don't
>> know.  Best,
>>
>> Adam
>>
>> On Jun 30, 2009, at 2:18 PM, Nitin Borwankar wrote:
>>
>>> Hi Damien,
>>>
>>> Thanks for that tip.
>>>
>>> Turns out I had non-UTF-8 data in a doc id:
>>>
>>> adolfo.steiger-gar%E7%E3o
>>>
>>> - not sure how it managed to get into the db.
>>>
>>> This is probably confusing the chunk termination.
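>>>
>>> A quick check (sketch; %E7 and %E3 decode to the Latin-1 bytes for "ç"
>>> and "ã", which are not valid UTF-8 on their own):
>>>
>>> printf 'adolfo.steiger-gar\xe7\xe3o' | iconv -f UTF-8 -t UTF-8
>>> # fails with an "illegal input sequence" error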
>>>
>>> How did Couch let this data in? I uploaded via Python httplib, not
>>> couchdb-python. Is this the bug that was fixed in 0.9.1?
>>>
>>> Nitin
>>>
>>> 37% of all statistics are made up on the spot
>>>
>>> -------------------------------------------------------------------------------------
>>> Nitin Borwankar
>>> nborwankar@gmail.com
>>>
>>>
>>> On Tue, Jun 30, 2009 at 8:58 AM, Damien Katz <damien@apache.org> wrote:
>>>
>>>> This might be the JSON encoding issue that Adam fixed.
>>>>
>>>> The 0.9.x branch, which is soon to be 0.9.1, fixes that issue. Try
>>>> building and installing from the branch and see if that fixes the problem:
>>>>
>>>> svn co http://svn.apache.org/repos/asf/couchdb/branches/0.9.x/
>>>>
>>>> -Damien
>>>>
>>>>
>>>>
>>>> On Jun 30, 2009, at 12:15 AM, Nitin Borwankar wrote:
>>>>
>>>>> Oh, and when I use Futon and try to browse the docs around where curl
>>>>> gives an error, Futon just spins and doesn't render the page containing
>>>>> those records.
>>>>>
>>>>> Data corruption?
>>>>>
>>>>> Nitin
>>>>>
>>>>> 37% of all statistics are made up on the spot
>>>>>
>>>>>
>>>>> -------------------------------------------------------------------------------------
>>>>> Nitin Borwankar
>>>>> nborwankar@gmail.com
>>>>>
>>>>>
>>>>> On Mon, Jun 29, 2009 at 9:11 PM, Nitin Borwankar <nitin@borwankar.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I uploaded about 11K+ docs, 230MB or so of data in total, to a 0.9
>>>>>> instance on Ubuntu.
>>>>>> Db name is 'plist'.
>>>>>>
>>>>>> curl http://localhost:5984/plist gives
>>>>>>
>>>>>>
>>>>>> {"db_name":"plist","doc_count":11036,"doc_del_count":0,"update_seq":11036,"purge_seq":0,
>>>>>> "compact_running":false,"disk_size":243325178,"instance_start_time":"1246228896723181"}
>>>>>>
>>>>>> suggesting a non-corrupt db
>>>>>>
>>>>>> curl http://localhost:5984/plist/_all_docs gives
>>>>>>
>>>>>> {"id":"adnanmoh","key":"adnanmoh","value":{"rev":"1-663736558"}},
>>>>>> {"id":"adnen.chockri","key":"adnen.chockri","value":{"rev":"1-1209124545"}},
>>>>>> curl: (56) Received problem 2 in the chunky parser   <<--------- note curl error
>>>>>> {"id":"ado.adamu","key":"ado.adamu","value":{"rev":"1-4226951654"}}
>>>>>>
>>>>>> suggesting a chunked data transfer error
>>>>>>
>>>>>>
>>>>>> couchdb-lucene error message in couchdb.stderr reads
>>>>>>
>>>>>> [...]
>>>>>>
>>>>>> [couchdb-lucene] INFO Indexing plist from scratch.
>>>>>> [couchdb-lucene] ERROR Error updating index.
>>>>>> java.io.IOException: CRLF expected at end of chunk: 83/101
>>>>>>   at org.apache.commons.httpclient.ChunkedInputStream.readCRLF(ChunkedInputStream.java:207)
>>>>>>   at org.apache.commons.httpclient.ChunkedInputStream.nextChunk(ChunkedInputStream.java:219)
>>>>>>   at org.apache.commons.httpclient.ChunkedInputStream.read(ChunkedInputStream.java:176)
>>>>>>   at org.apache.commons.httpclient.ChunkedInputStream.read(ChunkedInputStream.java:196)
>>>>>>   at org.apache.commons.httpclient.ChunkedInputStream.exhaustInputStream(ChunkedInputStream.java:369)
>>>>>>   at org.apache.commons.httpclient.ChunkedInputStream.close(ChunkedInputStream.java:346)
>>>>>>   at java.io.FilterInputStream.close(FilterInputStream.java:159)
>>>>>>   at org.apache.commons.httpclient.AutoCloseInputStream.notifyWatcher(AutoCloseInputStream.java:194)
>>>>>>   at org.apache.commons.httpclient.AutoCloseInputStream.close(AutoCloseInputStream.java:158)
>>>>>>   at com.github.rnewson.couchdb.lucene.Database.execute(Database.java:141)
>>>>>>   at com.github.rnewson.couchdb.lucene.Database.get(Database.java:107)
>>>>>>   at com.github.rnewson.couchdb.lucene.Database.getAllDocsBySeq(Database.java:82)
>>>>>>   at com.github.rnewson.couchdb.lucene.Index$Indexer.updateDatabase(Index.java:229)
>>>>>>   at com.github.rnewson.couchdb.lucene.Index$Indexer.updateIndex(Index.java:178)
>>>>>>   at com.github.rnewson.couchdb.lucene.Index$Indexer.run(Index.java:90)
>>>>>>   at java.lang.Thread.run(Thread.java:595)
>>>>>>
>>>>>>
>>>>>> suggesting a chunking problem again.
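>>>>>>
>>>>>> (For reference, each chunk in a chunked HTTP body is framed as
>>>>>>
>>>>>>   <size-in-hex>CRLF
>>>>>>   <that many bytes of data>CRLF
>>>>>>
>>>>>> ending with a zero-size chunk. If the advertised size doesn't match the
>>>>>> bytes actually sent, the parser hits something other than CRLF where a
>>>>>> chunk should end - which is what both curl and httpclient are reporting.)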
>>>>>>
>>>>>> Who is creating this problem - my data? CouchDB chunking?
>>>>>>
>>>>>> Help?
>>>>>>
>>>>>>
>>>>>>
>>>>>> 37% of all statistics are made up on the spot
>>>>>>
>>>>>>
>>>>>>
>>>>>> -------------------------------------------------------------------------------------
>>>>>> Nitin Borwankar
>>>>>> nborwankar@gmail.com
>>>>>>
>>>>>>
>>>>
>>
>
>
