incubator-couchdb-user mailing list archives

From MK
Subject when will utf8 handling be fixed?
Date Wed, 08 Jun 2011 16:32:00 GMT
Is there any intention to fix couch's handling of "unusual" unicode
characters? One of the "unusual" characters is the right single quote
(U+2019, bytes 226,128,153 in UTF-8), which is perfectly valid UTF-8
and also not very "unusual" IMO.
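
For reference, a minimal Python sketch that checks the byte values quoted above. It only confirms that U+2019 encodes to those three bytes and decodes back cleanly; it says nothing about what couch does with them, and the variable names are illustrative.

```python
# U+2019 RIGHT SINGLE QUOTATION MARK, often pasted in as an apostrophe.
ch = "\u2019"

# Encode to UTF-8 and expose the individual byte values.
encoded = ch.encode("utf-8")
byte_values = tuple(encoded)

# A strict decode raises UnicodeDecodeError on invalid input,
# so a clean round trip shows the sequence is well-formed UTF-8.
round_trip = encoded.decode("utf-8", errors="strict")

print(byte_values)       # (226, 128, 153)
print(round_trip == ch)  # True
```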

I have an interface which allows users to add and edit text in a db
document (again, not very unusual) and this one came up because of
someone cutting and pasting some text from a source which used the
right single quote as an apostrophe (which is just plain common -- in
fact they are used in the online "Definitive Guide").

So I am having to maintain a switch statement which filters out these
characters and replaces them with html entities before they get sent
to couch, which is okay in my case since the documents are just being
used as html pages anyway.
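
A rough sketch of that kind of filter, in Python rather than whatever the actual switch statement is written in. The mapping table and the name `escape_for_couch` are hypothetical, and the numeric-entity fallback for other non-ASCII characters is an assumption added for illustration, not something the filter described above necessarily does.

```python
# Hypothetical table of typographic characters and their named HTML entities.
ENTITY_MAP = {
    "\u2018": "&lsquo;",  # left single quote
    "\u2019": "&rsquo;",  # right single quote (the one discussed here)
    "\u201c": "&ldquo;",  # left double quote
    "\u201d": "&rdquo;",  # right double quote
}


def escape_for_couch(text: str) -> str:
    """Replace non-ASCII characters with HTML entities before storing."""
    out = []
    for ch in text:
        if ch in ENTITY_MAP:
            # Known typographic character: use its named entity.
            out.append(ENTITY_MAP[ch])
        elif ord(ch) > 127:
            # Assumed fallback: any other non-ASCII character becomes
            # a decimal numeric character reference.
            out.append(f"&#{ord(ch)};")
        else:
            # Plain ASCII passes through unchanged.
            out.append(ch)
    return "".join(out)


print(escape_for_couch("it\u2019s"))  # it&rsquo;s
```

This only makes sense where the stored text is rendered as HTML, as it is here; for any other use the entities would have to be decoded again on the way out.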

But it's an awkward and unnecessary solution. Individual
developers should not have to deal with this; proper UTF-8
handling should be built into couch. For one thing, it means that
anyone worried about such "unusual" possibilities cannot use
couchapp or couch directly: data has to be filtered server side first.
Although spidermonkey handles UTF-8 fine, depending on client-side
filtering is not always an option.

Sincerely, MK

"Enthusiasm is not the enemy of the intellect." (said of Irving Howe)
"The angel of history[...]is turned toward the past." (Walter Benjamin)
