From: Chris Fraschetti
To: Lucene Users List
Date: Mon, 11 Oct 2004 17:50:15 -0700
Subject: single quote unicode character

The dataset that I index is pretty dynamic and flexible, and I started to notice an incorrectly displayed character in some of my results...
Some debugging showed that it was the Unicode character for a single quote (U+2019, the right single quotation mark, 8217 decimal). As far as I know, everything is fine before I index, but when I retrieve the content I get a character that cannot be displayed by the Java servlet I use to show the results. How can I make Lucene very general, so that it accepts and returns all characters, encoded or not, exactly as they were in their original state?

-- 
Chris Fraschetti
e fraschetti@gmail.com
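
One way to narrow down a symptom like the one described above is to check whether character 8217 survives a round trip through a given charset. The sketch below is only a diagnostic aid and makes two assumptions that the message does not confirm: that the character is lost at an encoding boundary (reading the source data or writing the servlet response) rather than inside the index, and that a Java 8 or later JDK is available. The class and method names (QuoteRoundTrip, roundTrip, escapeNonAscii) are hypothetical and are not part of Lucene or the servlet API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper; not part of Lucene or the servlet API.
class QuoteRoundTrip {

    /** Encodes text to bytes in the given charset, then decodes those bytes back. */
    static String roundTrip(String text, Charset charset) {
        // getBytes substitutes the charset's replacement byte for unmappable characters.
        return new String(text.getBytes(charset), charset);
    }

    /** Rewrites each non-ASCII code point as a numeric HTML entity such as &#8217;. */
    static String escapeNonAscii(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp < 128) {
                out.append((char) cp);
            } else {
                out.append("&#").append(cp).append(';');
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Character 8217 (U+2019) between two ASCII letters.
        String original = "it" + (char) 8217 + "s";

        // UTF-8 can represent U+2019, so the round trip is lossless.
        System.out.println(roundTrip(original, StandardCharsets.UTF_8).equals(original)); // true

        // ISO-8859-1 has no mapping for U+2019, so it degrades to '?'.
        System.out.println(roundTrip(original, StandardCharsets.ISO_8859_1)); // it?s

        // An ASCII-only fallback that a browser still renders as the quote.
        System.out.println(escapeNonAscii(original)); // it&#8217;s
    }
}
```

If the ISO-8859-1 result matches what the servlet shows, the response encoding is a likely suspect. In that case, calling response.setContentType("text/html; charset=UTF-8") before obtaining the writer, or escaping the output as in escapeNonAscii, may be enough, though neither has been verified against the setup described here.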