Date: Tue, 1 Nov 2011 16:47:44 +0000 (GMT)
From: Nick Burch
To: dev@tika.apache.org
Subject: Re: A problem in the right-to-left languages

On Tue, 1 Nov 2011, Robert Muir wrote:
> it would be nice to look at trying to remove the forked charsetdetection
> code too (whatever changes tika has, get them into ICU, etc)

I've not had any luck with this - I tried submitting some of our changes back (e.g. the EBCDIC detector), but they didn't seem to want them.

Nick