From: Otis Gospodnetic
To: java-user@lucene.apache.org, solr-user@lucene.apache.org
Date: Thu, 6 Aug 2009 20:51:48 -0700 (PDT)
Subject: Re: Language Detection for Analysis?

Bradford,

If I may:

Have a look at http://www.sematext.com/products/language-identifier/index.html

And/or http://www.sematext.com/products/multilingual-indexer/index.html

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR

----- Original Message ----
> From: Bradford Stephens
> To: solr-user@lucene.apache.org; java-user@lucene.apache.org
> Sent: Thursday, August 6, 2009 3:46:21 PM
> Subject: Language Detection for Analysis?
>
> Hey there,
>
> We're trying to add foreign language support into our new search
> engine -- languages like Arabic, Farsi, and Urdu (that don't work with
> standard analyzers). But our data source doesn't tell us which
> languages we're actually collecting -- we just get blocks of text. Has
> anyone here worked on language detection so we can figure out what
> analyzers to use? Are there commercial solutions?
>
> Much appreciated!
>
> --
> http://www.roadtofailure.com -- The Fringes of Scalability, Social
> Media, and Computer Science
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org