Date: Sat, 08 Nov 2003 08:30:34 +0100
From: maurits van wijland
To: Lucene Developers List
Subject: Re: Java TextCat 0.1
Message-id: <285901c3a5ca$332893e0$0200a8c0@whale>
References: <20031107023139.F27389@incze.adsl.enternet.hu> <283601c3a567$44f5f490$0200a8c0@whale> <428001c3a569$76ae9d50$0200a8c0@joseph>

Pete,

It's because I think of a search engine as a guided search engine. It
should offer the end user help when trying to find information. So a
drop-down should not be included in the search interface. Of course, a
drop-down is a good method to choose a query language.

Are the different languages in different indexes, or are they all
combined into one?

chrs,

Maurits

----- Original Message -----
From: "Pete Lewis"
To: "Lucene Developers List"
Sent: Friday, November 07, 2003 8:58 PM
Subject: Re: Java TextCat 0.1

> Hi Maurits
>
> Language guessing is OK for documents where you have a fair amount of
> text to play with; search clues, however, are much shorter - often just
> a word or two. Therefore, why don't you have a default query language
> and then just have a drop-down box to let the user select the query
> language if different from the default?
>
> Cheers
>
> Pete
>
> ----- Original Message -----
> From: "maurits van wijland"
> To: "Lucene Developers List"
> Sent: Friday, November 07, 2003 7:12 PM
> Subject: Re: Java TextCat 0.1
>
> > Hi all,
> >
> > Incze, do you choose the analyzer when indexing and searching? How?
> > Can you send some example code?
> >
> > I have tried this with a naive Bayes language guesser, but the problem
> > I found is that when searching, the query words are too 'small' to
> > accurately predict a language...
> >
> > So, how do you manage?
> >
> > kind regards,
> >
> > Maurits van Wijland
> >
> > ----- Original Message -----
> > From: "Incze Lajos"
> > To: "Lucene Developers List"
> > Sent: Friday, November 07, 2003 2:31 AM
> > Subject: Re: Java TextCat 0.1
> >
> > > On Thu, Nov 06, 2003 at 02:14:11PM +0100, Patrick Debois wrote:
> > > > Java interfacing with libtextcat.
> > > > Might be of interest for you (according to the mailing lists)
> > > >
> > > > I've used it for choosing the correct analyzer in Lucene Snowball
> > > >
> > > > I will provide it on my website http://www.jedi.be/JTextCat/index.html
> > > >
> > > > Hope it does not violate any copyrights.
> > >
> > > Have you seen this project?
> > >
> > > http://ngramj.sourceforge.net/
> > >
> > > (Pure Java N-Gram lib, with a sample servlet.)
> > >
> > > incze
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> > > For additional commands, e-mail: lucene-dev-help@jakarta.apache.org
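
Since the thread asks for example code, here is a minimal sketch of the fallback idea discussed above: an explicit user choice wins, a language guesser is consulted only when the query has enough text, and otherwise the default query language is used. Everything in it is an assumption made for illustration: the class name QueryLanguageChooser, the minChars threshold, and the stand-in guesser are hypothetical and are not part of Lucene, libtextcat, or ngramj. Mapping the returned language code to an analyzer is left out.

```java
import java.util.function.Function;

public class QueryLanguageChooser {

    /**
     * Picks the language code used to select an analyzer for a query.
     * All parameters are hypothetical names for this sketch:
     * userSelected is an explicit choice (may be null), defaultLanguage is
     * the fallback, minChars is the shortest query the guesser is trusted
     * with, and guesser returns a language code or null when unsure.
     */
    public static String choose(String query, String userSelected,
                                String defaultLanguage, int minChars,
                                Function<String, String> guesser) {
        // An explicit choice (for example from a drop-down) always wins.
        if (userSelected != null && !userSelected.isEmpty()) {
            return userSelected;
        }
        // Only consult the guesser when there is enough text to work with.
        String text = (query == null) ? "" : query.trim();
        if (text.length() >= minChars) {
            String guessed = guesser.apply(text);
            if (guessed != null) {
                return guessed;
            }
        }
        // Short or unrecognised queries fall back to the default language.
        return defaultLanguage;
    }

    public static void main(String[] args) {
        // Stand-in guesser for the demo only; a real one would wrap an
        // n-gram or naive Bayes language guesser.
        Function<String, String> guesser =
                text -> text.contains(" de ") ? "nl" : null;

        System.out.println(choose("fiets", null, "en", 20, guesser));
        System.out.println(choose("fiets", "nl", "en", 20, guesser));
        System.out.println(choose("de geschiedenis van de fiets", null, "en", 20, guesser));
    }
}
```

The threshold is deliberately a parameter: how many characters a guesser needs before its answer is worth trusting depends on the guesser and the languages involved, so the value 20 used in main is only a placeholder.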