From: Ahmet Arslan
Date: Mon, 15 Nov 2010 15:32:22 -0800 (PST)
Subject: Re: What is the best Analyzer and Parser for this type of question?
To: java-user@lucene.apache.org

> Example of Question:
> - What is the role of PrnP in mad cow disease?

First, do not submit the question text directly as the query. Formulate the query manually: remove 'what', 'is', 'the', 'of', '?', etc. For example, I would convert this question into:

"mad cow"^5 "cow disease"^3 "mad cow disease"^15 "role PrnP"~5^2 "role mad cow disease"~45 mad^0.1 role^0.5 cow disease PrnP^10

> I am running in 11.638 documents and the result is 10410
> docs for this question (lowwwwww precision)

Use OR as the default operator, and collect and evaluate only the top 1000 documents.

Instead of Porter, you can try KStem:
http://ciir.cs.umass.edu/cgi-bin/downloads/downloads.cgi

Also try the different length normalization described in the paper linked below. Its Lucene query example (SpanNear) may inspire you as well.
http://trec.nist.gov/pubs/trec16/papers/ibm-haifa.mq.final.pdf
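
The manual query formulation suggested above (drop the question words, then combine boosted phrases with the plain terms) could be roughly automated along these lines. This is only a sketch using plain Java strings, with no Lucene dependency. The class name, the stopword list, the boost, and the slop value are all assumptions made for illustration, not values from the message. Mechanically pairing neighbouring terms also produces weak phrases such as "PrnP mad", so hand-tuning as in the example query is still likely to give better results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: turns a natural-language question into a
// Lucene-style query string made of boosted phrases plus plain terms.
class QuestionQueryBuilder {

    // Assumed stopword list; extend it for your own question set.
    private static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "what", "is", "the", "of", "in", "a", "an"));

    // Splits on anything that is not a letter or digit (this also drops '?')
    // and removes stopwords. Original casing is kept, e.g. "PrnP".
    static List<String> contentTerms(String question) {
        List<String> terms = new ArrayList<>();
        for (String token : question.split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()
                    && !STOPWORDS.contains(token.toLowerCase(Locale.ROOT))) {
                terms.add(token);
            }
        }
        return terms;
    }

    // Builds: boosted phrases for each pair of neighbouring content terms,
    // one sloppy phrase over all content terms, then the bare terms.
    static String buildQuery(String question, int phraseBoost, int slop) {
        List<String> terms = contentTerms(question);
        StringBuilder query = new StringBuilder();
        for (int i = 0; i + 1 < terms.size(); i++) {
            query.append('"').append(terms.get(i)).append(' ')
                 .append(terms.get(i + 1)).append("\"^")
                 .append(phraseBoost).append(' ');
        }
        if (terms.size() > 1) {
            query.append('"').append(String.join(" ", terms))
                 .append("\"~").append(slop).append(' ');
        }
        query.append(String.join(" ", terms));
        return query.toString();
    }

    public static void main(String[] args) {
        // Boost 3 and slop 45 are illustrative values only.
        System.out.println(
                buildQuery("What is the role of PrnP in mad cow disease?", 3, 45));
    }
}
```

The resulting string can be passed to a query parser configured with OR as the default operator, so that the bare terms keep recall up while the phrases push the best matches to the top.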