Message-ID: <20060221054713.89890.qmail@web50313.mail.yahoo.com>
Date: Mon, 20 Feb 2006 21:47:13 -0800 (PST)
From: Otis Gospodnetic
Subject: Re: TermVector usage
To: 
java-dev@lucene.apache.org

Hi Marvin,

As far as I can tell, most people use TermVectors for "more like this" queries (see the MoreLikeThis class somewhere in contrib/).

Otis

----- Original Message ----
From: Marvin Humphrey
To: java-dev@lucene.apache.org
Sent: Mon 20 Feb 2006 10:28:56 PM EST
Subject: TermVector usage

Greets,

KinoSearch 0.05, which for now I'm calling a "loose port" of Lucene, was published to CPAN a few weeks ago. It's nice and fast, but missing some features, most notably multiple segment support and incremental indexing. Before I get to that though, I'm adding excerpting and highlighting.

The version of KinoSearch which preceded the Lucene-based rewrite also had a highlighter which depended on what were effectively TermVectors with stored offsets. However, unlike Lucene, these were stored along with the stored fields. As I've been preparing to port all the support apparatus for TermVectors, I've been wondering whether I shouldn't go back to that. It sure would be less work to code up. Theoretically there ought to be less disk activity, too.

From following the Lucene lists off and on, I've gotten the impression that lots of people use TermVectors to feed the highlighter, but I haven't seen many applications for them besides that. LSI-type ideas percolate every once in a while.

Besides highlighting, how many people are using TermVectors and how are they using them?

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org
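
For readers unfamiliar with the idea discussed in this thread, here is a minimal sketch of how a per-document term vector with stored character offsets can feed a highlighter. It is plain Java and does not use Lucene's or KinoSearch's actual classes or file formats; the names TermVectorSketch, Offset, buildTermVector and highlight are hypothetical, and the tokenizer (lowercased runs of word characters) is an assumption made only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TermVectorSketch {
    // Hypothetical record of one term occurrence: character offsets into the stored text.
    static final class Offset {
        final int start;
        final int end;
        Offset(int start, int end) { this.start = start; this.end = end; }
    }

    // Builds a map from term to its occurrences; a term's frequency is the list size.
    // Assumption: terms are lowercased runs of word characters.
    static Map<String, List<Offset>> buildTermVector(String text) {
        Map<String, List<Offset>> vector = new LinkedHashMap<>();
        Matcher m = Pattern.compile("\\w+").matcher(text);
        while (m.find()) {
            String term = m.group().toLowerCase(Locale.ROOT);
            vector.computeIfAbsent(term, k -> new ArrayList<>())
                  .add(new Offset(m.start(), m.end()));
        }
        return vector;
    }

    // Wraps every occurrence of queryTerm in brackets, using only the stored offsets,
    // so the text does not have to be re-tokenized at search time.
    static String highlight(String text, Map<String, List<Offset>> vector, String queryTerm) {
        List<Offset> hits = vector.get(queryTerm.toLowerCase(Locale.ROOT));
        if (hits == null) {
            return text;
        }
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (Offset o : hits) { // offsets were recorded in ascending order
            out.append(text, pos, o.start)
               .append('[').append(text, o.start, o.end).append(']');
            pos = o.end;
        }
        return out.append(text.substring(pos)).toString();
    }

    public static void main(String[] args) {
        String text = "Term vectors store each term and its offsets";
        Map<String, List<Offset>> vector = buildTermVector(text);
        System.out.println(highlight(text, vector, "term"));
    }
}
```

The same term-to-frequency information is roughly what a "more like this" style query would draw on to pick a document's characteristic terms; the sketch covers only the highlighting side.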