From: Karl Wettin
To: java-user@lucene.apache.org
Subject: Re: Suggestive Search
Date: Wed, 8 Apr 2009 19:49:45 +0200

For this you probably want to use n-grams. Whether or not this fits in
your current index is hard to say. My guess is that you want to create a
new index with one document per unique phrase. You might also want to try
loading this index into an InstantiatedIndex; that could speed things up
quite a bit if the corpus is not too large.

If your suggestion text corpus is really large and you only want
forward-only suggestions, then you might want to consider a trie-pattern
solution instead. These can be rather resource efficient, even when
loaded into memory.

If you have a lot of user load on your search engine, then it might be
interesting to use old user queries as the base of your suggestions and
perhaps boost a bit on trends, i.e. the more people search for something,
the more it gets boosted in the suggestions list.


          karl

On 8 Apr 2009, at 15:26, Matt Schraeder wrote:

> I want to add a suggestive search similar to Google's to autocomplete
> search phrases as the user types.
> It doesn't have to be very elaborate
> and for the most part will just involve searching single fields. How
> can I perform a search to be able to fill in autocomplete text?
>
> For instance, if I start typing "Harr" it should bring up "Harry
> Potter", "Harry Houdini" and "Harry S. Truman".
>
> I have tried doing search queries for "Harr*" but it's still doing
> term-based searching rather than searching a full field. To make a
> field both searchable as the full field as well as tokenized, would I
> have to duplicate the field and make one a keyword field? Is there a
> more convenient way to do this? I have also considered making a second
> index for suggestive search, which would only have the fields that I
> want to enable suggestive search on, but this seems like it would be
> unnecessary duplication of data as well, though it would probably make
> suggestive search faster due to a smaller index.
>
> Ideally it would also be nice to be able to rank these terms based on
> the number of times they have been searched for so that the results are
> tailored more to our users rather than simply just the score that
> Lucene chooses.
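
As a rough illustration of the trie-based, forward-only approach suggested in the reply, combined with ranking by how often a phrase has been searched, here is a minimal in-memory sketch in plain Java. It does not use any Lucene API; the class name PrefixSuggester and its methods record and suggest are made up for this example, and a real deployment would likely need persistence, size limits and better normalization than simple lower-casing.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory prefix suggester (illustration only, not part of Lucene). */
class PrefixSuggester {

    /** One trie node: children keyed by character, plus data where a phrase ends. */
    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        String phrase; // phrase as last recorded; non-null only where a phrase ends
        int hits;      // number of times the phrase was recorded
    }

    private final Node root = new Node();

    /** Records one occurrence of a phrase, e.g. one past user query. */
    void record(String phrase) {
        Node node = root;
        for (char c : phrase.toLowerCase(Locale.ROOT).toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.phrase = phrase;
        node.hits++;
    }

    /** Returns up to max phrases starting with prefix, most frequently recorded first. */
    List<String> suggest(String prefix, int max) {
        Node node = root;
        for (char c : prefix.toLowerCase(Locale.ROOT).toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return new ArrayList<>(); // no recorded phrase has this prefix
            }
        }
        List<Node> matches = new ArrayList<>();
        collect(node, matches);
        // List.sort is stable, so ties keep the alphabetical order of the TreeMap walk.
        matches.sort(Comparator.comparingInt((Node n) -> n.hits).reversed());
        List<String> result = new ArrayList<>();
        for (Node n : matches) {
            if (result.size() == max) {
                break;
            }
            result.add(n.phrase);
        }
        return result;
    }

    /** Depth-first walk collecting every node at which a phrase ends. */
    private void collect(Node node, List<Node> out) {
        if (node.phrase != null) {
            out.add(node);
        }
        for (Node child : node.children.values()) {
            collect(child, out);
        }
    }
}
```

With "Harry Potter" recorded three times and "Harry Houdini" and "Harry S. Truman" once each, suggest("Harr", 3) lists "Harry Potter" first, followed by the other two in alphabetical order. The sketch visits the whole subtree under the prefix on every call, which is fine for small phrase sets; for a large corpus one would probably cache the top suggestions per node instead.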