lucene-java-user mailing list archives

From Meeraj Kunnumpurath <meeraj.kunnumpur...@asyska.com>
Subject Re: Search Ranking
Date Wed, 16 May 2012 20:50:38 GMT
The actual query is

Query q = new QueryParser(Version.LUCENE_35, "searchText",
analyzer).parse("Takeaway fred@company.com");

If I use

Query q = new QueryParser(Version.LUCENE_35, "searchText",
analyzer).parse("fred@company.com");

I get them in the reverse order.

Regards
Meeraj
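
The two orderings reported above can be illustrated with a minimal sketch of two factors from Lucene 3.x's DefaultSimilarity: tf = sqrt(freq) and lengthNorm = 1/sqrt(number of terms in the field). Everything else here is an assumption made for illustration: the class name ScoreSketch is hypothetical, tokenization is plain whitespace splitting (not what StandardAnalyzer does), and idf, coord, queryNorm and the lossy one-byte norm encoding are ignored. It is not expected to reproduce Lucene's exact scores, or its ordering for the two-term query; it only shows how a repeated term and a longer field pull in opposite directions.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ScoreSketch {

    // Whitespace tokenization is an assumption for this sketch only.
    static List<String> tokens(String text) {
        return Arrays.asList(text.trim().split("\\s+"));
    }

    // Simplified score: sum over query terms of sqrt(tf) * 1/sqrt(fieldLength).
    // Omits idf, coord, queryNorm and norm byte-encoding used by real Lucene.
    static double score(String query, String doc) {
        List<String> docTokens = tokens(doc);
        double lengthNorm = 1.0 / Math.sqrt(docTokens.size());
        double sum = 0.0;
        for (String term : tokens(query)) {
            int freq = Collections.frequency(docTokens, term);
            sum += Math.sqrt(freq) * lengthNorm;
        }
        return sum;
    }

    public static void main(String[] args) {
        String doc1 = "ABC Takeaway fred@company.com fred@company.com";
        String doc2 = "XYZ Takeaway fred@company.com";
        for (String q : new String[] {"Takeaway", "fred@company.com"}) {
            System.out.printf("%-18s doc1=%.3f doc2=%.3f%n",
                    q, score(q, doc1), score(q, doc2));
        }
    }
}
```

Under these assumptions the shorter document scores higher for "Takeaway" alone (same tf, larger length norm), while the document that repeats the address scores higher for "fred@company.com" alone. For the real scores, Searcher.explain(Query, int), as suggested further down the thread, shows the breakdown Lucene actually computes.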

On Wed, May 16, 2012 at 9:48 PM, Meeraj Kunnumpurath <
meeraj.kunnumpurath@asyska.com> wrote:

> I have tried the same using Lucene directly with the following code,
>
> import org.apache.lucene.store.RAMDirectory;
> import org.apache.lucene.document.Document;
> import org.apache.lucene.document.Field;
> import org.apache.lucene.index.IndexWriterConfig;
> import org.apache.lucene.util.Version;
> import org.apache.lucene.analysis.standard.StandardAnalyzer;
> import org.apache.lucene.index.IndexWriter;
> import org.apache.lucene.queryParser.QueryParser;
> import org.apache.lucene.index.IndexReader;
> import org.apache.lucene.search.IndexSearcher;
> import org.apache.lucene.search.Query;
> import org.apache.lucene.search.TopScoreDocCollector;
> import org.apache.lucene.search.ScoreDoc;
>
> public class LuceneTest {
>
>     public static void main(String[] args) throws Exception {
>
>         StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_35);
>         RAMDirectory index = new RAMDirectory();
>         IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_35,
>                 analyzer);
>         IndexWriter indexWriter = new IndexWriter(index, config);
>
>         Document doc1 = new Document();
>         doc1.add(new Field("searchText",
>                 "ABC Takeaway fred@company.com fred@company.com",
>                 Field.Store.YES, Field.Index.ANALYZED));
>         Document doc2 = new Document();
>         doc2.add(new Field("searchText",
>                 "XYZ Takeaway fred@company.com",
>                 Field.Store.YES, Field.Index.ANALYZED));
>
>         indexWriter.addDocument(doc1);
>         indexWriter.addDocument(doc2);
>         indexWriter.close();
>
>         Query q = new QueryParser(Version.LUCENE_35, "searchText", analyzer)
>                 .parse("Takeaway");
>
>         int hitsPerPage = 10;
>         IndexReader reader = IndexReader.open(index);
>         IndexSearcher searcher = new IndexSearcher(reader);
>         TopScoreDocCollector collector =
>                 TopScoreDocCollector.create(hitsPerPage, true);
>         searcher.search(q, collector);
>         ScoreDoc[] hits = collector.topDocs().scoreDocs;
>
>         System.out.println("Found " + hits.length + " hits.");
>         for(int i=0;i<hits.length;++i) {
>             int docId = hits[i].doc;
>             Document d = searcher.doc(docId);
>             System.out.println((i + 1) + ". " + d.get("searchText"));
>         }
>
>     }
>
> }
>
> The output is:
>
> Found 2 hits.
> 1. XYZ Takeaway fred@company.com
> 2. ABC Takeaway fred@company.com fred@company.com
>
>
> On Wed, May 16, 2012 at 9:21 PM, Meeraj Kunnumpurath <
> meeraj.kunnumpurath@asyska.com> wrote:
>
>> Thanks Ivan.
>>
>> I don't use Lucene directly; it is used behind the scenes by the Neo4J
>> graph database for full-text indexing. According to their documentation for
>> full-text indexes, they use a whitespace tokenizer in the analyser. Yes, I do
>> get Listing 2 first now. However, if I exclude the term "Takeaway" from the
>> search string and just put "fred@company.com", I get Listing 1 first.
>>
>> Regards
>> Meeraj
>>
>>
>> On Wed, May 16, 2012 at 8:49 PM, Ivan Brusic <ivan@brusic.com> wrote:
>>
>>> Use the explain function to understand why the query is producing the
>>> results you see.
>>>
>>>
>>> http://lucene.apache.org/core/3_6_0/api/core/org/apache/lucene/search/Searcher.html#explain(org.apache.lucene.search.Query, int)
>>>
>>> Does your current query return Listing 2 first? That might be because
>>> of term frequencies. Which analyzers are you using?
>>>
>>> http://www.lucidimagination.com/content/scaling-lucene-and-solr#d0e63
>>>
>>> Cheers,
>>>
>>> Ivan
>>>
>>> On Wed, May 16, 2012 at 12:41 PM, Meeraj Kunnumpurath
>>> <meeraj.kunnumpurath@asyska.com> wrote:
>>> > Hi,
>>> >
>>> > I am quite new to Lucene. I am trying to use it to index listings of local
>>> > businesses. The index has only one field, which stores the attributes of a
>>> > listing as well as the email addresses of users who have rated that business.
>>> > For example,
>>> >
>>> > Listing 1: "XYZ Takeaway London fred@company.com barney@company.com
>>> > fred@company.com"
>>> > Listing 2: "ABC Takeaway London fred@company.com barney@company.com"
>>> >
>>> > Now when the user does a search with "Takeaway fred@company.com", how do I
>>> > get listing 1 to always come before listing 2, because it has the term
>>> > fred@company.com twice whereas listing 2 has it only once?
>>> >
>>> > Regards
>>> > Meeraj
>>>
