lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Morus Walter <morus.wal...@gmx.de>
Subject RE: multivalue fields
Date Sat, 15 May 2004 07:48:17 GMT
Ryan Sonnek writes:
> using lucene 1.3-final, it appears to only search the first field with that name.  here's
the code i'm using to construct the index, and I'm using Luke to check that the index is created
correctly.  Everything looks fine, but my search returns empty.  do i have to use a special
query to work with multivalue fields?  is there a testcase in the source that performs this
kind of work that I could look at?

Don't know what goes wrong on your side, but this works just fine.
Maybe your fields are too long so that only part of it gets indexed
(look at IndexWriter.maxFieldLength).

A test program
      
import org.apache.lucene.document.*;
import org.apache.lucene.analysis.*;
import org.apache.lucene.index.*;
import org.apache.lucene.store.*;
import org.apache.lucene.search.*;
import org.apache.lucene.queryParser.QueryParser;

class LuceneTest 
{
    static String[] docs = {
	"a c", 
	"b d",
 
	"c e", 
	"d f", 

    };

    static String[] queries = {
	"a",
	"b",
	"c",
	"d",
 	"b OR c"
    };

    public static void main(String argv[]) throws Exception {
	Directory dir = new RAMDirectory();
	String[] stop = {};
	Analyzer analyzer = new StandardAnalyzer(stop);
	IndexWriter writer = new IndexWriter(dir, analyzer, true);

	// index documents (2 fields text each)
	for ( int i=0; i < docs.length; i+=2 ) {
	    Document doc = new Document();
	    doc.add(Field.Text("text", docs[i]));
	    doc.add(Field.Text("text", docs[i+1]));
	    writer.addDocument(doc);
	}
	writer.close();

	Searcher searcher = new IndexSearcher(dir);
	for ( int i=0; i < queries.length; i++ ) {
	    Query query = QueryParser.parse(queries[i], "text", analyzer);
	    Hits hits = searcher.search(query);
	    System.out.println("Query: " + query.toString("text"));
	    System.out.println("      " + hits.length() + " documents found");
	    for ( int j=0; j < hits.length(); j++ ) {
		Document doc = hits.doc(j);
		System.out.println("\t"+hits.id(j) + ": " + doc.get("text") + "\t    " + hits.score(j));
		//System.out.println("  "+ searcher.explain(query, hits.id(j)));
	    }
	}
    }
}

shows that search takes place in both fields.
Query: a
      1 documents found
        0: b d      0.5
Query: b
      1 documents found
        0: b d      0.5
Query: c
      2 documents found
        0: b d      0.2972674
        1: d f      0.2972674
Query: d
      2 documents found
        0: b d      0.2972674
        1: d f      0.2972674
Query: b c
      2 documents found
        0: b d      0.581694
        1: d f      0.0759574
	

But note, that this affects scoring as concatenation would.
So I think Otis answer is a bit missleading.
If you don't want the effects on scoring you AFAIK need to use different 
documents or fields.

Morus

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message