lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Otis Gospodnetic <otis_gospodne...@yahoo.com>
Subject Re: Too few search results
Date Wed, 05 Feb 2003 16:19:41 GMT
This is really a FAQ (prefix queries and case sensitivity).
StandardAnalyzer converts all tokens to lower case before indexing, no?
You are using upper case when using prefix queries.
See?

Regarding your indexing snippet - Exception could be SQLException, so
couldn't your 'conn' be null, therefore causing NPE in the finally
block?
And what happens with your writer.close() if you get an Exception?  You
could put writer.close() in finally block, too, maybe, checking for
null, etc.

Otis


--- Marcel_Stör <marcel@frightanic.com> wrote:
> Hi all,
> 
> My Lucene IndexSearcher returns too few hits when I use some extended
> query syntaxt. I'll give examples of my query/hits pairs at the
> bottom.
> I'm indexing a database table:
> 
> <CODE>
> //creating a full index
> DBConnection conn = null;
> try {
>     Analyzer analyzer = new StandardAnalyzer();
>     IndexWriter writer = new IndexWriter(indexFile, analyzer, true);
>     conn =
> DBConnectionFactory.getDBConnetion(DBConnectionFactory.ORACLE);
>     String query = "select id, category, problem, solution from
> kedb.solution";
>     conn.open(cred);
>     ResultSet rs = conn.execute(query);
>     while (rs.next()) {
>         Document doc = new Document();
>         doc.add(Field.Text("id", rs.getString(1)));
>         doc.add(Field.Text("category", rs.getString(2)));
>         doc.add(Field.Text("problem", rs.getString(3)));
>         doc.add(Field.UnStored("solution", rs.getString(4)));
>         writer.addDocument(doc);
>     }
>     writer.close();
>     conn.close();
> } catch (Exception ex) {
>     ex.printStackTrace();
> } finally {
>     conn.close();
> }
> 
> //searching the index
> Analyzer analyzer = new StandardAnalyzer();
> Searcher searcher = new IndexSearcher(IndexReader.open(indexFile));
> Query q = QueryParser.parse(query, "problem", analyzer);
> Hits  hits = searcher.search(q);
> for (int i = 0; i < hits.length(); i++) {
>     if (0 == cat.compareTo("") || 0 ==
> hits.doc(i).get("category").compareTo(cat)) {
>         ids.add(hits.doc(i).get("id"));
>         cats.add(hits.doc(i).get("category"));
>         probs.add(hits.doc(i).get("problem"));
>     }
> }
> </CODE>
> 
> This is sample data from the table. Right, it's German and not
> English!
> I tried GermanAnalyzer instead of StandardAnalyzer but that made the
> search results even worse.
> 
> ID;CATEGORY;PROBLEM;SOLUTION;
> 1;2;Irgendetwas mit dem Novell Server stimmt nicht;Server neu booten;
> 2;15;User kann sich nicht mehr am Novell anmelden. Kabel ist
> eingesteckt. Neubooten nützt nichts. Baum wird nicht gefunden.;Baum
> manuell suchen über Button auf Login-Maske.;
> 3;9;Makros in Word funktionieren nicht;Optionen geändert (Medium
> Level).
> Extras - Optionen - Makros;
> 4;5;Scanner funktioniert nicht. Gerät erscheint nicht in der Liste
> der
> USB Geräte.;Neusten Treiber installieren.
> VORSICHT: NT 4.0 unterstützt kein USB!!!;
> 5;16;Maus spinnt;Reinigen/Auswechseln;
> 6;16;Eheprobleme;Adresse einer Beratungsstelle vermitteln;
> 7;11;Browser bringt imme eine Sexseite als Startseite;Extras -
> Optionen
> - Startseite festlegen;
> 
> 
>   Query             /hits       /expected
> 1 Novell            /2          /2
> 2 Nove*             /0          /2
> 3 Novel~            /0          /2
> 4 N?vell            /0          /2
> 5 +Scanner +Baum    /0          /0
> 6 Scanner -USB      /0          /0
> 7 usb AND liste     /1          /1
> 8 Mikros~           /1          /1
> 9 Ehe*              /1          /0
> 10 "Word Optionen"~10/1          /0
> 11 "Browser Sexseite"~10/1       /1
> 
> 
> Boolean operators never seem to be a problem (ok, it's the easiest to
> implement ;-)). Fuzzy searches, wildcard searches, plus proximity
> searches just don't return enough hits. Very surprising are number 10
> &
> 11. At 10 the proximity fails but at 11 everything is fine!
> 
> If you have any tips as for how to improve the search process please
> let
> me know.
> 
> Regards,
> Marcel
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 


__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message