lucene-java-user mailing list archives

From <oh...@cox.net>
Subject Re: New to Lucene - some questions about demo
Date Tue, 28 Jul 2009 17:04:12 GMT
Matthew,

I'll keep your comments in mind, but I'm still confused about something.

I currently haven't changed much in the demo, other than adding that doc.add for "summary".

With JUST that doc.add, having done my reading, I kind of expected NOT to be able to search
on the "summary" at all, but it kind of seems like SOMETIMES, I am still getting responses
when I search on something in "summary".  

Does that mean that Lucene will automatically do multi-field searching?

Maybe I've been up too long, but it seems like, for example, when I search on "summary:foofoo"
I am not getting a response, but, for example, if I search on:

summary:foofoo AND contents:test1

I get results in the search response.

Since I haven't yet added the MultiField query, shouldn't it ONLY be searching on the "contents"
field (because the "summary:foofoo" should have been false, and because I am using an AND)?

Like I said, maybe I've been staring at this too long, and need to do some more structured
testing :)...

Sorry.

Later,
Jim
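
On the question above of whether Lucene searches several fields automatically: the stock QueryParser sends an unqualified term to the single default field it was constructed with (apparently "contents" in the demo), and AND makes every clause required. The following is a toy model of those two rules in plain Java, not Lucene code. The class FieldQuerySketch, its methods, and the sample document are invented for illustration, and no analysis (lowercasing) is modelled.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class FieldQuerySketch {
    // Split "field:term" into {field, term}; a bare term goes to the default field.
    static String[] resolve(String clause, String defaultField) {
        int colon = clause.indexOf(':');
        if (colon < 0) {
            return new String[] {defaultField, clause};
        }
        return new String[] {clause.substring(0, colon), clause.substring(colon + 1)};
    }

    // AND semantics: the document matches only if every clause matches.
    // doc maps a field name to the set of terms indexed for that field.
    static boolean matchesAll(Map<String, Set<String>> doc, String defaultField, String... clauses) {
        for (String clause : clauses) {
            String[] fieldAndTerm = resolve(clause, defaultField);
            Set<String> terms = doc.get(fieldAndTerm[0]);
            if (terms == null || !terms.contains(fieldAndTerm[1])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> doc = new HashMap<>();
        doc.put("contents", new HashSet<>(Arrays.asList("test1", "hello")));
        // Indexed verbatim, as with Field.Index.NOT_ANALYZED.
        doc.put("summary", new HashSet<>(Collections.singletonList("FooFoo")));

        // Bare term: only the default field is consulted.
        System.out.println(matchesAll(doc, "contents", "test1"));
        System.out.println(matchesAll(doc, "contents", "foofoo"));
        // Both clauses are required, and summary:foofoo does not match "FooFoo".
        System.out.println(matchesAll(doc, "contents", "summary:foofoo", "contents:test1"));
    }
}
```

Under this model, summary:foofoo AND contents:test1 returns nothing while the summary is indexed verbatim as FooFoo, so hits for that query are worth re-testing against a freshly built index.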




---- Matthew Hall <mhall@informatics.jax.org> wrote: 
> You can choose to do either,
> 
> Having items in multiple fields allows you to apply field specific 
> boosts, thus making matches to certain fields more important than others.
> 
> But, if that's not something that you care about the second technique is 
> useful in that it vastly simplifies your index structure (And thusly 
> your query structure)
> 
> So, it depends on what you want to be able to do in the end.  Do you 
> envision doing something like being able to search by the summary and 
> the contents at the same time, but weighing hits to the summary as a 
> higher priority?
> If so, use multiple fields.  If not, keep this first iteration in lucene 
> simple, and compress everything down.  Also please note that the + " " + 
> in the example cited is important.  That space will ensure that your 
> contents and summary fields will be tokenized properly (just in case 
> they are single words, let's say).
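
The point about the + " " + separator can be seen with a toy tokenizer. This is a plain-Java sketch, not Lucene's analyzer: the class ConcatTokensSketch and its tokens method are hypothetical stand-ins that only split on whitespace and lowercase.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class ConcatTokensSketch {
    // Stand-in for an analyzer: lowercase, then split on runs of whitespace.
    static List<String> tokens(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.toLowerCase(Locale.ROOT).split("\\s+"));
    }

    public static void main(String[] args) {
        String contents = "report";
        String summary = "draft";

        // Without the separator the two words fuse into one token.
        System.out.println(tokens(contents + summary));       // [reportdraft]
        // With the separator each word stays a searchable token.
        System.out.println(tokens(contents + " " + summary)); // [report, draft]
    }
}
```

With the fused token, a search for either "report" or "draft" alone would find nothing in this model.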
> 
> Matt
> 
> 
> 
> ohaya@cox.net wrote:
> > Hi Matthew and Ian,
> >
> > Thanks, I'll try that, but, in the meantime, I've been doing some reading (Lucene
> > in Action), and on pg. 159, section 5.3, it discusses "Querying on multiple fields".
> >
> > I was just about to try what's described in that section, i.e., using MultiFieldQueryParser.parse(),
> > or, as another note on pg. 161 mentions, doing something like:
> >
> > doc.add(Field.UnStored("contents", contents + " " + summary));
> >
> > So, I guess I'm a little confused (happens a lot :)!): In the situation I'm talking
> > about (starting with the Lucene demo and demo webapp, and trying to be able to index and search
> > more than just the "contents" field), do I not need to use the MultiFieldQueryParser.parse()
> > or do what they call "create a synthetic content"?
> >
> > Thanks,
> > Jim
> >
> >
> > ---- Matthew Hall <mhall@informatics.jax.org> wrote: 
> >   
> >> Yeah, Ian has it nailed on the head here.
> >>
> >> Can't believe I missed it in the initial writeup.
> >>
> >> Matt
> >>
> >> Ian Lea wrote:
> >>     
> >>> Jim
> >>>
> >>>
> >>> Glancing at SearchFiles.java I can see
> >>>
> >>> Analyzer analyzer = new StandardAnalyzer();
> >>> ...
> >>> QueryParser parser = new QueryParser(field, analyzer);
> >>> ...
> >>> Query query = parser.parse(line);
> >>>
> >>> so any query term you enter will be run through StandardAnalyzer which
> >>> will, amongst other things, convert it to lowercase and will not match
> >>> the indexed value of FooFoo.  If you're just playing, it would
> >>> probably be easiest to tell lucene to analyze the summary field e.g.
> >>>
> >>> doc.add(new Field("summary", "FooFoo", Field.Store.YES, Field.Index.ANALYZED));
> >>>
> >>> That will cause FooFoo to be indexed as foofoo and thus should be
> >>> matched on search.
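
The mismatch described above can be modelled in a few lines. This is a plain-Java sketch under the assumption that analysis amounts to lowercasing a single word. The class AnalysisMismatchSketch and its methods are hypothetical and only mimic the effect of Field.Index.ANALYZED versus Field.Index.NOT_ANALYZED combined with a query parser that lowercases its input.

```java
import java.util.Locale;

class AnalysisMismatchSketch {
    // What ends up in the index: lowercased if analyzed, verbatim otherwise.
    static String indexTerm(String value, boolean analyzed) {
        return analyzed ? value.toLowerCase(Locale.ROOT) : value;
    }

    // What the search looks for: the query text always passes through the analyzer.
    static String queryTerm(String userInput) {
        return userInput.toLowerCase(Locale.ROOT);
    }

    // A hit requires the indexed term and the query term to be identical.
    static boolean matches(String value, boolean analyzed, String userInput) {
        return indexTerm(value, analyzed).equals(queryTerm(userInput));
    }

    public static void main(String[] args) {
        // Not analyzed: "FooFoo" is indexed verbatim, every query is lowercased.
        System.out.println(matches("FooFoo", false, "foofoo")); // false
        System.out.println(matches("FooFoo", false, "FooFoo")); // false
        // Analyzed: both sides are lowercased, so either spelling hits.
        System.out.println(matches("FooFoo", true, "foofoo"));  // true
        System.out.println(matches("FooFoo", true, "FooFoo"));  // true
    }
}
```

In this model neither "summary:foofoo" nor "summary:FooFoo" can hit the non-analyzed field, which is consistent with all four attempts reported earlier in the thread returning nothing.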
> >>>
> >>>
> >>> --
> >>> Ian.
> >>>
> >>>
> >>> On Tue, Jul 28, 2009 at 2:22 PM, <ohaya@cox.net> wrote:
> >>>   
> >>>       
> >>>> Ian and Matthew,
> >>>>
> >>>> I've tried "foofoo", "summary:foofoo", "FooFoo", and "summary:FooFoo".
> >>>> No results returned for any of those :(.
> >>>>
> >>>> Also, Matthew, I bounced Tomcat after running IndexFiles, so I don't
> >>>> think that's the problem either :(...
> >>>>
> >>>> I looked at the SearchFiles.java code, and it looks like it's literally
> >>>> using whatever query string I'm entering (ditto for luceneweb).  Is there something with the
> >>>> query itself that needs to be modified to support searching on the fields other than the "contents"
> >>>> field (recall, I'm pretty sure that all those other fields are in the index, via Luke)?
> >>>>
> >>>> Jim
> >>>>
> >>>>
> >>>>
> >>>> ---- Ian Lea <ian.lea@gmail.com> wrote:
> >>>>     
> >>>>         
> >>>>> Hi
> >>>>>
> >>>>>
> >>>>> Field.Index.NOT_ANALYZED means it will be stored as is i.e. "FooFoo"
> >>>>> in your example, and if you search for "foofoo" it won't match. A
> >>>>> search for "FooFoo" would, assuming that your search terms are not
> >>>>> being lowercased.
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Ian.
> >>>>>
> >>>>>
> >>>>> On Tue, Jul 28, 2009 at 1:56 PM, Ohaya<ohaya@cox.net> wrote:
> >>>>>       
> >>>>>           
> >>>>>> Hi,
> >>>>>>
> >>>>>> I'm just starting to work with Lucene, and I guess that I learn best by
> >>>>>> working with code, so I've started with the demos in the Lucene
> >>>>>> distribution.
> >>>>>>
> >>>>>> I got the IndexFiles.java and IndexHTML.java working, and also the
> >>>>>> luceneweb.war is deployed to Tomcat.
> >>>>>>
> >>>>>> I used IndexFiles.java to index some text files, and then used both the
> >>>>>> SearchFiles.java and the luceneweb web app to do some testing.
> >>>>>>
> >>>>>> One of the things that I noticed with the luceneweb web app is that when I
> >>>>>> searched, the search results returned "Summary" of "null", so I added:
> >>>>>>
> >>>>>> doc.add(new Field("summary", "FooFoo", Field.Store.YES,
> >>>>>> Field.Index.NOT_ANALYZED));
> >>>>>>
> >>>>>> to the IndexFiles.java, and ran it again.
> >>>>>>
> >>>>>> I had expected that I'd then be able to do a search for something like
> >>>>>> "summary:foofoo", but when I did that, I got no results.
> >>>>>>
> >>>>>> I also tried SearchFiles.java, and again got no results.
> >>>>>>
> >>>>>> I tried using Luke, and that is showing that the "summary" field is in the
> >>>>>> indexes, so I'm wondering why I am not able to search on other fields such
> >>>>>> as "summary", "path", etc.?
> >>>>>>
> >>>>>> Can anyone explain what else I need to do, esp. in the luceneweb web app, to
> >>>>>> be able to search these other fields?
> >>>>>>
> >>>>>> Thanks!
> >>>>>>
> >>>>>> Jim
> >>>>>>
> >>>>>>
> >>>>>> ---------------------------------------------------------------------
> >>>>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >>>>>> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>>>>>
> >>>>>>
> >>>>>>         
> >>>>>>             
> >>>>>
> >>>>>       
> >>>>>           
> >>>>     
> >>>>         
> >>>
> >>>   
> >>>       
> >> -- 
> >> Matthew Hall
> >> Software Engineer
> >> Mouse Genome Informatics
> >> mhall@informatics.jax.org
> >> (207) 288-6012
> >>
> >>
> >>     
> >
> >
> >
> >   
> 
> 
> -- 
> Matthew Hall
> Software Engineer
> Mouse Genome Informatics
> mhall@informatics.jax.org
> (207) 288-6012
> 
> 
> 
> 



