lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: regarding FieldSelector
Date Thu, 13 Sep 2007 18:45:07 GMT
I'm not entirely sure. So what I'd do if I were you is write a
little test program and step through it in the debugger and
see <G>....

But, if you're only allowing the user to fetch a single document
at a time, I don't think it matters enough to worry about. If, on the
other hand, you're allowing the user to display some combination
of, say, 5 fields for a *list* of documents, I'd make them all lazy
and then you can write a HitCollector to get the list "lazily".

Best
Erick

On 9/13/07, Mohammad Norouzi <mnrz57@gmail.com> wrote:
>
> well, actually, I have 5 index directory and it will increase in future.
> and
> the thing is that each document about 20 fields on average. considering
> many
> users may connect to the system (we anticipate 500 users at this time) I
> want to know whether this will make performance issue or not.
>
> we provided a feature to select which fields they want to be displayed so
> I
> know that only 5 or 6 fields are important to my users. I don't know the
> way
> I stated in my last email, I mean searcher.doc(doc_id).get("field_name"),
> make the Lucene to load all fields of the document or only the given name?
> if yes, I mean if all the fields are loaded I think it's better to make
> them
> lazy.
>
> what do you suggest?
>
> thanks
>
>
> On 9/13/07, Erick Erickson <erickerickson@gmail.com> wrote:
> >
> > Do you have any evidence that you're having a performance issue? If
> > not, I'd just do the simple thing and ignore the rest. The performance
> > issues I found were because I was spinning through many, many
> > documents. If you're only worrying about one document at a time,
> > it may not be an issue.
> >
> > If you *are* having performance issues, I'd *strongly* recommend
> > that you measure to find out where the problem is before trying
> > a solution. Otherwise you'll optimize code that isn't the problem.
> >
> > Best
> > Erick
> >
> > On 9/13/07, Mohammad Norouzi <mnrz57@gmail.com> wrote:
> > >
> > > Thanks
> > > as I saw the documents, we can only use this great field selector in
> > > IndexReader.document() method the problem is I have a Searcher in my
> > > result
> > > set structure and when the client calls getString("a_field_name") at
> > that
> > > time I invoke the searcher.doc(current_doc_id).get("a_field_name),
> > > I already collected the result IDs. so in my case, I can't use
> > > FieldSelector.
> > >
> > > Do I have to revise the way of retrieving documents in my code?
> > >
> > >
> > >
> > > On 9/12/07, Erick Erickson <erickerickson@gmail.com> wrote:
> > > >
> > > > Well, it depends on what "improve the search process" means
> > > > in your context <G>..
> > > >
> > > > But I had a case similar to yours that I wrote up in the Wiki where
> > > > my search times improved about 10X by using lazy loading. You
> > > > might want to read that entry here...
> > > >
> > > > http://wiki.apache.org/lucene-java/FieldSelectorPerformance
> > > >
> > > > Note the peculiar characteristics of my data set, I really suspect
> > > > that a 10x improvement in retrieval speed is atypical...
> > > >
> > > > As for when lazily-loaded fields actually get loaded, I didn't
> really
> > > > have to explore it very fully, but a short experiment should do it
> > > > for you.....
> > > >
> > > > Best
> > > > Erick
> > > >
> > > > On 9/12/07, Mohammad Norouzi <mnrz57@gmail.com> wrote:
> > > > >
> > > > > Hi Grant,
> > > > > Really thanks for your nice document about advanced Lucene. it was
> > > very
> > > > > useful for me.
> > > > >
> > > > > as I understand, we can set some large fields to be lazily
> loading,
> > > now
> > > > my
> > > > > question is when it will be loaded? it make sense when we call
> > > > > doc.get("field_name")
> > > > > it will load from the index, Am I right?
> > > > >
> > > > > in my application, I've provided a result set structure to
> navigate
> > > > > between
> > > > > results and documents and provide a get(String fieldname) method
> > just
> > > > like
> > > > > java.sql.ResultSet.getString() method, and also this result set
> > > > implements
> > > > > HitCollector in order to collect my own ID rather than Lucene's
> > > document
> > > > > id,
> > > > > so I think I can set my field ID to be loaded always and the other
> > > > fields
> > > > > to
> > > > > be lazily loading, Does this improve the search process?
> > > > >
> > > > > again, thank you very much indeed.
> > > > >
> > > > >
> > > > > On 9/12/07, Grant Ingersoll <gsingers@apache.org> wrote:
> > > > > >
> > > > > > Hi Mohammad,
> > > > > >
> > > > > > The typical use cases are:
> > > > > > 1. You have several small fields used in a results display and
> one
> > > or
> > > > > > two large fields (i.e. the original document) and you don't
want
> > to
> > > > > > pay the cost of loading the large fields for results display
> > because
> > > > > > most of them won't be chosen.  When a result is chosen, the
> lazily
> > > > > > loaded field will be retrieved.
> > > > > >
> > > > > > 2. You only want to load certain fields, or the first field,
or
> > you
> > > > > > just want to know the size of a field.
> > > > > >
> > > > > > Basically, it gives you control over how fields are loaded from
> > disk
> > > > > > in Lucene.
> > > > > >
> > > > > > See my ApacheCon Europe presentation
> > http://cnlp.org/presentations/
> > > > > > slides/AdvancedLuceneEU.pdf for a few slides (towards the end)
> on
> > > > > > FieldSelector.
> > > > > >
> > > > > > On Sep 12, 2007, at 5:13 AM, Mohammad Norouzi wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > Can anyone explain what is the FieldSelector and the usage
or
> > > > > > > benefits of
> > > > > > > this structure? I read the javadocs but I can't get for
what
> > goal
> > > > > > > it is
> > > > > > > provided in Lucene.
> > > > > > >
> > > > > > > Thanks in advance
> > > > > > >
> > > > > > > --
> > > > > > > Regards,
> > > > > > > Mohammad
> > > > > > > --------------------------
> > > > > > > see my blog: http://brainable.blogspot.com/
> > > > > > > another in Persian: http://fekre-motefavet.blogspot.com/
> > > > > >
> > > > > > --------------------------
> > > > > > Grant Ingersoll
> > > > > > http://lucene.grantingersoll.com
> > > > > >
> > > > > > Lucene Helpful Hints:
> > > > > > http://wiki.apache.org/lucene-java/BasicsOfPerformance
> > > > > > http://wiki.apache.org/lucene-java/LuceneFAQ
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > ---------------------------------------------------------------------
> > > > > > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > > > > > For additional commands, e-mail:
> java-user-help@lucene.apache.org
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Regards,
> > > > > Mohammad
> > > > > --------------------------
> > > > > see my blog: http://brainable.blogspot.com/
> > > > > another in Persian: http://fekre-motefavet.blogspot.com/
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > > Mohammad
> > > --------------------------
> > > see my blog: http://brainable.blogspot.com/
> > > another in Persian: http://fekre-motefavet.blogspot.com/
> > >
> >
>
>
>
> --
> Regards,
> Mohammad
> --------------------------
> see my blog: http://brainable.blogspot.com/
> another in Persian: http://fekre-motefavet.blogspot.com/
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message