lucene-solr-user mailing list archives

From Gabriele Kahlout <gabri...@mysimpatico.com>
Subject Re: Can I still search documents once updated?
Date Wed, 13 Jul 2011 12:05:24 GMT
On Wed, Jul 13, 2011 at 1:57 PM, Erick Erickson <erickerickson@gmail.com> wrote:

> Wait, you directly contradicted yourself <G>.... You say it's
> not stored, then you say it's stored and indexed, which is it?
>

Yes, I meant indexed and not stored.


>
> When you fetch a document, only stored fields are returned
> and the returned data is the verbatim copy of the original
> data. No attempt is made to return un-stored fields. This
> has been the behavior always. If you attempted to return
> indexed but not stored data, you'd get stemmed versions,
> stop words would be removed, synonyms would be in place
> etc. Not to mention it would be very slow.
>

This is what I was expecting. Otherwise, updating a field of a document that
has an unstored but indexed field is impossible without losing the unstored
but indexed field. I would call that updating a field of a document AND
deleting/updating all its unstored but indexed fields.
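
To make the point concrete, here is a toy model in plain Java of what I
understand to be happening. It is not Lucene's API: ToyIndex, Doc, add, get,
search and update are all hypothetical names, and the "analysis" is just
lowercasing and splitting on whitespace. The only assumption it encodes is
the one discussed in this thread: a fetched document carries stored fields
only, and an update deletes the old document and re-indexes whatever it is
handed.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of stored vs. indexed-only fields (hypothetical, not Lucene's API). */
class ToyIndex {
    /** What a fetch returns: stored field values only. */
    static final class Doc {
        final Map<String, String> stored = new HashMap<>();
    }

    private final Map<String, Doc> docsById = new HashMap<>();
    // Inverted index: term -> ids of the documents that contain it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** "content" is always indexed; it is kept verbatim only if storeContent is true. */
    void add(String id, String content, boolean storeContent) {
        Doc doc = new Doc();
        doc.stored.put("id", id);
        if (storeContent) {
            doc.stored.put("content", content);
        }
        docsById.put(id, doc);
        if (content != null) {
            for (String term : content.toLowerCase().split("\\s+")) {
                if (!term.isEmpty()) {
                    postings.computeIfAbsent(term, t -> new HashSet<>()).add(id);
                }
            }
        }
    }

    Doc get(String id) {
        return docsById.get(id);
    }

    Set<String> search(String term) {
        return postings.getOrDefault(term, Collections.emptySet());
    }

    /** Update = delete the old document, then re-add from the fields supplied. */
    void update(String id, Doc replacement) {
        docsById.remove(id);
        for (Set<String> ids : postings.values()) {
            ids.remove(id);
        }
        // An unstored "content" is absent here, so nothing gets re-indexed for it.
        String content = replacement.stored.get("content");
        add(replacement.stored.get("id"), content, content != null);
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add("1", "essen trinken", false);            // indexed, not stored
        System.out.println(index.search("essen").size());  // hits before the update
        index.update("1", index.get("1"));                 // round-trip the fetched doc
        System.out.println(index.search("essen").size());  // hits after the update
    }
}
```

In this model the round trip only preserves the term when the field was added
with storeContent set to true, which matches the failing test quoted below.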

>
> If the field is stored, then there's another problem, you might
> want to dump the document after reading it from the IR.
>
> Best
> Erick
>
> On Wed, Jul 13, 2011 at 2:25 AM, Gabriele Kahlout
> <gabriele@mysimpatico.com> wrote:
> > It indeed is not stored, but this is still unexpected behavior. It's a
> > stored and indexed field, why has the index data been lost?
> >
> >
> > On Wed, Jul 13, 2011 at 12:44 AM, Erick Erickson <erickerickson@gmail.com> wrote:
> >
> >> Unless you stored your "content" field, the value you put in there won't
> >> be fetched from the index. Verify that the doc you retrieve from the index
> >> has values for "content", I bet it doesn't....
> >>
> >> Best
> >> Erick
> >>
> >> On Tue, Jul 12, 2011 at 9:38 AM, Gabriele Kahlout
> >> <gabriele@mysimpatico.com> wrote:
> >> >  @Test
> >> >    public void testUpdateLoseTermsSimplified() throws Exception {
> >> >        IndexWriter writer = indexDoc();
> >> >        assertEquals(1, writer.numDocs());
> >> >        IndexSearcher searcher = getSearcher(writer);
> >> >        final TermQuery termQuery = new TermQuery(new Term(content, "essen"));
> >> >
> >> >        TopDocs docs = searcher.search(termQuery, 1);
> >> >        assertEquals(1, docs.totalHits);
> >> >        Document doc = searcher.doc(0);
> >> >
> >> >        writer.updateDocument(new Term(id, doc.get(id)), doc);
> >> >
> >> >        searcher = getSearcher(writer);
> >> >        docs = searcher.search(termQuery, 1);
> >> >        assertEquals(1, docs.totalHits); // docs.totalHits == 0 !
> >> >    }
> >> >
> >> > testUpdateLosesTerms(com.mysimpatico.me.indexplugins.WcTest)  Time elapsed: 0.346 sec  <<< FAILURE!
> >> > java.lang.AssertionError: expected:<1> but was:<0>
> >> >    at org.junit.Assert.fail(Assert.java:91)
> >> >    at org.junit.Assert.failNotEquals(Assert.java:645)
> >> >    at org.junit.Assert.assertEquals(Assert.java:126)
> >> >    at org.junit.Assert.assertEquals(Assert.java:470)
> >> >    at org.junit.Assert.assertEquals(Assert.java:454)
> >> >    at com.mysimpatico.me.indexplugins.WcTest.testUpdateLosesTerms(WcTest.java:271)
> >> >
> >> > I have not changed anything (as you can see) during the update. I just
> >> > retrieve a document and then update it. But then the termQuery that worked
> >> > before doesn't work anymore (while the "id" field wasn't changed). Is this
> >> > to be expected when the content field is not stored?
> >> >
> >> > --
> >> > Regards,
> >> > K. Gabriele
> >> >
> >> > --- unchanged since 20/9/10 ---
> >> > P.S. If the subject contains "[LON]" or the addressee acknowledges the
> >> > receipt within 48 hours then I don't resend the email.
> >> > subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧ time(x)
> >> > < Now + 48h) ⇒ ¬resend(I, this).
> >> >
> >> > If an email is sent by a sender that is not a trusted contact or the email
> >> > does not contain a valid code then the email is not received. A valid code
> >> > starts with a hyphen and ends with "X".
> >> > ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧ y ∈
> >> > L(-[a-z]+[0-9]X)).
> >> >
> >>
> >
> >
> >
> > --
> > Regards,
> > K. Gabriele
> >
>



-- 
Regards,
K. Gabriele

