lucene-solr-user mailing list archives

From sgaron cse <sgaron....@gmail.com>
Subject Re: Realtime get not always returning existing data
Date Thu, 27 Sep 2018 17:10:46 GMT
Hey Erick,

We're using SOLR 7.3.1, which is not the latest but still not too far back.

No, the document has not been recently indexed; in fact, I can use the
/search API endpoint to find it. But I need a fast way to find
documents that have not necessarily been indexed yet, so /search is out of
the question. To give you some context, the doc was last modified
3 days ago, but we are still seeing the occasional doc:null return from
the Realtime Get API.
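
For illustration, here is a minimal Python sketch of the kind of client-side
guard one might put around the call while the cause is unknown. It builds a
URL of the same shape as the one quoted below and retries when the response
is {"doc": null}. The function names, the retry count, and the delay are
hypothetical and not taken from our code or from Solr; this is a stopgap
under those assumptions, not a fix for the underlying behaviour.

```python
import json
import time
import urllib.parse
import urllib.request


def build_rtg_url(base_url, collection, doc_id, fields="_source_"):
    """Build a Realtime Get URL of the form used in this thread."""
    query = urllib.parse.urlencode({"id": doc_id, "wt": "json", "fl": fields})
    return f"{base_url}/solr/{collection}/get?{query}"


def parse_rtg_body(body):
    """Return the document dict, or None when the body is {"doc": null}."""
    return json.loads(body).get("doc")


def http_fetch(url, timeout=5.0):
    """Fetch the raw response body over HTTP (stdlib only)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def get_doc_with_retry(url, fetch=http_fetch, attempts=3, delay=0.05):
    """Retry a few times before treating a null doc as 'does not exist'.

    attempts and delay are arbitrary illustrative values (assumptions).
    fetch is injectable so the logic can be exercised without a Solr server.
    """
    for attempt in range(attempts):
        doc = parse_rtg_body(fetch(url))
        if doc is not None:
            return doc
        if attempt < attempts - 1:
            time.sleep(delay)
    return None
```

Logging every null that a later attempt resolves would also give a rough
count of how often the spurious response actually occurs.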

Steve

On Thu, Sep 27, 2018 at 12:52 PM Erick Erickson <erickerickson@gmail.com>
wrote:

> What version of Solr are you running? Mostly that's for curiosity.
>
> Is the doc that's not returned something you've recently indexed?
> Here's a possible scenario:
> You send the doc out to be indexed. The primary forwards the doc to
> the followers. Before the follower has a chance to process (but not
> commit), you issue a RTG against that doc and it happens to be routed
> to a node that hasn't received it from the leader yet. Does this sound
> plausible in your scenario?
>
> Hmmm, I suppose it's not even a requirement that the request gets sent
> to a follower, it could easily be "in process" on the leader/primary.
>
> Best,
> Erick
> On Wed, Sep 26, 2018 at 11:55 AM sgaron cse <sgaron.cse@gmail.com> wrote:
> >
> > Hey all,
> >
> > We're trying to use SOLR for our document store and are facing some issues
> > with the Realtime Get API. Basically, we're making an API call from multiple
> > endpoints to retrieve configuration data. The document that we are
> > retrieving does not change at all, but sometimes the API returns a null
> > document ({doc:null}). I'd say 99.99% of the time we can retrieve the
> > document fine, but once in a blue moon we get the null document. The problem
> > is that for us, if SOLR returns null, that means that the document does not
> > exist, but because this is a document that should be there it causes all
> > sorts of problems in our system.
> >
> > The API I call is the following:
> > http://{server_ip}/solr/config/get?id={id}&wt=json&fl=_source_
> >
> > As far as I understand from reading the documentation, the Realtime Get API
> > should get me the document no matter what, even if the document is not yet
> > committed to the index.
> >
> > I see no errors whatsoever in the SOLR logs that could help me with this
> > problem. In fact, there are no errors at all.
> >
> > As for our setup, because we're still in the testing phase, we only have two
> > SOLR instances running on the same box in cloud mode with replication=1,
> > which means that the core that we run the Realtime Get on is only present
> > in one of the two instances. Our script randomly chooses which instance it
> > queries, but as far as I understand, in cloud mode the API call
> > should be dispatched automatically to the right instance.
> >
> > Am I missing anything here? Is it possible that there is a race condition
> > in the Realtime Get API that could return null data even if the document
> > exists?
> >
> > Thanks,
> > Steve
>
