lucene-solr-user mailing list archives

From Erick Erickson <erickerick...@gmail.com>
Subject Re: Realtime get not always returning existing data
Date Thu, 27 Sep 2018 16:51:27 GMT
What version of Solr are you running? Mostly that's for curiosity.

Is the doc that's not returned something you've recently indexed?
Here's a possible scenario:
You send the doc out to be indexed. The primary forwards the doc to
the followers. Before a follower has had a chance to process it (just
process it, not necessarily commit it), you issue an RTG against that
doc and it happens to be routed to a node that hasn't received it from
the leader yet. Does this sound plausible in your scenario?

Hmmm, I suppose it's not even a requirement that the request gets sent
to a follower, it could easily be "in process" on the leader/primary.
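
If a race like that is what's going on, one client-side workaround is to
treat a null doc as "maybe not visible yet" and retry a few times before
concluding the document doesn't exist. Here is a minimal sketch in Python,
assuming the same URL shape as in the original question. The function
names, the retry count and the delay are all made up for illustration, and
this only papers over the symptom rather than explaining it.

```python
import json
import time
import urllib.parse
import urllib.request


def build_rtg_url(base_url, collection, doc_id, fl="_source_"):
    """Build a Realtime Get URL shaped like the one in the original question."""
    query = urllib.parse.urlencode({"id": doc_id, "wt": "json", "fl": fl})
    return f"{base_url}/solr/{collection}/get?{query}"


def http_fetch(url, timeout=5.0):
    """Fetch a URL and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def realtime_get(base_url, collection, doc_id,
                 retries=3, delay=0.2, fetch=http_fetch):
    """Return the doc, retrying when the response is {"doc": null}.

    retries and delay are arbitrary illustrative values (assumptions),
    not recommended settings. Returns None only if every attempt
    came back with a null doc.
    """
    url = build_rtg_url(base_url, collection, doc_id)
    for attempt in range(retries + 1):
        doc = fetch(url).get("doc")
        if doc is not None:
            return doc
        if attempt < retries:
            time.sleep(delay)  # brief pause before asking again
    return None
```

Usage would look something like
realtime_get("http://{server_ip}", "config", some_id), with {server_ip}
filled in. The fetch parameter is only there so the retry logic can be
exercised without a running Solr.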

Best,
Erick
On Wed, Sep 26, 2018 at 11:55 AM sgaron cse <sgaron.cse@gmail.com> wrote:
>
> Hey all,
>
> We're trying to use SOLR for our document store and are facing some issues
> with the Realtime Get API. Basically, we're doing an API call from multiple
> endpoints to retrieve configuration data. The document that we are
> retrieving does not change at all but sometimes the API returns a null
> document ({doc:null}). I'd say 99.99% of the time we can retrieve the
> document fine but once in a blue moon we get the null document. The problem
> is that for us, if SOLR returns null, that means that the document does not
> exist, but because this is a document that should be there, it causes all
> sorts of problems in our system.
>
> The API I call is the following:
> http://{server_ip}/solr/config/get?id={id}&wt=json&fl=_source_
>
> As far as I understand reading the documentation, the Realtime Get API
> should get me the document no matter what. Even if the document is not yet
> committed to the index.
>
> I see no errors whatsoever in the SOLR logs that could help me with this
> problem. In fact, there are no errors at all.
>
> As for our setup, because we're still in testing phase, we only have two
> SOLR instances running on the same box in cloud mode with replication=1
> which means that the core that we run the Realtime Get on is only present
> in one of the two instances. Our script randomly chooses which instances it
> does the query on but as far as I understand, in cloud mode the API call
> should be dispatched automatically to the right instance.
>
> Am I missing anything here? Is it possible that there is a race condition
> in the Realtime Get API that could return null data even if the document
> exists?
>
> Thanks,
> Steve
