lucene-dev mailing list archives

From "Mark Miller (JIRA)" <>
Subject [jira] [Commented] (SOLR-5150) HdfsIndexInput may not fully read requested bytes.
Date Mon, 19 Aug 2013 13:39:48 GMT


Mark Miller commented on SOLR-5150:

bq.  But the sync simply kills concurrent query reads. 

Sorry, I was not being very careful with my words. The 'sync' option (the seek + read variant)
kills concurrent query reads - but I don't think the sync itself is the cause. The first perf
tests I looked at, which used just a readFully, had a sync as well - which seems to make sense
because this is not an NRT test or anything. Everything seems to be related to the HDFS calls.

> HdfsIndexInput may not fully read requested bytes.
> --------------------------------------------------
>                 Key: SOLR-5150
>                 URL:
>             Project: Solr
>          Issue Type: Bug
>    Affects Versions: 4.4
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>             Fix For: 4.5, 5.0
>         Attachments: SOLR-5150.patch
> Patrick Hunt noticed that our HdfsDirectory code was a bit behind Blur here - the read
call we are using may not read all of the requested bytes - it returns the number of bytes
actually read - which we ignore.
> Blur moved to using a seek and then readFully call - synchronizing across the two calls
to deal with clones.
> We have seen that this really kills performance. Using the readFully call that lets you
pass the position, rather than first doing a seek, performs much better and does not require
the synchronization.
> I also noticed that the seekInternal impl should not seek but be a no-op, since we are
seeking on the read.
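
For illustration only, and not taken from the attached patch: a minimal sketch of the
positional-read idea described above, using java.nio's FileChannel as a local stand-in for
the HDFS stream (an assumption made so the example runs without Hadoop). The class and method
names (PositionalReadSketch, readFully) are hypothetical. It shows the two points from the
description: a single read may return fewer bytes than requested, so the return value has to
be checked in a loop, and passing the position on each read avoids a separate seek, so clones
sharing the channel do not need to synchronize around a seek + read pair.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionalReadSketch {

    // Reads exactly `len` bytes starting at file offset `pos` into buf[off .. off+len).
    // FileChannel.read(ByteBuffer, long) does not move the channel's own position,
    // so no shared "current offset" is mutated and no seek is needed.
    static void readFully(FileChannel ch, long pos, byte[] buf, int off, int len)
            throws IOException {
        ByteBuffer bb = ByteBuffer.wrap(buf, off, len);
        while (bb.hasRemaining()) {
            // A single read may return fewer bytes than requested; keep going.
            int n = ch.read(bb, pos);
            if (n < 0) {
                throw new EOFException(
                        "end of file with " + bb.remaining() + " bytes still unread");
            }
            pos += n;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("positional-read", ".bin");
        try {
            Files.write(tmp, "0123456789".getBytes(StandardCharsets.US_ASCII));
            try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.READ)) {
                byte[] out = new byte[4];
                readFully(ch, 3, out, 0, out.length);
                // prints 3456
                System.out.println(new String(out, StandardCharsets.US_ASCII));
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

On the HDFS side, the call the description refers to is presumably the positional
readFully(position, buffer, offset, length) on Hadoop's FSDataInputStream, which plays the
role of the hypothetical helper above.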
