hadoop-common-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-15870) S3AInputStream.remainingInFile should use nextReadPos
Date Tue, 23 Oct 2018 15:02:00 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-15870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16660784#comment-16660784

ASF GitHub Bot commented on HADOOP-15870:

Github user steveloughran commented on the issue:

    1. I think the same problem surfaces in `remainingInCurrentRequest()` too, which is where
some inspection is going to be needed to carefully understand what's up.
    1. For this method, I think maybe we could cut it completely, and fix up `available()`
to return a default value when `wrappedStream==null` (0 or maybe 1 except when at EOF), and
when the stream isn't null, delegate to it. That pushes down the problem of estimating the
number of non-blocking bytes into the http connector.
    In `AbstractContractSeekTest` there's some existing tests which could take some more asserts
on that available call
    * `testSeekZeroByteFile`: `available() == 0`, always
    * `testSeekReadClosedFile` : call available() on an empty file, it should raise some IllegalStateException
or IOE, but not an NPE
    For some of the other tests, I think you insert a couple of checks to say " available
> 0" after a seek + read() Call. This also clarifies that "what should be available after
a seek but before a read()?" As for cloudstores with lazy seek, available == 0, though if
that breaks gzip then maybe they should return 1, so the assert should be available >1
with an option in the contract to say "actually returns 0". 
    It's a tricky one to test on because really, the sole asserts should be
    1. fails if the stream is closed (unless its a 0?)
    1. returns >= if the stream is open
    1. is always < remaining file length.
    Probably assert #3 is the one to check for: available() is valid when it is >=0 and
<= (filelength -getPos()), with care taken about that second clause to make sure there's
no off-by-one errors in the assert/code

> S3AInputStream.remainingInFile should use nextReadPos
> -----------------------------------------------------
>                 Key: HADOOP-15870
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15870
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.8.4, 3.1.1
>            Reporter: Shixiong Zhu
>            Priority: Major
> Otherwise `remainingInFile` will not change after `seek`.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

View raw message