beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (BEAM-2790) Error while reading from Amazon S3 via Hadoop File System
Date Tue, 22 Aug 2017 12:56:00 GMT


ASF GitHub Bot commented on BEAM-2790:

GitHub user iemejia opened a pull request:

    [BEAM-2790] Use byte[] instead of ByteBuffer to read from Hadoop FS

    Follow this checklist to help us incorporate your contribution quickly and easily:
     - [x] Make sure there is a [JIRA issue]( filed for the change (usually before you
       start working on it). Trivial changes like typos do not require a JIRA issue. Your
       pull request should address just this issue, without pulling in other changes.
     - [x] Each commit in the pull request should have a meaningful subject line and body.
     - [x] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`,
       where you replace `BEAM-XXX` with the appropriate JIRA issue.
     - [x] Write a pull request description that is detailed enough to understand what the
       pull request does, how, and why.
     - [x] Run `mvn clean verify` to make sure basic checks pass. A more thorough check will
       be performed on your pull request automatically.
     - [ ] If this contribution is large, please file an Apache [Individual Contributor License

You can merge this pull request into a Git repository by running:

    $ git pull BEAM-2790-fix-s3-read

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3744

commit 76f2d91bc9f612d4e98fe7d8099dffd36c47ff85
Author: Ismaël Mejía <>
Date:   2017-08-22T12:50:12Z

    [BEAM-2790] Use byte[] instead of ByteBuffer to read from Hadoop FS


> Error while reading from Amazon S3 via Hadoop File System
> ---------------------------------------------------------
>                 Key: BEAM-2790
>                 URL:
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>    Affects Versions: 2.0.0, 2.1.0
>            Reporter: Ismaël Mejía
>            Assignee: Ismaël Mejía
>
> Reading from AWS S3 with Beam through hadoop-aws fails because S3AInputStream (the stream
> that Hadoop's FSDataInputStream wraps for S3A) does not implement ByteBufferReadable, so
> read(ByteBuffer) throws:
> [code]
> Exception in thread "main" java.lang.UnsupportedOperationException: Byte-buffer read unsupported by input stream
> 	at
> 	at$
> 	at$TextBasedReader.tryToEnsureNumberOfBytesInBuffer(
> 	at$TextBasedReader.findSeparatorBounds(
> 	at$TextBasedReader.readNextRecord(
> 	at$FileBasedReader.advanceImpl(
> 	at$FileBasedReader.startImpl(
> 	at$OffsetBasedReader.start(
> [code]
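
The idea in the pull request title (read through a byte[] rather than a ByteBuffer) can be sketched as follows. This is a minimal illustration using only the JDK, not the code from the pull request: the class name `ByteArrayReadSketch` and the helper `readViaByteArray` are hypothetical, and a plain `InputStream` stands in for Hadoop's stream. It relies on `InputStream.read(byte[], int, int)`, which every input stream supports, and copies the bytes into the caller's buffer afterwards.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: names here are illustrative and do not come from Beam or Hadoop.
class ByteArrayReadSketch {

  /**
   * Fills dst with up to dst.remaining() bytes from the stream by reading into a
   * temporary byte[] first, so the stream never needs to support ByteBuffer reads.
   * Returns the number of bytes read, 0 if dst is full, or -1 at end of stream.
   */
  public static int readViaByteArray(InputStream in, ByteBuffer dst) throws IOException {
    if (!dst.hasRemaining()) {
      return 0;
    }
    byte[] tmp = new byte[dst.remaining()];
    int n = in.read(tmp, 0, tmp.length);
    if (n > 0) {
      // Copy only the bytes that were actually read.
      dst.put(tmp, 0, n);
    }
    return n;
  }

  public static void main(String[] args) throws IOException {
    InputStream in =
        new ByteArrayInputStream("hello beam".getBytes(StandardCharsets.UTF_8));
    ByteBuffer buf = ByteBuffer.allocate(5);
    int n = readViaByteArray(in, buf);
    buf.flip();
    System.out.println(n + " " + StandardCharsets.UTF_8.decode(buf));
  }
}
```

The extra copy costs a temporary array per call, but it avoids depending on an optional capability of the underlying stream.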

This message was sent by Atlassian JIRA