streams-dev mailing list archives

From mfranklin <...@git.apache.org>
Subject [GitHub] incubator-streams pull request: Twitter Modifications
Date Mon, 05 May 2014 21:34:27 GMT
Github user mfranklin commented on a diff in the pull request:

    https://github.com/apache/incubator-streams/pull/8#discussion_r12301085
  
    --- Diff: streams-contrib/streams-amazon-aws/streams-persist-s3/src/main/java/org/apache/streams/s3/S3ObjectInputStreamWrapper.java
---
    @@ -0,0 +1,111 @@
    +package org.apache.streams.s3;
    +
    +import com.amazonaws.services.s3.model.S3Object;
    +import com.amazonaws.services.s3.model.S3ObjectInputStream;
    +import org.slf4j.Logger;
    +import org.slf4j.LoggerFactory;
    +
    +import java.io.Closeable;
    +import java.io.IOException;
    +import java.io.InputStream;
    +
    +/**
    + * There is a subtle nuance to reading portions of files in S3. Everything goes through
    + * an Apache HTTP client object, which defaults to re-using the underlying stream. So, if
    + * you only want to read a small portion of the file, you must first "abort" the stream and
    + * then close it. Otherwise, Apache will exhaust the stream, transferring a large amount of
    + * data in the attempt.
    + *
    + *
    + * Author   Smashew
    + * Date     2014-04-11
    + *
    + * After a few more days, and some demos that hit issues under concurrency and high user
    + * load, another problem was discovered: the S3Object's HTTP connection is not released
    + * back to the connection pool (until it times out), even once the object is garbage
    + * collected.
    + *
    + * Reference:
    + * http://stackoverflow.com/questions/17782937/connectionpooltimeoutexception-when-iterating-objects-in-s3
    + */
    +public class S3ObjectInputStreamWrapper extends InputStream
    +{
    +    private final static Logger LOGGER = LoggerFactory.getLogger(S3ObjectInputStreamWrapper.class);
    +
    +    private final S3Object s3Object;
    +    private final S3ObjectInputStream is;
    +    private boolean isClosed = false;
    +
    +    public S3ObjectInputStreamWrapper(S3Object s3Object) {
    +        this.s3Object = s3Object;
    +        this.is = this.s3Object.getObjectContent();
    +    }
    +
    +    public int hashCode()                                           { return this.is.hashCode(); }
    --- End diff --
    
    Can you update this formatting to be consistent?  
    
    E.g.:
    
    public int hashCode() {
        return this.is.hashCode();
    }
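
    The abort-before-close nuance described in the file's header comment can be sketched
    without the AWS SDK. The `AbortableStream` class below is a hypothetical stand-in for
    `S3ObjectInputStream` (whose real `abort()` drops the underlying HTTP connection rather
    than draining it), so this is only an illustration of the pattern, not the actual
    wrapper under review:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for S3ObjectInputStream: the real abort() closes the
// underlying HTTP connection instead of draining the remaining bytes.
class AbortableStream extends ByteArrayInputStream {
    boolean aborted = false;
    AbortableStream(byte[] buf) { super(buf); }
    void abort() { aborted = true; }
}

// Wrapper sketch: abort when unread bytes remain, then close, so the HTTP
// client does not exhaust the stream just to reuse the connection.
public class AbortingWrapper extends InputStream {
    private final AbortableStream in;
    private boolean closed = false;

    public AbortingWrapper(AbortableStream in) { this.in = in; }

    @Override
    public int read() throws IOException { return in.read(); }

    @Override
    public void close() throws IOException {
        if (closed) return;
        if (in.available() > 0) {
            in.abort();   // unread data remains: drop the connection
        }
        in.close();       // always release the stream afterwards
        closed = true;
    }

    public static void main(String[] args) throws IOException {
        AbortableStream s = new AbortableStream(new byte[]{1, 2, 3});
        AbortingWrapper w = new AbortingWrapper(s);
        w.read();                      // read only part of the stream
        w.close();
        System.out.println(s.aborted); // prints "true"
    }
}
```

    A stream that has been read to the end has nothing left to drain, so the wrapper
    closes it normally and the pooled connection can be reused.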


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
