streams-dev mailing list archives

From smashew <...@git.apache.org>
Subject [GitHub] incubator-streams pull request: Twitter Modifications
Date Mon, 05 May 2014 22:12:00 GMT
Github user smashew commented on a diff in the pull request:

    https://github.com/apache/incubator-streams/pull/8#discussion_r12302725
  
    --- Diff: streams-contrib/streams-amazon-aws/streams-persist-s3/src/main/java/org/apache/streams/s3/S3ObjectInputStreamWrapper.java
---
    @@ -0,0 +1,111 @@
    +package org.apache.streams.s3;
    +
    +import com.amazonaws.services.s3.model.S3Object;
    +import com.amazonaws.services.s3.model.S3ObjectInputStream;
    +import org.slf4j.Logger;
    +import org.slf4j.LoggerFactory;
    +
    +import java.io.Closeable;
    +import java.io.IOException;
    +import java.io.InputStream;
    +
    +/**
    + * There is a subtle nuance associated with reading portions of files in S3. Everything
    + * occurs over an Apache HTTP client object, and Apache defaults to re-using the stream.
    + * So, if you only want to read a small portion of the file, you must first "abort" the
    + * stream and then close it. Otherwise, Apache will exhaust the stream and transfer a
    + * large amount of data in the process.
    + *
    + *
    + * Author   Smashew
    + * Date     2014-04-11
    + *
    + * After a few more days, and some demos that ran into issues under concurrency and
    + * high user load, another problem was discovered: the S3Object's HTTP connection is
    + * not released back to the connection pool (until it times out), even once the object
    + * has been garbage collected. Hence this wrapper.
    + *
    + * Reference:
    + * http://stackoverflow.com/questions/17782937/connectionpooltimeoutexception-when-iterating-objects-in-s3
    + */
    +public class S3ObjectInputStreamWrapper extends InputStream
    +{
    +    private final static Logger LOGGER = LoggerFactory.getLogger(S3ObjectInputStreamWrapper.class);
    +
    +    private final S3Object s3Object;
    +    private final S3ObjectInputStream is;
    +    private boolean isClosed = false;
    +
    +    public S3ObjectInputStreamWrapper(S3Object s3Object) {
    +        this.s3Object = s3Object;
    +        this.is = this.s3Object.getObjectContent();
    +    }
    +
    +    public int hashCode()                                           { return this.is.hashCode(); }
    --- End diff --
    
    changing to streams preference.
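    For context, the abort-then-close behavior the class comment describes can be sketched without the AWS SDK. The sketch below is illustrative only: `AbortableStream` and its `abort()` method are hypothetical stand-ins for `S3ObjectInputStream.abort()`, mimicking the pattern of aborting the underlying stream before close (so the HTTP client does not drain the remaining bytes) and making `close()` idempotent.
    
    ```java
    import java.io.ByteArrayInputStream;
    import java.io.FilterInputStream;
    import java.io.IOException;
    import java.io.InputStream;
    
    // Illustrative sketch (stdlib only, no AWS dependency) of the pattern the
    // wrapper implements. AbortableStream is a hypothetical stand-in for
    // S3ObjectInputStream; abort() here just records the call instead of
    // dropping an HTTP connection.
    class AbortableStream extends FilterInputStream {
        private boolean aborted = false;
        private boolean closed = false;
    
        AbortableStream(InputStream in) { super(in); }
    
        // Stand-in for S3ObjectInputStream.abort(): give up the connection
        // rather than letting the client read the stream to exhaustion.
        void abort() { aborted = true; }
    
        @Override
        public void close() throws IOException {
            if (closed) return;      // idempotent: a second close is a no-op
            closed = true;
            if (!aborted) abort();   // abort before close, per the S3 nuance
            super.close();
        }
    
        boolean wasAborted() { return aborted; }
        boolean isClosed()   { return closed; }
    }
    
    public class Demo {
        public static void main(String[] args) throws IOException {
            AbortableStream s = new AbortableStream(
                    new ByteArrayInputStream(new byte[]{1, 2, 3}));
            s.read();   // read only part of the "file"
            s.close();  // aborts the stream, then closes it
            s.close();  // harmless second close
            System.out.println(s.wasAborted() + " " + s.isClosed());
        }
    }
    ```
    
    The same shape applies to the real wrapper: on `close()`, abort the `S3ObjectInputStream` first, then close it and the enclosing `S3Object` so the pooled HTTP connection is released promptly instead of waiting for a timeout.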


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
