From notifications-return-15264-archive-asf-public=cust-asf.ponee.io@libcloud.apache.org  Sun May 26 19:27:08 2019
Return-Path: <notifications-return-15264-archive-asf-public=cust-asf.ponee.io@libcloud.apache.org>
X-Original-To: archive-asf-public@cust-asf.ponee.io
Delivered-To: archive-asf-public@cust-asf.ponee.io
Received: from mail.apache.org (hermes.apache.org [207.244.88.153])
	by mx-eu-01.ponee.io (Postfix) with SMTP id 7EF6D18062B
	for <archive-asf-public@cust-asf.ponee.io>; Sun, 26 May 2019 21:27:08 +0200 (CEST)
Received: (qmail 38381 invoked by uid 500); 26 May 2019 19:27:07 -0000
Mailing-List: contact notifications-help@libcloud.apache.org; run by ezmlm
Precedence: bulk
List-Help: <mailto:notifications-help@libcloud.apache.org>
List-Unsubscribe: <mailto:notifications-unsubscribe@libcloud.apache.org>
List-Post: <mailto:notifications@libcloud.apache.org>
List-Id: <notifications.libcloud.apache.org>
Reply-To: dev@libcloud.apache.org
Delivered-To: mailing list notifications@libcloud.apache.org
Received: (qmail 38371 invoked by uid 99); 26 May 2019 19:27:07 -0000
Received: from ec2-52-202-80-70.compute-1.amazonaws.com (HELO gitbox.apache.org) (52.202.80.70)
    by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 26 May 2019 19:27:07 +0000
From: GitBox <git@apache.org>
To: notifications@libcloud.apache.org
Subject: [GitHub] [libcloud] c-w commented on a change in pull request #1287:
 [LIBCLOUD-1043] Fix Azure upload_object_via_stream used with iter
Message-ID: <155889882746.11069.3869366733680166719.gitbox@gitbox.apache.org>
Date: Sun, 26 May 2019 19:27:07 -0000
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

c-w commented on a change in pull request #1287: [LIBCLOUD-1043] Fix Azure upload_object_via_stream used with iter
URL: https://github.com/apache/libcloud/pull/1287#discussion_r287611116
 
 
 ##########
 File path: libcloud/storage/drivers/azure_blobs.py
 ##########
 @@ -825,7 +826,12 @@ def upload_object_via_stream(self, iterator, container, object_name,
         """
         self._check_values(ex_blob_type, ex_page_blob_size)
         if ex_blob_type == "BlockBlob":
-            iterator.seek(0, os.SEEK_END)
+            try:
+                iterator.seek(0, os.SEEK_END)
+            except AttributeError:
+                buffer = BytesIO()
+                buffer.writelines(iterator)
 
 Review comment:
   Yes, unfortunately this does buffer the whole iterator in memory.
   
   Given that [the Content-Length header is required](https://docs.microsoft.com/en-us/rest/api/storageservices/put-blob#request-headers-all-blob-types), we have to find the size of the iterator before making the request, so I don't see a way to avoid this in the general case. For example, using [tee](https://docs.python.org/3/library/itertools.html#itertools.tee) followed by something like [ilen](https://more-itertools.readthedocs.io/en/stable/_modules/more_itertools/more.html#ilen) would still hold a copy of the iterator's contents in memory.
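
   To make the trade-off concrete, here is a minimal sketch of the size lookup: seek when the object supports it, otherwise buffer. The helper name `sized_stream` is hypothetical and is not part of libcloud; it mirrors the `try`/`except AttributeError` in the diff above and assumes the iterator yields `bytes`.

```python
import os
from io import BytesIO


def sized_stream(iterator):
    """Return (stream, size) for a file-like object or a plain iterator of bytes.

    Hypothetical helper for illustration only; not libcloud API.
    """
    try:
        # File-like objects: find the size by seeking, no copy needed.
        iterator.seek(0, os.SEEK_END)
        size = iterator.tell()
        iterator.seek(0)
        return iterator, size
    except AttributeError:
        # Plain iterators have no seek(): the only way to learn the total
        # size is to consume them, so everything ends up in memory.
        buffer = BytesIO()
        buffer.writelines(iterator)
        size = buffer.tell()
        buffer.seek(0)
        return buffer, size
```

   Only `AttributeError` is handled here, as in the diff; a file object that exists but is not seekable raises `io.UnsupportedOperation` instead and would need separate handling.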
   
   One potential work-around could be to upload chunks from the iterator via individual [put block](https://docs.microsoft.com/en-us/rest/api/storageservices/put-block) requests, followed by a [put block list](https://docs.microsoft.com/en-us/rest/api/storageservices/put-block-list) request, but that would be a much more invasive change to the codebase. (Note that I haven't proved out this approach in code yet, so for now it's just a hypothesis from reading the docs.)
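
   A rough sketch of what the chunked approach might look like, under several assumptions: `put_block` and `put_block_list` are hypothetical callables standing in for the two REST requests (they are not existing libcloud methods), the 4 MiB default block size is an arbitrary choice, and the iterator yields `bytes`. Only one block is held in memory at a time.

```python
import base64


def iter_blocks(iterator, block_size):
    """Regroup arbitrary byte chunks into blocks of block_size bytes.

    The final block may be shorter than block_size.
    """
    pending = bytearray()
    for chunk in iterator:
        pending.extend(chunk)
        while len(pending) >= block_size:
            yield bytes(pending[:block_size])
            del pending[:block_size]
    if pending:
        yield bytes(pending)


def make_block_id(index):
    # Block IDs are Base64-encoded; zero-padding keeps every ID the same
    # length, which the Put Block docs require within a single blob.
    return base64.b64encode(("%08d" % index).encode("ascii")).decode("ascii")


def upload_in_blocks(iterator, put_block, put_block_list,
                     block_size=4 * 1024 * 1024):
    """Send each block, then commit the ordered list of block IDs.

    put_block(block_id, data) and put_block_list(block_ids) are
    hypothetical stand-ins for the actual HTTP requests.
    """
    block_ids = []
    for index, block in enumerate(iter_blocks(iterator, block_size)):
        block_id = make_block_id(index)
        put_block(block_id, block)  # size of each block is known up front
        block_ids.append(block_id)
    put_block_list(block_ids)
    return block_ids
```

   Each individual request then has a known length without the whole stream being buffered, at the cost of one request per block plus the final commit.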
   
   Do you have any suggestions to improve this?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services