libcloud-notifications mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [libcloud] c-w commented on issue #1399: Upload large file to Azure Blobs
Date Fri, 03 Jan 2020 21:31:37 GMT
URL: https://github.com/apache/libcloud/issues/1399#issuecomment-570704353
 
 
   **TL;DR**
   The maximum file size currently supported by the Azure Storage driver is 256 MB. Uploading
larger files will require a code change in libcloud.
   
   **Details**
   The Azure Storage driver's implementation of [upload_object_via_stream](https://github.com/apache/libcloud/blob/6dca82e649456b42d23f439854d3dc807c806abf/libcloud/storage/drivers/azure_blobs.py#L822-L841)
delegates to [_put_object](https://github.com/apache/libcloud/blob/6dca82e649456b42d23f439854d3dc807c806abf/libcloud/storage/drivers/azure_blobs.py#L945-L951)
which calls through to the generic [_upload_object](https://github.com/apache/libcloud/blob/6dca82e649456b42d23f439854d3dc807c806abf/libcloud/storage/base.py#L584-L592)
which does a single PUT request to the storage backend. Given that [we're using Azure Storage
API version 2016-05-31](https://github.com/apache/libcloud/blob/6dca82e649456b42d23f439854d3dc807c806abf/libcloud/storage/drivers/azure_blobs.py#L180),
according to the [Put Blob documentation](https://docs.microsoft.com/en-us/rest/api/storageservices/put-blob#remarks),
the maximum file size that can be uploaded in one Put Blob request is 256 MB. As such, to
support uploading files larger than 256 MB, the Azure Storage driver would have to implement
chunked blob upload via [Put Block](https://docs.microsoft.com/en-us/rest/api/storageservices/put-block)
and [Put Block List](https://docs.microsoft.com/en-us/rest/api/storageservices/put-block-list).
It looks like the Azure Storage driver used to implement the chunked blob upload flow (e.g.
see [24f34c9](https://github.com/apache/libcloud/blob/24f34c99c9440523a53e940a346bced551281953/libcloud/storage/drivers/azure_blobs.py#L732-L788)).
However, since [6e0040d](https://github.com/apache/libcloud/commit/6e0040d8904cacb5dbe88309e9051be08cdc59f9)
the driver doesn't have support for chunked blob upload anymore.
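   To sketch what that code change could look like: the Put Block flow uploads the stream as a sequence of uniquely identified blocks and then commits them with a single Put Block List request whose XML body names each block ID in order. The helper names and the 4 MB block size below are my assumptions for illustration, not existing libcloud API:

   ```python
   import base64
   from typing import BinaryIO, Iterator, List, Tuple

   # Assumption: 4 MB blocks, comfortably under the per-block limit
   # of Azure Storage API version 2016-05-31.
   DEFAULT_BLOCK_SIZE = 4 * 1024 * 1024

   def iter_blocks(stream: BinaryIO,
                   block_size: int = DEFAULT_BLOCK_SIZE) -> Iterator[Tuple[str, bytes]]:
       """Split a stream into (block_id, chunk) pairs for Put Block requests.

       Azure requires block IDs to be base64-encoded and of equal length
       within a blob, hence the zero-padded counter.
       """
       index = 0
       while True:
           chunk = stream.read(block_size)
           if not chunk:
               break
           block_id = base64.b64encode(b'block-%010d' % index).decode('ascii')
           yield block_id, chunk
           index += 1

   def block_list_body(block_ids: List[str]) -> str:
       """Build the XML body for Put Block List, committing the
       previously uploaded (uncommitted) blocks in order."""
       blocks = ''.join('<Uncommitted>%s</Uncommitted>' % b for b in block_ids)
       return ("<?xml version='1.0' encoding='utf-8'?>"
               "<BlockList>%s</BlockList>" % blocks)
   ```

   Each `(block_id, chunk)` pair would then be sent via `PUT <blob-url>?comp=block&blockid=<block_id>`, and the final XML body via `PUT <blob-url>?comp=blocklist`.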
   
   I have encountered this limitation in several other projects (e.g. https://github.com/ascoderu/opwen-cloudserver/issues/219),
so I will try to find some time to work on a fix.
   
   **Work-around**
   If you need to upload large files to Azure Storage via libcloud right now, before the fix
mentioned above is implemented, I would suggest the following: the [libcloud
S3 driver currently implements chunked upload](https://github.com/apache/libcloud/blob/6dca82e649456b42d23f439854d3dc807c806abf/libcloud/storage/drivers/s3.py#L688-L694),
so you could try deploying [MinIO](https://github.com/minio/minio) as a [gateway for Azure
Storage](https://docs.min.io/docs/minio-gateway-for-azure.html) and using the libcloud S3
driver to talk to the MinIO frontend, which in turn communicates efficiently with the Azure
Storage backend. For MinIO [947bc8c](https://github.com/minio/minio/commit/947bc8c7d3b8ad98cdbb6ce0f8dea155df16aadf)
and later, this approach should work for all types of cloud-based Azure Storage accounts (e.g.
Storage, StorageV2, BlobStorage) as well as Azurite and Azure IoT Edge Storage. Once chunked
blob upload is fixed in libcloud, you should be able to remove the MinIO indirection and switch
to libcloud's Azure Storage driver with no additional code changes required.
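   For reference, pointing the libcloud S3 driver at such a MinIO gateway only changes how the driver is constructed; the rest of the storage API is unchanged. The host, port, and non-TLS setting below are assumptions for a local deployment, not values from this issue:

   ```python
   def make_minio_driver(access_key, secret_key, host='localhost', port=9000):
       """Connect the libcloud S3 driver to a MinIO frontend.

       The defaults assume MinIO is running locally without TLS;
       substitute your own deployment's endpoint and credentials.
       """
       from libcloud.storage.providers import get_driver
       from libcloud.storage.types import Provider

       cls = get_driver(Provider.S3)
       return cls(access_key, secret_key, secure=False, host=host, port=port)
   ```

   Uploads then go through the S3 driver's chunked path as usual, e.g. `driver.upload_object_via_stream(iterator=stream, container=container, object_name=name)`.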

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services
