incubator-deltacloud-dev mailing list archives

From "" <>
Subject Re: Blob creation
Date Thu, 18 Aug 2011 08:07:57 GMT
Hi Chris, (inline)

On 17/08/11 21:12, David Lutterkort wrote:
> On Wed, 2011-08-17 at 09:57 -0400, Chris Lalancette wrote:
>> Hey Marios,
>>       I know this is several months out of date, but I was just doing some
>> testing on the blob creation stuff and noticing that my libdeltacloud tests
>> were failing.  I traced it down to the fact that the blob_id parameter changed
>> from param[:blob_id] to param[:blob] when you added the streaming stuff to
>> blobs.

Yes, that's right. Initially we had one operation for creating blobs:

POST /api/buckets/:bucket

and this accepted (among others) the 'blob_id' parameter to define the 
name of the blob.

Then, in order to implement streaming PUT through deltacloud I added:

PUT /api/buckets/:bucket/:blob

The name change for the parameter was, I can only guess, an attempt to 
maintain consistency (i.e. 'blob' over 'blob_id'), though in hindsight 
it was not really necessary. Your suggested patch:

post "#{Sinatra::UrlForHelper::DEFAULT_URI_PREFIX}/buckets/:bucket" do
    bucket_id = params[:bucket]
-  blob_id = params['blob']
+  blob_id = params['blob'] || params['blob_id']

seems fine to me in that it won't break anything. If it maintains 
compatibility with your stuff then I personally have no objection to 
making this addition. More on PUT vs POST below.
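
To illustrate what the fallback in that patch does, here is a minimal sketch outside of Sinatra. The helper name blob_name_from and the plain hashes standing in for params are only for illustration and are not part of deltacloud:

```ruby
# Hypothetical helper, not deltacloud code: pick the blob name from the
# newer 'blob' parameter, falling back to the older 'blob_id' one.
def blob_name_from(params)
  params['blob'] || params['blob_id']
end

old_client = { 'blob_id' => 'photo.jpg' }                # e.g. a client still sending blob_id
new_client = { 'blob' => 'photo.jpg' }                   # a client sending blob
both       = { 'blob' => 'new', 'blob_id' => 'old' }     # 'blob' wins when both are present

puts blob_name_from(old_client)
puts blob_name_from(new_client)
puts blob_name_from(both)
```

One thing to keep in mind: an empty string is truthy in Ruby, so a request that sends an empty 'blob' field together with a 'blob_id' would still end up with the empty value.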

> I think it's another case where the code does something special for the
> HTML UI - the official API for creating a new blob is
> PUT /api/buckets/:bucket/:blob; looking at this now, it seems strange
> that we have two different ways to create blobs, and I am wondering if
> we shouldn't drop the PUT, and only use POST for everything.

Yes, we have two methods for creating blobs: POST and PUT.

The POST method is non-streaming:

client ---TEMP_FILE---> deltacloud ---STREAM---> provider

i.e., the client sends the blob to deltacloud, which receives the entire 
request and creates a temp_file for the blob data, and then streams this 
to the provider.

The PUT operation is streaming:

client ---STREAM---> deltacloud ---STREAM---> provider

i.e., the client sends the blob to deltacloud, which does not wait to 
receive the entire request and instead starts streaming the blob data to 
the provider as this is received.
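
The difference between the two flows can be sketched roughly as follows. This is not the deltacloud implementation: StringIO objects stand in for the client request and the provider connection, and the names buffered_copy, streaming_copy and CHUNK_SIZE are made up for the example.

```ruby
require 'stringio'
require 'tempfile'

CHUNK_SIZE = 4 # deliberately tiny so the example produces several chunks

# POST-style: spool the whole body to a temp file, then send it on.
def buffered_copy(client_io, provider_io)
  Tempfile.create('blob') do |tmp|
    tmp.write(client_io.read)     # receive the entire request first
    tmp.rewind
    provider_io.write(tmp.read)   # only now does the provider see any data
  end
end

# PUT-style: forward each chunk to the provider as soon as it is read.
# Returns the number of chunks forwarded.
def streaming_copy(client_io, provider_io)
  chunks = 0
  while (chunk = client_io.read(CHUNK_SIZE))
    provider_io.write(chunk)
    chunks += 1
  end
  chunks
end

provider = StringIO.new
buffered_copy(StringIO.new('hello world'), provider)
puts provider.string

provider = StringIO.new
puts streaming_copy(StringIO.new('hello world'), provider)
puts provider.string
```

Both variants deliver the same bytes; the streaming one never needs to hold more than one chunk at a time, which is the point when blobs get large.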

Now, in order to create a blob on a given cloud provider service, you 
invariably must specify the content_length of the blob. For a PUT 
operation, the content_length is exactly as defined by the sending 
client in the PUT to deltacloud. Thus, we can take that content_length 
and start sending the data to the provider as we are receiving it.

However, for a POST operation, the content_length that the client sends 
to deltacloud is not the content_length of the blob: the 
multipart/form-data body also carries the boundary lines and part 
headers, and their size varies with the sending client. It became very 
messy/difficult to try and parse the boundary and 'guess' the content 
length of the blob in order to start streaming, which is why we decided 
to go with PUT. In fact, the cloud providers themselves (EC2, Rackspace, 
Azure) use PUT operations to create blobs (with POST supported as an 
alternative).
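
As a small illustration of the content length problem, the sketch below builds a minimal multipart/form-data body by hand with two different boundaries. The helper multipart_body, the field name 'blob_data', the file name and both boundary strings are assumptions chosen for the example, not what any particular client or deltacloud actually sends.

```ruby
# Hypothetical helper: assemble a single-part multipart/form-data body.
def multipart_body(boundary, field, filename, data)
  [
    "--#{boundary}",
    "Content-Disposition: form-data; name=\"#{field}\"; filename=\"#{filename}\"",
    'Content-Type: application/octet-stream',
    '',
    data,
    "--#{boundary}--",
    ''
  ].join("\r\n")
end

blob = 'x' * 100 # the actual blob data: 100 bytes

short_boundary = 'abc'
long_boundary  = '----ExampleFormBoundary0123456789'

short_body = multipart_body(short_boundary, 'blob_data', 'b.bin', blob)
long_body  = multipart_body(long_boundary, 'blob_data', 'b.bin', blob)

puts blob.bytesize       # what the provider needs as the blob length
puts short_body.bytesize # what a client with a short boundary reports
puts long_body.bytesize  # what a client with a long boundary reports
```

The same 100-byte blob produces two different request lengths, and neither equals the blob length. With PUT the request body is the blob itself, so the client's Content-Length can be passed straight on to the provider.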

Thus, we have both POST (non streaming, only to support HTML forms and 
the web browser interface) and PUT (streaming). If we want to remove one 
of those methods then I would definitely vote to remove POST since imho 
the streaming functionality for creating blobs is absolutely necessary 
for 'real world' use. Forcing deltacloud to buffer all blob objects 
before sending them on to the provider is obviously not very useful.


> David
