airavata-dev mailing list archives

From "Miller, Mark" <mmil...@sdsc.edu>
Subject RE: [#Spring17-Airavata-Courses] Reliable file uploads for Science Gateway Portals
Date Mon, 08 May 2017 13:08:41 GMT
Hi Ameya,

We have thought about this at CIPRES as well.
Terri Schwartz in our group has implemented a Java version of resumable file transfer in tus.io,
and I have mentioned this as a possible SciGaP/Airavata service. I would recommend you
talk with Terri and see if there is synergy between your thinking and what she has done.

Mark

From: Ameya Advankar [mailto:aadvanka@umail.iu.edu]
Sent: Sunday, May 07, 2017 9:23 PM
To: dev@airavata.apache.org
Subject: [#Spring17-Airavata-Courses] Reliable file uploads for Science Gateway Portals

Hi Airavata Developers,

I have been exploring and evaluating tus.io<http://tus.io/>  as a solution to the following
problems encountered in Science Gateway Portals related to the file upload functionality:

1. Unreliable HTTP Connection

Since file uploads in Science Gateway Portals are HTTP uploads, they rely heavily on a
continuous internet connection on the client machine. A network disruption or connectivity
issue will cause a traditional file upload to fail. As a result, users may have to retry
the upload manually and wait for it to succeed. If the files are large, i.e. a few hundred
megabytes or several gigabytes, every failed attempt wastes bandwidth and time.

2. Space constraints on the Server

The file being uploaded is usually staged somewhere on the server until it is picked up
for further processing. It may stay there for a considerable amount of time, depending on
the process queuing time. If users of a Portal upload multiple large files at the same
time, the host machine may run out of space, and this could have adverse effects on the
performance of the Portal.

Also, there could be cases in which multiple Science Gateway Portals are hosted on the same
web server. In such a setting, if a particular Portal fills up server space with multiple
large files, it may affect the performance of other Portals residing on that web server as
well. In short, the file-upload functionality should not affect the Portal performance.


We could use some JavaScript libraries such as Fine Uploader<https://fineuploader.com/>,
Resumable.js<http://www.resumablejs.com>  or flow.js<https://github.com/flowjs/flow.js>
which provide a simple client side library for resumable file uploads in case of disruptions.
Fine Uploader seems to be the best among these libraries based on the community usage and
contribution on Github. For each of these client side libraries, we have to incorporate the
corresponding server side code to handle resumable uploads.

However, since their JavaScript implementations are unique, with each library using its own
set of parameters and request headers to achieve resumable functionality, the server
implementation which we adopt will be tightly coupled with the library we choose. This will
introduce a dependency on the library.
To remove the library-based dependency, we can use a client-server implementation of the
tus.io<http://tus.io> protocol. Using a protocol will reduce the library-level dependency
to a protocol-level dependency.
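
To make the resume behaviour concrete, here is a minimal sketch of the tus 1.0.0 core flow,
modelled entirely in memory with no network. The header names and status codes (Upload-Offset,
Upload-Length, Tus-Resumable, 204, 409, 415) come from the tus specification; the class
InMemoryTusServer, the function upload and the fail_after_chunks parameter are hypothetical
names used only for illustration, not part of any tus library.

```python
TUS_VERSION = "1.0.0"
OFFSET_CONTENT_TYPE = "application/offset+octet-stream"


class InMemoryTusServer:
    """Hypothetical stand-in for a tus server; keeps upload bytes in memory."""

    def __init__(self):
        self.uploads = {}  # upload id -> bytearray received so far
        self.lengths = {}  # upload id -> declared Upload-Length

    def create(self, upload_length):
        """Models the creation step (POST with Upload-Length); returns an id."""
        upload_id = str(len(self.uploads) + 1)
        self.uploads[upload_id] = bytearray()
        self.lengths[upload_id] = upload_length
        return upload_id

    def head(self, upload_id):
        """Models HEAD: reports how many bytes the server already holds."""
        return {
            "Tus-Resumable": TUS_VERSION,
            "Upload-Offset": str(len(self.uploads[upload_id])),
            "Upload-Length": str(self.lengths[upload_id]),
        }

    def patch(self, upload_id, headers, body):
        """Models PATCH: appends body only if the client's offset matches."""
        if headers.get("Content-Type") != OFFSET_CONTENT_TYPE:
            return 415, {}
        if int(headers["Upload-Offset"]) != len(self.uploads[upload_id]):
            return 409, {}
        self.uploads[upload_id].extend(body)
        return 204, {"Upload-Offset": str(len(self.uploads[upload_id]))}


def upload(server, upload_id, data, chunk_size, fail_after_chunks=None):
    """Send data starting from the server's current offset.

    fail_after_chunks simulates a network disruption after that many chunks.
    Returns the number of bytes sent during this call.
    """
    # Ask the server where to resume instead of starting from byte 0.
    offset = int(server.head(upload_id)["Upload-Offset"])
    sent = 0
    chunks = 0
    while offset < len(data):
        if fail_after_chunks is not None and chunks >= fail_after_chunks:
            raise ConnectionError("simulated network disruption")
        chunk = data[offset:offset + chunk_size]
        headers = {
            "Tus-Resumable": TUS_VERSION,
            "Content-Type": OFFSET_CONTENT_TYPE,
            "Upload-Offset": str(offset),
        }
        status, response = server.patch(upload_id, headers, chunk)
        if status != 204:
            raise RuntimeError("unexpected status %d" % status)
        offset = int(response["Upload-Offset"])
        sent += len(chunk)
        chunks += 1
    return sent
```

If the first call to upload is interrupted, a second call with the same upload id sends only
the bytes the server does not yet hold, which is the bandwidth saving described under
problem 1.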

Also, since a tus.io<http://tus.io> client can be implemented in any language, we could
have multiple types of Gateway Portals, such as Web, Desktop and native, which connect to
the same tus.io<http://tus.io> based server.

The second problem, the space constraint, can be solved by separating the file-upload
process into a micro-service located on a separate host. Each Portal could have its own
micro-service instance, and this way the file-upload functionality will not hamper the
Portal performance. Further, the micro-service will have to be secured, and
tus.io<http://tus.io> allows us to do this via the
tus.io hooks<https://github.com/tus/tusd/blob/master/docs/hooks.md>
feature by implementing the auth code in the pre-create hook.
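
A rough sketch of what such a pre-create hook might look like is below. It assumes tusd's
file-based hooks, which pass a JSON description of the upload on standard input and treat a
non-zero exit status from pre-create as a rejection. The exact JSON layout differs between
tusd versions, so the "Upload" and "MetaData" keys used here are an assumption to be checked
against the hooks documentation. The "token" metadata field, VALID_TOKENS, is_authorized and
main are hypothetical names for illustration only.

```python
import json
import sys

# Hypothetical: a real service would validate the token against the Portal's
# auth system instead of a hard-coded set.
VALID_TOKENS = {"example-token"}


def is_authorized(payload, valid_tokens):
    """Return True if the upload metadata carries a known token.

    Assumption: client metadata sits under a "MetaData" key, either at the
    top level of the payload or nested under "Upload".
    """
    upload_info = payload.get("Upload", payload)
    metadata = upload_info.get("MetaData") or {}
    return metadata.get("token") in valid_tokens


def main(stdin=sys.stdin):
    """Read the hook payload and return the exit status for the hook."""
    payload = json.load(stdin)
    return 0 if is_authorized(payload, VALID_TOKENS) else 1

# When installed as the hook script, the final line would be:
#     sys.exit(main())
```

The client would then send the token as upload metadata when creating the upload, so an
upload without a valid token is refused before any file data is stored.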

Thus, tus.io<http://tus.io> seems to be a good, flexible and maintainable long-term
solution for implementing file-upload functionality in Science Gateway Portals.

Thanks & Regards,
Ameya Advankar
Masters in Computer Science,
Indiana University Bloomington