airavata-dev mailing list archives

From Ameya Advankar <aadva...@umail.iu.edu>
Subject [#Spring17-Airavata-Courses] Reliable file uploads for Science Gateway Portals
Date Mon, 08 May 2017 04:23:14 GMT
Hi Airavata Developers,

I have been exploring and evaluating tus.io as a solution to the following
problems encountered in the file upload functionality of Science Gateway
Portals:

*1. Unreliable HTTP Connection*


Since file uploads in Science Gateway Portals are plain HTTP uploads, they
rely on a continuous internet connection on the client machine. If there is
a network disruption or connectivity issue, a traditional file upload fails
and the user has to retry it manually, starting again from the beginning.
For large files, e.g. a few hundred megabytes or several gigabytes, this
wastes both bandwidth and time.


*2. Space constraints on the Server*


An uploaded file is usually staged somewhere on the server until it is
picked up for further processing. Depending on the process queuing time, it
may stay there for a considerable while. If users of a Portal upload
multiple large files at the same time, the host machine may run out of
space, which could have adverse effects on the performance of the Portal.


There could also be cases in which multiple Science Gateway Portals are
hosted on the same web server. In such a setting, if one Portal fills up
the server's disk space with multiple large files, it may affect the
performance of the other Portals residing on that web server as well. In
short, the file-upload functionality should not affect Portal performance.



We could use a JavaScript library such as Fine Uploader
<https://fineuploader.com/>, Resumable.js <http://www.resumablejs.com> or
flow.js <https://github.com/flowjs/flow.js>, each of which provides a simple
client-side library for resuming file uploads after a disruption. Fine
Uploader seems to be the strongest of these, judging by community usage and
contributions on GitHub. For each of these client-side libraries, we would
have to incorporate the corresponding server-side code to handle resumable
uploads.

However, their JavaScript implementations are all different: each library
uses its own set of parameters and request headers to achieve resumable
uploads, so the server implementation we adopt would be tightly coupled to
the library we choose. This would introduce a dependency on that library.
To remove the library-based dependency, we can use a client-server
implementation of the tus.io protocol. Using a protocol reduces the
library-level dependency to a protocol-level dependency.

Also, since a tus.io client can be implemented in any language, we could
have multiple types of Gateway Portals, such as web, desktop and native,
all connecting to the same tus.io based server.
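
To make the resume behaviour concrete, here is a minimal sketch of the
offset-based flow that the tus 1.0.0 protocol describes: the client asks the
server for the current Upload-Offset (a HEAD request) and continues sending
from there (PATCH requests). The header names come from the protocol; the
FakeTusServer class, the upload function and the chunk size are hypothetical
stand-ins so that the example runs without a network, not part of any tus
library.

```python
from dataclasses import dataclass, field

TUS_VERSION = "1.0.0"  # value of the Tus-Resumable header in tus 1.0.0


@dataclass
class FakeTusServer:
    """In-memory stand-in for a tus server (hypothetical, illustration only)."""
    drop_on_call: int = 0  # which PATCH call should fail; 0 means never
    stored: bytearray = field(default_factory=bytearray)
    patch_calls: int = 0

    def head(self) -> dict:
        # A tus server answers HEAD with the number of bytes it already has.
        return {"Tus-Resumable": TUS_VERSION,
                "Upload-Offset": str(len(self.stored))}

    def patch(self, headers: dict, body: bytes) -> dict:
        # The offset sent by the client must match what the server has stored.
        if int(headers["Upload-Offset"]) != len(self.stored):
            raise ValueError("offset mismatch")
        self.patch_calls += 1
        if self.patch_calls == self.drop_on_call:
            # Simulate a connection drop: only part of the chunk arrived.
            self.stored += body[: len(body) // 2]
            raise ConnectionError("simulated network drop")
        self.stored += body
        return {"Tus-Resumable": TUS_VERSION,
                "Upload-Offset": str(len(self.stored))}


def upload(server: FakeTusServer, data: bytes,
           chunk_size: int = 4, max_retries: int = 5) -> int:
    """Upload data in chunks, resuming after drops. Returns the retry count."""
    retries = 0
    while True:
        # Ask the server where to resume instead of restarting from zero.
        offset = int(server.head()["Upload-Offset"])
        if offset >= len(data):
            return retries
        try:
            while offset < len(data):
                chunk = data[offset: offset + chunk_size]
                headers = {
                    "Tus-Resumable": TUS_VERSION,
                    "Upload-Offset": str(offset),
                    "Content-Type": "application/offset+octet-stream",
                }
                offset = int(server.patch(headers, chunk)["Upload-Offset"])
        except ConnectionError:
            retries += 1
            if retries > max_retries:
                raise


server = FakeTusServer(drop_on_call=2)
retry_count = upload(server, b"0123456789abcdef")
```

Because the resume point always comes from the server, bytes that arrived
before the drop are not sent again, and any client that speaks these headers
can resume an upload regardless of the language it is written in.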

The second problem, the space constraint, can be solved by separating the
file-upload process into a micro-service located on a separate host. Each
Portal could have its own micro-service instance, so that the file-upload
functionality does not hamper Portal performance. The micro-service will
also have to be secured, and tus.io allows us to do this via the tus.io
hooks <https://github.com/tus/tusd/blob/master/docs/hooks.md> feature, by
implementing the auth code in the *pre-create* hook.
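
As a rough illustration of what auth code in a *pre-create* hook might look
like, the sketch below assumes a file-based tusd hook that receives a JSON
description of the upload on standard input and rejects the upload by
exiting with a non-zero code. The payload shape shown in the comments, the
"token" metadata key and the VALID_TOKENS set are assumptions made for the
example; the exact payload and rejection mechanism differ between tusd
versions, so the hooks documentation should be checked before relying on it.

```python
import hmac
import json

# Hypothetical token store; a real Portal would query its own auth service.
VALID_TOKENS = {"portal-a-secret-token"}


def is_authorized(payload: dict) -> bool:
    """Return True if the upload metadata carries a known token."""
    # Assumed payload shape: {"Upload": {"MetaData": {"token": "..."}}}
    upload = payload.get("Upload")
    if not isinstance(upload, dict):
        return False
    metadata = upload.get("MetaData")
    if not isinstance(metadata, dict):
        return False
    token = metadata.get("token")
    if not isinstance(token, str):
        return False
    # Constant-time comparison against each known token.
    return any(
        hmac.compare_digest(token.encode(), known.encode())
        for known in VALID_TOKENS
    )


def run_hook(stdin_text: str) -> int:
    """Return the hook's exit code: 0 accepts the upload, 1 rejects it."""
    try:
        payload = json.loads(stdin_text)
    except json.JSONDecodeError:
        return 1
    if not isinstance(payload, dict):
        return 1
    return 0 if is_authorized(payload) else 1
```

In an actual hook script, the last step would be something like
`sys.exit(run_hook(sys.stdin.read()))`, with the client sending the token as
upload metadata when it creates the upload.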

Thus, tus.io seems to be a good, flexible and maintainable long-term
solution for implementing file-upload functionality in Science Gateway
Portals.

Thanks & Regards,
Ameya Advankar
Masters in Computer Science,
Indiana University Bloomington
