oodt-dev mailing list archives

From Susana Sanchez Exposito <...@iaa.es>
Subject Re: help: OODT component for distributing data through WAN
Date Fri, 17 Feb 2017 10:19:35 GMT
Thanks Tom,

From your answer I gather that I can use the OODT File Manager component to
deliver large data products (from GBs to TBs) to users located remotely
(i.e., users who are globally distributed).

I still have some doubts; let me add them between your lines:

2017-02-16 13:18 GMT+01:00 Tom Barber <magicaltrout@apache.org>:

> Hi Susana
> Welcome to the OODT list, this is indeed the correct place to ask about
> OODT related stuff.
> How you deliver data, I guess often depends on your requirements, but OODT
> was certainly designed with that type of thing in mind.
> The file manager is very flexible in terms of storage and is a portal
> allowing for the ingestion of data products to a file store, this could be
> a folder on a disk, nfs mount or something else, a HDFS cluster, S3 or
> something completely different. So the system will ingest data into the

Do you mean that I can connect the File Manager to the users' file
stores, so that when the File Manager stores the data products it is, in
practice, delivering them to the users?

Given that the users' file stores would be located remotely (possibly across
high-latency networks), I would be worried about the performance of this

In addition, with this option I would not be able to select/filter which
data products are delivered to each user, based on the metadata of the

> file manager either through an API call, a crawling service or something
> else. During this operation metadata from the ingested files is then
> extracted, for example if this were an image, you could extract EXIF data,
> GEO data etc and then store that in the catalogue alongside the ingested
> product.
> There is a basic UI for showing ingested products called Ops UI, but in
> reality for deployment as a service there would be a web interface written
> to integrate into whatever application or portal you are already using,
> which would then allow users to search for products via metadata or keys in
> the ingested data. From that search users could then do a range of things
> depending on what your requirements are, the simplest being clicking a link
> to download the product. But of course it could be triggering a workflow,
> copying the file somewhere else or whatever.
> Behind the File Manager is also the workflow manager, so another scenario
> might be to ingest files into the file manager, which in turn triggers a
> workflow which then distributes the ingest files to people automatically,
> or performs some post processing etc.

OK. So I would need to implement this workflow in such a way that 1) it
selects/filters which data products will be delivered to each user, and 2)
it sends the data products to the remote users by means of efficient
data-movement tools (e.g. GridFTP).
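
As a rough illustration of those two steps, here is a minimal Python sketch.
It is not OODT code and does not use any OODT API: the Product and
Subscription classes, the metadata keys and the destination URLs are all
hypothetical stand-ins. It assumes the workflow task can obtain each
product's path and metadata, and that the globus-url-copy GridFTP client
(with its -p option for parallel streams) is available on the sending host.
The command is only built, not executed.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Product:
    """Hypothetical stand-in for a catalogued product and its metadata."""
    path: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Subscription:
    """Hypothetical record of what a remote user wants and where to send it."""
    user: str
    destination: str  # URL prefix on the user's side, e.g. gsiftp://...
    criteria: dict    # metadata key -> glob pattern the value must match


def matches(product, criteria):
    """Return True if every criterion glob-matches the product's metadata."""
    return all(
        key in product.metadata
        and fnmatchcase(str(product.metadata[key]), pattern)
        for key, pattern in criteria.items()
    )


def plan_transfers(products, subscriptions):
    """Step 1: pair each user with the products whose metadata match."""
    return [
        (sub, product)
        for sub in subscriptions
        for product in products
        if matches(product, sub.criteria)
    ]


def transfer_command(sub, product, streams=4):
    """Step 2: build (but do not run) a GridFTP command for one product."""
    name = product.path.rsplit("/", 1)[-1]
    return [
        "globus-url-copy",
        "-p", str(streams),           # number of parallel streams (assumed)
        "file://" + product.path,     # product.path is an absolute local path
        sub.destination.rstrip("/") + "/" + name,
    ]
```

Each planned pair could then be handed to something like subprocess.run()
inside the workflow task; retries, credentials and checksum verification are
left out of this sketch.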

> Let us know if you have any further questions.

Thanks again!


> Tom
> On Thu, Feb 16, 2017 at 7:56 AM, Susana Sanchez <susanasanche@gmail.com>
> wrote:
> > Dear all,
> >
> > I am trying to find out which of the components of Apache OODT is the
> > most suitable for delivering large data products to users located
> > remotely (users distributed on a WAN network)
> >
> > I have read that the CAS File Manager has the capability to archive a
> > file to a remote location, so it could be a candidate. However, it seems
> > this component was not designed for this purpose, so it is not
> > recommended for distributing data through a WAN network. Is that correct?
> >
> > I think the components that I am looking for are the Grid product
> > services (Product server/client, Profile server/client, Query
> > server/client). Am I right?
> > If not, I would like to ask you to provide some information about which
> > OODT components I need to distribute data products through international
> > networks.
> >
> > I was not sure if this is the correct email list to send this kind of
> > question. If not, sorry about that, and it would be appreciated if you
> > could forward it to the appropriate email address.
> >
> > Thanks in advance,
> > Susana.
> >

Susana Sánchez Expósito

Instituto de Astrofísica de Andalucía - CSIC
Glorieta de la Astronomía, s/n. E-18008, Granada
Tel:(+34) 958 121 311 / (+34) 958 230 635
Fax:(+34) 958 814 530
e-mail: sse@iaa.es
