oodt-dev mailing list archives

From Etienne Koen <etien...@scs-space.com>
Subject RE: Remote data transfer
Date Wed, 20 Aug 2014 07:34:38 GMT
Hi Chris and Tom,

We are still busy at SKA refining the first test plan for data delivery and archiving.
A few requirements came out of yesterday's meeting that will be added to the tests, and
I need to explore them using OODT as well:

- parallel transfer
- transfer with/without a checksum calculation

Could you point me again to some documentation or a tutorial that describes how to use
both of these capabilities?
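
To make the two test requirements concrete, here is a minimal sketch that is independent of OODT. It assumes a local file copy as a stand-in for a real remote transfer, and the names `sha256_of`, `transfer`, and `transfer_all` are hypothetical, not OODT APIs. It only illustrates running transfers in parallel and switching checksum verification on or off.

```python
import hashlib
import shutil
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large products do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer(src: Path, dest_dir: Path, verify: bool = True) -> Path:
    """Copy one file; optionally compare source and destination checksums."""
    dest = dest_dir / src.name
    shutil.copyfile(src, dest)  # assumption: stand-in for a remote transfer
    if verify and sha256_of(src) != sha256_of(dest):
        raise IOError(f"checksum mismatch for {src.name}")
    return dest


def transfer_all(sources, dest_dir: Path, workers: int = 4, verify: bool = True):
    """Run several transfers in parallel with a thread pool."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs
        return list(pool.map(lambda s: transfer(s, dest_dir, verify), sources))
```

Timing `transfer_all` once with `verify=True` and once with `verify=False` on the same files would give a rough measure of what the checksum step costs.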

btw, I got the pushpull component of OODT running on the CHPC with some elementary tests!
:-)

I will forward the documentation to you once it is completed.

Regards
Etienne
________________________________________
From: Mattmann, Chris A (3980) [chris.a.mattmann@jpl.nasa.gov]
Sent: Monday, August 18, 2014 11:27 PM
To: dev@oodt.apache.org
Cc: Etienne Koen; Thomas Bennett; cschollar@ska.ac.za
Subject: Re: Remote data transfer

Hi Tom,

Great question!

By default, all of the protocols supported in the cas-protocol
module of Apache OODT:

http://svn.apache.org/repos/asf/oodt/trunk/protocol/


* ftp
* http(s)
* imaps
* sftp

Note that there is an Amazon S3 "data transfer" module in
the File Manager, but not explicitly in Push Pull. It would
hopefully not be too difficult (and a welcome patch!) to
incorporate this functionality into the cas-protocol layer.

There are also these specific Push Pull plugins:

https://cwiki.apache.org/confluence/display/OODT/OODT+Push+Pull+Plugins


Note the Push Pull plugins in the wiki page above leverage LGPL libraries
and I wasn't able to find a replacement for them. We aren't officially
"recommending" them as Apache OODT PMC members, but they are useful
FTP plugins if you can't get the existing protocol-ftp plugin to work.
However, you use them knowingly, by explicitly downloading these plugins
and building them into your OODT Push Pull installation.

I would love it if someone were to find ALv2-compatible versions of the
above plugins so we could manage them in our code base, but that hasn't
been done yet.



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattmann@nasa.gov
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Thomas Bennett <lmzxq.tom@gmail.com>
Reply-To: "dev@oodt.apache.org" <dev@oodt.apache.org>
Date: Monday, August 18, 2014 11:39 AM
To: "dev@oodt.apache.org" <dev@oodt.apache.org>
Cc: Etienne Koen <etiennek@scs-space.com>, Thomas Bennett
<thomas@ska.ac.za>, "cschollar@ska.ac.za" <cschollar@ska.ac.za>
Subject: Re: Remote data transfer

>Thanks Chris.
>
>Just to add to the conversation - what protocols are currently supported?
>
>I've seen scp, FTP and http. Also Amazon S3?
>
>On Monday, August 18, 2014, Mattmann, Chris A (3980) <
>chris.a.mattmann@jpl.nasa.gov> wrote:
>
>> Hi Etienne,
>>
>> Thanks. The Push Pull system is a way to pull down remote or ancillary
>> files usually *ahead* of file manager ingestion, since the crawler
>> really doesn't have a protocol layer to mitigate remote content.
>> The typical use case if you use Push Pull is:
>>
>> 1. Model remote/ancillary files on other sites
>> 2. Download them with push pull into a "staging area"
>> 3. Crawl and ingest with crawler, as if the content were
>> local to start out with.
>>
>> There is a Push Pull users guide here, it's a bit old but should
>> explain it:
>>
>>
>> http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/documentation/
>>
>>
>> Cheers,
>> Chris
>>
>>
>> -----Original Message-----
>> From: Etienne Koen <etiennek@scs-space.com>
>> Date: Monday, August 18, 2014 2:36 AM
>> To: Thomas Bennett <thomas@ska.ac.za>
>> Cc: Chris Mattmann <Chris.A.Mattmann@jpl.nasa.gov>,
>> "cschollar@ska.ac.za" <cschollar@ska.ac.za>,
>> "dev@oodt.apache.org" <dev@oodt.apache.org>
>> Subject: RE: Remote data transfer
>>
>> >Hi Tomas and all,
>> >
>> >I came across the push/pull tutorial at
>> >
>> >https://cwiki.apache.org/confluence/display/OODT/OODT+Push-Pull+User+Guide
>> >
>> >Would this guide be more appropriate for downloading files that have been
>> >archived by the file manager, and does it represent a typical user scenario?
>> >
>> >Regards
>> >Etienne
>> >________________________________________
>> >From: Thomas Bennett [thomas@ska.ac.za]
>> >Sent: Friday, August 15, 2014 9:54 AM
>> >To: Etienne Koen
>> >Cc: Mattmann, Chris A (3980); cschollar@ska.ac.za; dev@oodt.apache.org
>> >Subject: Re: Remote data transfer
>> >
>> >Hi Etienne,
>> >
>> >There are various methods you can use to download the data.
>> >
>> >See this page:
>> >
>> >https://cwiki.apache.org/confluence/display/OODT/Getting+products+from+a+remote+FileManager
>> >
>> >Some great work has recently been done on a REST API. It exists on
>> >svn trunk, but I don't think it has been released yet.
>> >
>> >https://cwiki.apache.org/confluence/display/OODT/File+Manager+REST+API
>> >
>> >To use these components you will need to deploy Tomcat or Jetty.
>> >
>> >Shout if you need some help.
>> >
>> >Cheers,
>> >Tom
>> >
>> >
>> >
>> >
>> >On Thu, Aug 14, 2014 at 4:31 PM, Etienne Koen
>> ><etiennek@scs-space.com> wrote:
>> >Hi Chris and Tom,
>> >
>> >As I mentioned in my previous email, I have managed to ingest a file
>> >to a remote location using the filemgr-client. I am also able to query
>> >the information remotely, for example with the query_tool:
>> >
>> >$ ./query_tool --url http://192.168.0.10:9000 --lucene -query 'CAS.ProductName:blah.txt'
>> >
>> >978ca28e-23b0-11e4-87fb-4f1c29029486
>> >
>> >What component would I use for searching and downloading the actual
>> >product from the remote file manager? Is the filemgr-client or
>> >query_tool capable of doing this?
>> >
>> >Are there any tutorials you would recommend?
>> >
>> >Thanks
>> >Etienne
>> >
>> >________________________________________
>> >From: Mattmann, Chris A (3980) [chris.a.mattmann@jpl.nasa.gov]
>> >Sent: Wednesday, August 13, 2014 6:04 PM
>> >To: Etienne Koen; Thomas Bennett
>> >Cc: cschollar@ska.ac.za; dev@oodt.apache.org; Mattmann, Chris A (3980)
>> >Subject: Re: Remote data transfer
>> >
>> >Thanks guys.
>> >
>> >Etienne, I hope you don't mind but I've copied dev@oodt.apache.org
>> >on this email. That way you can tap into the entire Apache OODT
>> >community for help.
>> >
>> >"URI has an authority component" is usually an error indicating
>> >that your config (e.g., filemgr.properties in the etc directory)
>> >references an environment variable that isn't defined. E.g., maybe
>> >you have a *.policy.dirs property set to
>> >file://[SOME_UNDEFINED_VARIABLE]/path/dir/ and SOME_UNDEFINED_VARIABLE
>> >is undefined.
>> >
>> >Can you check that to see if that's the root cause of this issue?
>> >
>> >Cheers,
>> >Chris
>> >
>> >------------------------
>> >Chris Mattmann
>> >chris.mattmann@gmail.com
>> >
>> >
>> >
>> >
>> >-----Original Message-----
>> >From: Etienne Koen <etiennek@scs-space.com>
>> >Date: Wednesday, August 13, 2014 1:42 AM
>> >To: Thomas Bennett <thomas@ska.ac.za>
>> >Cc: "cschollar@ska.ac.za" <cschollar@ska.ac.za>, Chris Mattmann
>> ><chris.mattmann@gmail.com>
>> >Subject: RE: Remote data transfer
>> >
>> >>Hi Tom,
>> >>
>> >>I get the following error when using the argument:
>> >>
>> >>ERROR: Failed to ingest product 'blah.txt' : URI has an authority component
>> >>
>> >>I get this when both the server and client are using the same port (9000).
>> >>When communicating on different ports I get:
>> >>
>> >><-- some I/O / HTTP exceptions -->
>> >>...
>> >>...
>> >>
>> >>ERROR: Failed to ingest product 'blah.txt' : Connection refused
>> >>
>> >>Server:9000 and Client:431
>> >>
>> >>Do you know what any of this means?
>> >>
>> >>Cheers
>> >>Etienne
>> >>
>> >>________________________________________
>> >>From: Thomas Bennett [thomas@ska.ac.za]
>> >>Sent: Wednesday, August 13, 2014 10:02 AM
>> >>To: Etienne Koen
>> >>Cc: cschollar@ska.ac.za; chris.mattmann@gmail.com
>> >>Subject: Re: Remote data transfer
>> >>
>> >>Hey Etienne,
>> >>
>> >>I've been out of the office the last week but I'm back now.
>> >>
>> >>./filemgr-client --url http://localhost:9000 --operation --ingestProduct
>> >>--productName blah.txt --productStructure Flat --productTypeName
>> >>GenericFile --metadataFile file:///tmp/blah.txt.met --refs
>> >>file:///tmp/blah.txt
>> >>
>> >>How would this line be modified to achieve what I want to do? I see
>> >>there is also an argument --clientTransfer --dataTransfer but I am not
>> >>sure what Java class to use for this?
>> >>
>> >>You will need to specify the remote filemgr, i.e. --url
>> >>http://192.168.0.1 - are you doing this?
>> >>
>> >>I've done remote file transfer before; I'll see if I can remember how
>> >>to do it.
>> >>
>> >>Can I log into the CHPC with the usual credentials?
>> >>
>> >>Cheers,
>> >>Tom
>> >>--
>> >>Thomas Bennett
>> >>
>> >>SKA South Africa
>> >>Science Processing Team
>> >>
>> >>Office: +27 21 5067341
>> >>Mobile: +27 79 5237105
>> >>
>> >>________________________________
>> >>Disclaimer: This E-mail message, including any attachments, is intended
>> >>only for the person or entity to which it is addressed, and may contain
>> >>confidential information. Each page attached hereto must also be read in
>> >>conjunction with this disclaimer.
>> >>If you are not the intended recipient you are hereby notified that any
>> >>disclosure, copying, distribution or reliance upon the contents of this
>> >>e-mail is strictly prohibited. E.&O.E.
>> >>
>> >
>> >
>> >
>> >
>> >
>>
>>
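
On the "URI has an authority component" error explained in the quoted messages above: a minimal Python sketch of why an unresolved variable in a file:// URL can lead to it. It assumes (the thread does not confirm this) that the unresolved placeholder ends up as literal text directly after `file://`, where a URI parser reads it as the authority (host) part. Java's `java.io.File(URI)` constructor rejects file URIs with a non-empty authority using that message. The helper name `has_authority`, the example paths, and the stand-in text `null` are all hypothetical.

```python
from urllib.parse import urlsplit


def has_authority(uri: str) -> bool:
    """Return True if the URI carries a non-empty authority (host) part."""
    return urlsplit(uri).netloc != ""


# Variable resolved to an absolute path: "file://" + "/opt/oodt" + "/policy"
good = "file:///opt/oodt/policy"
# Variable left unresolved; "null" is an assumed stand-in for whatever an
# undefined variable expands to in a given configuration.
bad = "file://null/policy"

for uri in (good, bad):
    print(uri, "-> authority present:", has_authority(uri))
```

Checking each `file://` value in the properties file this way, after substituting the environment variables by hand, is one way to spot which property is affected.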


