oodt-dev mailing list archives

From "Mattmann, Chris A (388J)" <chris.a.mattm...@jpl.nasa.gov>
Subject Re: PushPull framework and custom met extraction
Date Sat, 10 Nov 2012 21:23:19 GMT
+1...

Cheers,
Chris

On Nov 10, 2012, at 9:07 AM, Brian Foster wrote:

> Hey Rishi,
> 
> The filemgr connection from the pushpull is only there to verify whether the filemgr already has a file, so the pushpull doesn't re-download files (no ingest support)... usually you configure your pushpull daemon to run at longer intervals, while the crawler wakes up more often (every 30 seconds is a typical interval for it)... so just have the pushpull download its files to a staging area that is the same directory the crawler is monitoring.
> 
> -brian
> 
> On Nov 09, 2012, at 11:06 AM, "Verma, Rishi (388J)" <Rishi.Verma@jpl.nasa.gov> wrote:
> 
>> Hey Brian, Shreyl,
>> 
>> Thanks for your input and clarification on this.
>> 
>> Brian - the delegation of duties you described makes sense. Does cas-pushpull have any way to invoke a local crawl process once its downloads complete? I know it has a filemgr hookup, but I wonder whether a crawl process can be invoked after all file downloads via pushpull have finished. The alternative could, of course, be to schedule the crawler daemon to run well after the pushpull daemon finishes its work.
>> 
>> Thanks to both of you for your help!
>> rishi
>> 
>> On Nov 9, 2012, at 10:08 AM, Brian Foster wrote:
>> 
>>> 
>>> Hey Rishi,
>>> 
>>> You will need to use both cas-pushpull and cas-crawler to accomplish this...
>>> 
>>> cas-pushpull: Used for downloading files from remote sites to your local systems... the .tmp files contain cas-pushpull's known metadata and you can configure which of the known metadata gets written out or if a .tmp file gets created at all... however you can add custom metadata fields to it.
>>> 
>>> cas-crawler: Allows for metadata extraction (custom metadata) from files on your local system... and then allows you to ingest them into the filemgr (ingest can optionally be turned off)
>>> 
>>> HTH
>>> -brian
>>> 
>>> On Nov 08, 2012, at 06:11 PM, "Verma, Rishi (388J)" <Rishi.Verma@jpl.nasa.gov> wrote:
>>> 
>>>> Hi All -
>>>> 
>>>> I'm wondering if anyone has experience with, or knows the details of how to use custom MetExtractors on products that are remotely downloaded via PushPull.
>>>> 
>>>> By default, PushPull performs some basic met-extraction and creates a ".tmp" file associated with downloaded products, but I'm wondering whether this met generation step is customizable.
>>>> 
>>>> I've looked through the configuration files (e.g. [1], [2]) as well as the code for PushPull, but I can't seem to locate configuration parameters to support the invocation of custom met extractors on downloaded data.
>>>> 
>>>> If any of you have experience with this, or can point me to where to look, I'd really appreciate it.
>>>> 
>>>> Thanks! 
>>>> Rishi 
>>>> 
>>>> --
>>>> [1] http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/push_pull_framework.properties
>>>>  
>>>> [2] http://svn.apache.org/repos/asf/oodt/trunk/pushpull/src/main/resources/examples/
>> 
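
The handoff described in this thread (pushpull downloads into a staging directory, the crawler periodically picks files up from that same directory, and pushpull only asks the filemgr whether a file is already known) can be sketched roughly as below. This is a plain-Python illustration of the pattern, not OODT code: `pull_files`, `crawl_once`, `size_met`, `demo`, the `FileSize` key, and the dict standing in for the filemgr catalog are all hypothetical names made up for the example.

```python
import os
import tempfile


def pull_files(remote_files, staging_dir, catalog):
    """Stand-in for the pushpull step: fetch files the catalog does not have yet."""
    downloaded = []
    for name, data in sorted(remote_files.items()):
        if name in catalog:  # the "does the filemgr already have it?" check
            continue
        with open(os.path.join(staging_dir, name), "wb") as out:
            out.write(data)
        downloaded.append(name)
    return downloaded


def crawl_once(staging_dir, extract_met, catalog):
    """Stand-in for one crawler wake-up: extract metadata, ingest, clear staging."""
    ingested = []
    for name in sorted(os.listdir(staging_dir)):
        path = os.path.join(staging_dir, name)
        catalog[name] = extract_met(path)  # custom met extraction happens here
        os.remove(path)  # a real setup would move the file into an archive
        ingested.append(name)
    return ingested


def size_met(path):
    """Hypothetical custom extractor: records only the file size."""
    return {"FileSize": os.path.getsize(path)}


def demo():
    # In-memory stand-ins for the remote site and the filemgr catalog.
    remote = {"a.dat": b"abc", "b.dat": b"de"}
    catalog = {}
    with tempfile.TemporaryDirectory() as staging:
        first_pull = pull_files(remote, staging, catalog)
        ingested = crawl_once(staging, size_met, catalog)
        second_pull = pull_files(remote, staging, catalog)
    return first_pull, ingested, second_pull, catalog
```

In this sketch the second pull downloads nothing, because the crawl step has already recorded both files in the catalog. The two steps never call each other; the shared directory is the only coupling, which is why the two daemons can run on independent intervals.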

