oodt-dev mailing list archives

From YunHee Kang <yunh.k...@gmail.com>
Subject Re: Problem happened when I tried to run the script "crawler_launcher"
Date Thu, 09 Aug 2012 13:27:53 GMT
Hi Chris,

I got a bunch of error messages when running the crawler_launcher script.
First off, I think I need to understand how a crawler works.
Can I get some materials to help me write configuration files for
crawler_launcher?

Honestly, I am not familiar with the Crawler.
But I will try to file a JIRA issue to update the Crawler user guide.

Thanks,
Yunhee



2012/8/9 Mattmann, Chris A (388J) <chris.a.mattmann@jpl.nasa.gov>:
> Hi YunHee,
>
> Sorry, we need to update the docs, that is for sure. Can you help
> us remember by filing a JIRA issue to update the Crawler user
> guide and to fix the URL there?
>
> As for crawlerId, yes it's obsolete, you can find the modern
> 0.4 and 0.5-trunk options by running ./crawler_launcher -h
>
> Cheers,
> Chris
>
> On Aug 7, 2012, at 7:03 AM, YunHee Kang wrote:
>
>> Hi Chris and Sheryl,
>>
>> I understood my mistake after fixing the URL that wrongly ended with "/".
>> But the same wrong URL is used as an option of
>> crawler_launcher on the Apache OODT
>> homepage (http://oodt.apache.org/components/maven/crawler/user/):
>> --filemgrUrl http://localhost:9000/ \
>> So it confused me.
>>
>> I tried to run the command below, following the Apache OODT
>> home page.
>> $ ./crawler_launcher --crawlerId MetExtractorProductCrawler
>> ERROR: Invalid option: 'crawlerId'
>>
>> But it produced the error shown above.
>> Is the option 'crawlerId' obsolete?
>>
>> Thanks,
>> Yunhee
>>
>>
>> 2012/8/7 Mattmann, Chris A (388J) <chris.a.mattmann@jpl.nasa.gov>:
>>> Perfect, Sheryl, my thoughts exactly.
>>>
>>> Cheers,
>>> Chris
>>>
>>> On Aug 6, 2012, at 10:01 AM, Sheryl John wrote:
>>>
>>>> Hi Yunhee,
>>>>
>>>> Check out this OODT wiki page for the crawler:
>>>> https://cwiki.apache.org/confluence/display/OODT/OODT+Crawler+Help
>>>>
>>>> Did you try giving 'http://localhost:8000' without the "/" at the end?
>>>> Also, specify 'org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory'
>>>> for the 'clientTransferer' option.
>>>>
>>>>
>>>> On Mon, Aug 6, 2012 at 9:46 AM, YunHee Kang <yunh.kang@gmail.com> wrote:
>>>>
>>>>> Hi Chris,
>>>>>
>>>>> I got an error message when I tried to run crawler_launcher using a
>>>>> shell script. The error message may be caused by a wrong filemgr
>>>>> URL.
>>>>> $ ./crawler_launcher.sh
>>>>> ERROR: Validation Failures: - Value 'http://localhost:8000/' is not
>>>>> allowed for option
>>>>> [longOption='filemgrUrl',shortOption='fm',description='File Manager
>>>>> URL'] - Allowed values = [http://.*:\d*]
>>>>>
>>>>> The following is the shell script that I wrote:
>>>>> $ cat crawler_launcher.sh
>>>>> #!/bin/sh
>>>>> export STAGE_AREA=/home/yhkang/oodt-0.5/cas-pushpull/staging/TESL2CO2
>>>>> ./crawler_launcher \
>>>>>      -op --launchStdCrawler \
>>>>>      --productPath $STAGE_AREA\
>>>>>      --filemgrUrl http://localhost:8000/\
>>>>>      --failureDir /tmp \
>>>>>      --actionIds DeleteDataFile MoveDataFileToFailureDir Unique \
>>>>>      --metFileExtension tmp \
>>>>>      --clientTransferer org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferer
>>>>>
>>>>> I am wondering if there is a problem in the URL of the filemgr or elsewhere.
>>>>>
>>>>> Thanks,
>>>>> Yunhee
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> -Sheryl
>>>
>>>
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Chris Mattmann, Ph.D.
>>> Senior Computer Scientist
>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>> Office: 171-266B, Mailstop: 171-246
>>> Email: chris.a.mattmann@nasa.gov
>>> WWW:   http://sunset.usc.edu/~mattmann/
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> Adjunct Assistant Professor, Computer Science Department
>>> University of Southern California, Los Angeles, CA 90089 USA
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>
>
>
>
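
The validation failure quoted in the thread (Allowed values = [http://.*:\d*]) can be reproduced outside the crawler. The following is only a minimal sketch, assuming the option value has to match that pattern in full. The function names are hypothetical and not part of OODT, and \d is written as [0-9] because POSIX grep -E does not support \d.

```shell
#!/bin/sh
# Hypothetical helper: checks a value against the pattern from the error
# message, http://.*:\d*, anchored at both ends (assumed full-string match).
is_valid_filemgr_url() {
    printf '%s\n' "$1" | grep -Eq '^http://.*:[0-9]*$'
}

# Hypothetical helper: removes one trailing "/" if present.
strip_trailing_slash() {
    printf '%s\n' "${1%/}"
}

for url in 'http://localhost:8000' 'http://localhost:8000/'; do
    if is_valid_filemgr_url "$url"; then
        echo "accepted: $url"
    else
        echo "rejected: $url (try $(strip_trailing_slash "$url"))"
    fi
done
```

Under that assumption, the value with the trailing "/" is rejected because nothing but digits may follow the last colon, which is consistent with Sheryl's suggestion to pass 'http://localhost:8000' to --filemgrUrl.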
