oodt-dev mailing list archives

From YunHee Kang <yunh.k...@gmail.com>
Subject Re: Question about metadata specification for Filemgr and Pushpull
Date Fri, 03 Aug 2012 01:31:53 GMT
Hi Brian,

I overlooked the role of the CAS Crawler Framework in ingesting
downloaded files into the CAS File Manager. I appreciate your
explanation of the relationship between the Pushpull framework and
the Crawler Framework. To be honest, I was confused about why you only
modified two source files in the patch. Your explanation helps to
clarify my thinking.

Thanks,
Yunhee
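
The division of labor described in the quoted reply below (pushpull only downloads and skips files it already has; a separate crawler ingests from the staging directory) can be sketched roughly as follows. This is a toy in-memory model, not OODT code: the class StagingFlowSketch, its fields, and both methods are hypothetical names, and the assumption that ingest removes a file from staging is made only for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of the two-stage flow; none of these names come from OODT.
class StagingFlowSketch {
    // Stand-in for the File Manager catalog: names of products already ingested.
    final Set<String> catalog = new HashSet<>();
    // Stand-in for the pushpull staging directory: names of downloaded files.
    final Set<String> staging = new HashSet<>();

    // Download step: fetches files into staging only. It skips anything that is
    // already staged or already known to the catalog, and never ingests.
    List<String> download(List<String> remoteFiles) {
        List<String> fetched = new ArrayList<>();
        for (String file : remoteFiles) {
            if (staging.contains(file) || catalog.contains(file)) {
                continue; // already have it, so do not redownload
            }
            staging.add(file);
            fetched.add(file);
        }
        return fetched;
    }

    // Ingest step: a separate pass over staging, playing the crawler's role.
    // Assumption: an ingested file leaves staging and enters the catalog.
    int crawlAndIngest() {
        int ingested = 0;
        for (String file : new ArrayList<>(staging)) {
            catalog.add(file);
            staging.remove(file);
            ingested++;
        }
        return ingested;
    }
}
```

In this model, a file that has been downloaded but not yet crawled is in staging and absent from the catalog, which matches the situation where a download succeeds but nothing appears in the repository directory.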

2012/8/3 Brian Foster <holenoter@me.com>:
> Hey YunHee,
>
> This is expected behavior now... pushpull only downloads the files... it
> talks to the filemgr to try to determine if it already has a file before it
> redownloads it... if you want automated file ingest, you must set up a
> crawler to crawl your pushpull staging directory... there is documentation
> for that as well... http://oodt.apache.org/components/maven/crawler/user/
> ... the command-line information there is possibly out of date, however just
> run: ./crawler_launcher --help ... the help menus are pretty involved... you
> can also learn how to customize your command-line options here:
> https://svn.apache.org/repos/asf/oodt/trunk/cli/README.txt
>
> -brian
>
>
> On Aug 02, 2012, at 10:08 AM, Kang YunHee <yunh.kang@gmail.com> wrote:
>
> Hi Brian,
>
> I applied the patch (OODT-481.2012-08-01.txt) to the source of the pushpull
> framework, rebuilt it, and swapped its jar out in my deployment using
> the following steps:
>
> patch -p0 < OODT-481.2012-08-01.txt
> mvn clean
> mvn install
> cd target
> cp cas-pushpull-0.5-SNAPSHOT.jar ~/oodt-0.5/cas-pushpull/lib/
>
>
>
>
> After I ran the pushpull script, I found that the "Catalog exception"
> no longer occurs, as you can see:
> Aug 3, 2012 1:31:25 AM org.apache.oodt.cas.pushpull.retrievalsystem.FileRetrievalSystem addToDownloadQueue
> WARNING: Skipping file {parent = 'null', path = '/TES/TL2CO2N.005/2004.09.20/TES-Aura_L2-CO2-Nadir_r0000002147_F06_09.he5', isDir = 'false'} because it is already in staging area
> PageSize: 8 PageLoc: 2
> FileList size: 2
> PageSize: 8 PageLoc: 952
> FileList size: 952
> Aug 3, 2012 1:35:38 AM org.apache.oodt.cas.pushpull.protocol.ProtocolHandler disconnect
> INFO: Disconnecting protocol org.apache.oodt.cas.protocol.ftp.CommonsNetFtpProtocol
> Aug 3, 2012 1:35:38 AM org.apache.oodt.cas.pushpull.daemon.Daemon$1 run
> INFO: Daemon with ID = 90121 on RMI registry port 9012 is shutting down
>
>
> But I am not sure whether the downloaded file was ingested by my filemgr.
> I think that if it had been ingested, it would be in the repository
> directory.
> However, I did not find it in my repository directory.
>
> Please let me know how to check the cataloging status of the file.
>
> Thanks,
> Yunhee
>
> On 8/2/12 6:29 AM, "Brian Foster" <holenoter@me.com> wrote:
>
>>
>>hey YunHee,
>>
>>I've submitted the patch, so instead of having to patch the code
>>you can just resync your pushpull code, rebuild it, and swap the jar
>>out in your deployment
>>
>>-brian
>>
>>On Aug 1, 2012, at 8:28 AM, YunHee Kang wrote:
>>
>>> Hi Chris and Brian,
>>>
>>> I am reading the source code to understand the "Catalog exception"
>>> behind the runtime error shown below.
>>> org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: Failure writing request
>>>     at org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient.hasProduct(XmlRpcFileManagerClient.java:606)
>>>     at org.apache.oodt.cas.filemgr.ingest.StdIngester.hasProduct(StdIngester.java:284)
>>>     at org.apache.oodt.cas.pushpull.retrievalsystem.FileRetrievalSystem.isAlreadyInDatabase(FileRetrievalSystem.java:254)
>>>     at org.apache.oodt.cas.pushpull.retrievalsystem.FileRetrievalSystem.addToDownloadQueue(FileRetrievalSystem.java:463)
>>>     at org.apache.oodt.cas.pushpull.retrievalmethod.RemoteCrawler.processPropFile(RemoteCrawler.java:138)
>>>     at org.apache.oodt.cas.pushpull.retrievalsystem.RetrievalSetup.retrieveFiles(RetrievalSetup.java:109)
>>>     at org.apache.oodt.cas.pushpull.daemon.Daemon$1.run(Daemon.java:218)
>>>     at java.lang.Thread.run(Thread.java:662)
>>>
>>> I think that the exception was caused by a wrong value of the
>>> productName parameter of the hasProduct() method in the following
>>> code snippet from XmlRpcFileManagerClient.java:
>>>     at org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient.hasProduct(XmlRpcFileManagerClient.java:606)
>>> public boolean hasProduct(String productName) throws CatalogException {
>>>     Vector<Object> argList = new Vector<Object>();
>>>     argList.add(productName);
>>>
>>>     boolean hasProduct = false;
>>>
>>>     try {
>>>         hasProduct = ((Boolean) client.execute("filemgr.hasProduct",
>>>                 argList)).booleanValue();
>>>     } catch (XmlRpcException e) {
>>>         throw new CatalogException(e.getMessage());
>>>     } catch (IOException e) {
>>>         throw new CatalogException(e.getMessage());
>>>     }
>>>     return hasProduct;
>>> }
>>>
>>> I noticed that the "mime-type" element shown below was added
>>> to the file mimestypes.xml, which is one of the pushpull property files.
>>> <mime-type type="product/tes">
>>>   <_comment>ProductType=MyTesProductType</_comment>
>>>   <glob
>>>     pattern="TES-Aura_L2-CO2-Nadir_r\d{10}\w{2}\d{2}\w\d{2}\.he5"
>>>     isregex="true"/>
>>> </mime-type>
>>>
>>> I would like to know what the "mime-type" element means.
>>> I am also wondering how I can check the value of productName in the
>>> properties of Filemgr and Pushpull.
>>>
>>> I am sorry to bother you again.
>>>
>>> Thanks,
>>> Yunhee
>>
>
>
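
One quick sanity check on the configuration quoted above is whether the glob pattern from the mime-type element actually matches the file name that appears in the download log. The sketch below uses only java.util.regex and assumes the pattern is applied as a full match against the bare file name; how pushpull applies it internally is not shown in this thread. The class GlobPatternCheck and the method matchesGlob are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper; only the pattern and the file name come from the thread.
class GlobPatternCheck {
    // Pattern copied from the <glob> element, with backslashes escaped for Java.
    static final Pattern TES_PATTERN = Pattern.compile(
            "TES-Aura_L2-CO2-Nadir_r\\d{10}\\w{2}\\d{2}\\w\\d{2}\\.he5");

    // Assumption: the whole file name must match, not just a substring.
    static boolean matchesGlob(String fileName) {
        return TES_PATTERN.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        // File name taken from the WARNING line in the log output.
        String name = "TES-Aura_L2-CO2-Nadir_r0000002147_F06_09.he5";
        System.out.println(name + " matches: " + matchesGlob(name));
    }
}
```

Note that \w also matches the underscore, so the "_F" and "_" separators in the file name are consumed by the \w{2} and \w parts of the pattern.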
