oodt-dev mailing list archives

From Chris Mattmann <chris.mattm...@gmail.com>
Subject Re: How to process files in a sorted order
Date Tue, 19 Nov 2013 18:39:37 GMT
Hey Konstantinos,

The best way to handle this is, as Cam said, to subclass the crawler
and define a new FileFilter (note that ProductCrawler already defines
some default ones). So you could create, e.g., a:

public class SortedFilesCrawler extends ProductCrawler {
   // define your new FileFilter here

   // override both crawl methods so that they use your filter
   @Override
   public void crawl() {
      // use your filter
   }

   @Override
   public void crawl(File dirRoot) {
      // use your filter
   }
}
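
One thing to keep in mind: a java.io.FileFilter only decides which files
are accepted, and File.listFiles() makes no guarantee about the order of
the array it returns, so the overridden crawl methods would also need to
sort the listing themselves. Here is a minimal sketch of that sorting step
in plain Java (8 or later). SortedListing, FILES_ONLY, oldestFirst and
alphabetical are hypothetical names made up for this illustration, not
part of OODT, and how the sorted array gets wired into the crawl loop is
left to your subclass:

```java
import java.io.File;
import java.io.FileFilter;
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical helper (not part of OODT): lists the regular files in a
// directory in a deterministic order before they are handed to a crawler.
final class SortedListing {

    // Accept regular files only; directories are skipped.
    static final FileFilter FILES_ONLY = new FileFilter() {
        @Override
        public boolean accept(File f) {
            return f.isFile();
        }
    };

    // Oldest file first, by last-modified time; ties are broken by name.
    static File[] oldestFirst(File dir) {
        Comparator<File> byAge = Comparator.comparingLong(File::lastModified);
        return sorted(dir, byAge.thenComparing(File::getName));
    }

    // Case-sensitive lexicographic order of the file names.
    static File[] alphabetical(File dir) {
        Comparator<File> byName = Comparator.comparing(File::getName);
        return sorted(dir, byName);
    }

    private static File[] sorted(File dir, Comparator<File> order) {
        File[] files = dir.listFiles(FILES_ONLY);
        if (files == null) {
            // listFiles returns null if dir is not a readable directory
            return new File[0];
        }
        Arrays.sort(files, order);
        return files;
    }
}
```

Inside crawl(File dirRoot) you would then iterate over something like
SortedListing.oldestFirst(dir) instead of the raw listFiles() result.
Note that last-modified time is only a stand-in for "older file" here; if
your derived files can be touched after creation, sorting by name may be
the safer choice.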

Hope that helps!

Cheers,
Chris

------------------------
Chris Mattmann
chris.mattmann@gmail.com




-----Original Message-----
From: Konstantinos Mavrommatis <kmavrommatis@celgene.com>
Reply-To: <dev@oodt.apache.org>
Date: Thursday, November 7, 2013 8:44 PM
To: "dev@oodt.apache.org" <dev@oodt.apache.org>
Subject: How to process files in a sorted order

>Hi,
>In my environment I am using cas-crawler to process directories of 1000s
>of files. The metadata for these files is extracted automatically using
>the mimetype definitions and small wrapper scripts.
>In these directories some of the files are derived from other files, and
>metadata from the older files needs to be transferred to the newer files.
>To achieve this I need the cas-crawler to process the files from oldest
>to newest or, in other cases, in alphabetical order.
>Any ideas how this can be achieved?
>
>The crawler command I currently use is:
>./crawler_launcher --operation --launchAutoCrawler --productPath
>$FILEPATH --filemgrUrl $FMURL --clientTransferer
>org.apache.oodt.cas.filemgr.datatransfer.InPlaceDataTransferFactory
>--mimeExtractorRepo ../policy/mime-extractor-map.xml
>
>Thanks in advance for your help
>Konstantinos


