oodt-user mailing list archives

From "Stough, Timothy M (388F)" <timothy.m.sto...@jpl.nasa.gov>
Subject Re: crawler with unique action fails on first ingest
Date Mon, 21 Nov 2011 17:23:34 GMT
I ran into this too.  My solution was to add the ingest of blah.txt (blatantly stolen from
the quick-start) to my start-up scripts and just leave it in the catalog.
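
A minimal sketch of that start-up step, for illustration only. The function name `ensure_catalog_seeded` and the `ingest_seed` callable are hypothetical and not part of OODT; `ingest_seed` stands in for whatever command you already use to ingest the seed file. The sketch never creates the catalog directory itself, since the user guide says Lucene has to create it on the first ingest.

```python
import os
from typing import Callable


def ensure_catalog_seeded(catalog_dir: str, ingest_seed: Callable[[], None]) -> bool:
    """Run a one-off seed ingest if the catalog directory does not exist yet.

    catalog_dir -- path where the File Manager keeps its Lucene catalog
    ingest_seed -- hypothetical callable wrapping your own seed-file ingest
                   (for example, the quick-start ingest of blah.txt)

    Returns True if the seed ingest was run, False if the catalog was
    already in place and nothing was done.
    """
    if os.path.isdir(catalog_dir):
        # Catalog already exists, so a crawler using Unique can search it.
        return False

    # Do not create the directory here: the first ingest is expected to do it.
    ingest_seed()

    if not os.path.isdir(catalog_dir):
        raise RuntimeError(
            "seed ingest finished but %s is still not a directory" % catalog_dir
        )
    return True
```

Calling this before starting the crawler daemon would mean the uniqueness check always has a catalog to search, and repeated start-ups would skip the seed ingest.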

What are fmquery and fmdel?  I was wondering how to remove something from the catalog.

Thanks,
Tim.


On Nov 18, 2011, at 4:55 PM, Mattmann, Chris A (388J) wrote:

> Hey Ricky,
> 
> I've run into this a number of times myself, and recently Paul Ramirez and I were talking about this too. Paul even
> said he would try and fix it (ha! I'm signing him up for work :P ). Actually I'll just look at it myself.
> 
> In the meantime, the workaround is exactly the one you stated. Ingest a file; that gets you a catalog. Then you can
> simply delete the file if you want using fmquery | fmdel, and then Unique works just fine.
> 
> Cheers,
> Chris
> 
> On Nov 18, 2011, at 4:52 PM, Nguyen, Ricky wrote:
> 
>> Hi,
>> 
>> I am trying to run a crawler using "--actionIds Unique". Since this is the first time I am ingesting a file into
>> FileMgr, the user guide [1] says that the catalog dir MUST NOT exist so that Lucene can create it. However, the
>> crawler fails with the error:
>> 
>> IOException when opening index directory: [/Users/rnguyen/vpicu/data/catalog] for search: Message: /Users/rnguyen/vpicu/data/catalog is not a directory
>> 
>> It seems the crawler is trying to search for a product (to determine its uniqueness), but the catalog hasn't been
>> created yet. I guess since I have no catalog, the workaround is to omit the "Unique" action.
>> 
>> But if I use the crawler as a daemon, it would be useful to leave "Unique" as an action. Any thoughts on the right course?
>> 
>> Thanks,
>> Ricky
>> 
>> [1] http://oodt.apache.org/components/maven/filemgr/user/basic.html
>> 
> 
> 
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Senior Computer Scientist
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 171-266B, Mailstop: 171-246
> Email: chris.a.mattmann@nasa.gov
> WWW:   http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Adjunct Assistant Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> 

-----------------------------------------------------------------
Tim Stough
NASA/Caltech Jet Propulsion Lab
Senior System Architect
Data Understanding Group (Section 388)
818-393-5347 (office)
626-644-6574 (cell)
-----------------------------------------------------------------





