From: "Mattmann, Chris A (388J)"
To: "user@oodt.apache.org"
Date: Fri, 18 Nov 2011 16:55:32 -0800
Subject: Re: crawler with unique action fails on first ingest

Hey Ricky,

I've run into this a number of times myself, and recently Paul Ramirez and I
were talking about this too. Paul even said he would try and fix it (ha! I'm
signing him up for work :P ). Actually I'll just look at it myself.

In the meanwhile, the workaround is exactly the one you stated. Ingest a
file, and that gets you a catalog. Then you can simply delete the file if you
want using fmquery | fmdel, and after that Unique works just fine.

Cheers,
Chris

On Nov 18, 2011, at 4:52 PM, Nguyen, Ricky wrote:

> Hi,
>
> I am trying to run a crawler using "--actionIds Unique". Since this is the
> first time I am ingesting a file into FileMgr, the user guide [1] says that
> the catalog dir MUST NOT exist so that Lucene can create it. However, the
> crawler fails with the error:
>
> IOException when opening index directory:
> [/Users/rnguyen/vpicu/data/catalog] for search: Message:
> /Users/rnguyen/vpicu/data/catalog is not a directory
>
> Seems like crawler is trying to search for a product (to determine its
> uniqueness), but the catalog hasn't been created yet. I guess since I have
> no catalog, the workaround is to omit the "Unique" action.
>
> But if I use crawler as a daemon, it would be useful to leave "Unique" as
> an action. Any thoughts on the right course?
>
> Thanks,
> Ricky
>
> [1] http://oodt.apache.org/components/maven/filemgr/user/basic.html

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++