From: "M.Rizwan"
Date: Sat, 10 Dec 2011 18:54:41 +0500
Subject: Re: Exception org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/home/nutch/1.4/runtime/local/crawl/segments/20111209174842/parse_data
To: user@nutch.apache.org

Thanks Rami.
Yes not a good solution but this worked for me too.
Thanks for sharing.

On Fri, Dec 9, 2011 at 5:13 PM, remi tassing wrote:

> Sorry, I forgot to change the title...
>
> However I had the same error "Exception
> org.apache.hadoop.mapred.InvalidInputException: Input path does not exist:
> file:/home/nutch/1.4/runtime/local/crawl/segments/..." this morning.
>
> I believe it's because I stopped Nutch while it was crawling and data were
> not saved properly.
>
> I couldn't find an alternative and just had to delete my "crawl" folder,
> then it worked...Not a good solution!
>
> On Fri, Dec 9, 2011 at 2:08 PM, Lewis John Mcgibbney
> <lewis.mcgibbney@gmail.com> wrote:
>
> > Hi Remi,
> >
> > Please don't hijack someone's thread, start your own.
> >
> > Thank you
> >
> > Lewis
> >
> > On Fri, Dec 9, 2011 at 8:26 AM, remi tassing wrote:
> >
> > > Hello guys,
> > >
> > > how do you use "org.apache.nutch.net.URLFilterChecker"? It's not
> > > documented and it always shows me this "Checking combination of all
> > > URLFilters available" and then gets stuck.
> > >
> > > Remi
> >
> > --
> > *Lewis*
>
> --
> Remi Tassing