Subject: Re: Your Help on Nutch!
From: Grant Ingersoll
Date: Tue, 8 Jun 2010 11:12:04 +0200
To: Lucene mailing list

Hi Tesfaye,

Your question is best asked on user@nutch.apache.org, where Nutch issues are discussed.

Thanks,
Grant

On Jun 8, 2010, at 8:28 AM, Tesfaye Guta wrote:

> Hello all,
> I am able to configure Nutch and use it on my PC.
> I am working on a thesis on a local search engine.
> As I understand it, Nutch automatically indexes the
> documents it has crawled.
> I want to do some preprocessing on the crawled documents before they get
> indexed. Can you help me
> with how to go about it?
>
> Thank you in advance and hope to hear from you soon.
> -Tesfaye