Date: Wed, 1 Sep 2010 09:30:00 +0100
Subject: Re: crawling webpage results
From: Alex McLintock
To: dev@nutch.apache.org

This should really be a user-list question, not a dev question. But what the heck.

The first thing that comes to mind is to run the search yourself and provide the results of that search as seed pages.

But since you asked on the dev mailing list: you could write something that queries Google itself through its API, but Nutch doesn't do that itself. If you do write it, consider submitting it as a patch.

Good luck,
Alex

On 1 September 2010 09:14, Shanthoosh PV wrote:
> Hi,
>
> I want to crawl the results obtained from a user-defined keyword search
> in a search engine. Is it possible to do this in Nutch? Please provide
> useful insights; I tried searching in this forum and Google but found
> nothing helpful.
>
> The user may provide a search engine like google.com along with a keyword
> to search for in that search engine. The results of this search should be
> crawled. Is it possible to do this in Nutch by just providing the search
> engine URL along with the search keyword?
>
> Shanthoosh
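
A minimal sketch of the seed-page suggestion in the reply, assuming the search result URLs have already been collected by hand or by some other tool. The function name write_seed_file, the urls directory, and the seed.txt file name are all hypothetical choices for illustration; nothing here is part of Nutch. It only filters and deduplicates the URLs and writes them one per line to a plain text file.

```python
from pathlib import Path
from urllib.parse import urlsplit


def write_seed_file(result_urls, seed_dir="urls", filename="seed.txt"):
    """Write unique absolute http(s) URLs, one per line, to seed_dir/filename.

    Returns the list of URLs that were written, in first-seen order.
    """
    seen = set()
    seeds = []
    for raw in result_urls:
        url = raw.strip()
        parts = urlsplit(url)
        # Keep only absolute http/https URLs; this also skips blank entries.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        # Skip exact duplicates (no normalisation is attempted here).
        if url in seen:
            continue
        seen.add(url)
        seeds.append(url)

    out_dir = Path(seed_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / filename).write_text(
        "".join(u + "\n" for u in seeds), encoding="utf-8"
    )
    return seeds


# Example call (hypothetical URLs copied from a search results page):
# write_seed_file(["http://example.com/a", "https://example.org/b"])
```

The resulting directory could then be handed to the crawler as its seed list in the usual way. Querying a search engine API automatically, as the reply mentions, would be a separate piece of work and is not attempted here.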