Date: Sun, 10 Feb 2013 19:27:24 -0500
Subject: Re: Crawl Anywhere -
From: Erick Erickson
To: solr-user@lucene.apache.org

Have you looked at Nutch?

On Fri, Feb 8, 2013 at 12:25 AM, SivaKarthik wrote:

> Hi All,
> In our project, we need to download millions of pages, so is there
> any support for crawling in a distributed environment using the
> crawl-anywhere apps? Or what could be the alternatives?
>
> Thanks in advance.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/ANNOUNCE-Web-Crawler-tp2607831p4039172.html