From: Ahmet Arslan
Date: Fri, 15 Jan 2010 23:26:26 -0800 (PST)
Subject: Re: A way to download URLs and index better ?
To: java-user@lucene.apache.org

> Hi everyone, please help me with this question:
> I need to download some webpages from a list of URLs (about 200 links) and
> then index them with Lucene.
> This list is not fixed, because it depends on the definition of my process.
> Currently, in my web application, I wrote a class for downloading, but its
> download time is too long.
>
> Please recommend a Java library suitable for my situation to optimize
> downloading.
> Examples would be very welcome too (INPUT: list of URLs; OUTPUT: webpage
> content, or an indexed repository).
> Thank you very much.

Probably the most famous ones are:

http://lucene.apache.org/nutch/ (Apache Nutch)
http://crawler.archive.org/ (Heritrix)
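
If a full crawler is more than a fixed list of about 200 URLs needs, the slow part is usually that pages are fetched one after another. A minimal sketch of fetching them in parallel with a plain JDK thread pool might look like the following. The names ParallelFetcher, fetchAll and download are made up for this illustration and are not part of Nutch, Heritrix or Lucene; the code assumes Java 9 or later, UTF-8 pages, and no need for robots.txt handling or per-host rate limiting.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical helper for illustration only.
class ParallelFetcher {

    // Downloads a single page as text (assumption: the page is UTF-8).
    static String download(String url) {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new RuntimeException("Failed to download " + url, e);
        }
    }

    // Runs the fetcher for every URL on a fixed-size thread pool and returns
    // url -> content in input order. URLs whose fetch fails are left out.
    static Map<String, String> fetchAll(List<String> urls,
                                        Function<String, String> fetcher,
                                        int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit all downloads first so they run concurrently.
            List<Future<String>> futures = new ArrayList<>();
            for (String url : urls) {
                Callable<String> task = () -> fetcher.apply(url);
                futures.add(pool.submit(task));
            }
            // Collect results in the same order as the input list.
            Map<String, String> pages = new LinkedHashMap<>();
            for (int i = 0; i < urls.size(); i++) {
                try {
                    pages.put(urls.get(i), futures.get(i).get());
                } catch (ExecutionException e) {
                    // This download failed; skip it and keep the rest.
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            return pages;
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as ParallelFetcher.fetchAll(urls, ParallelFetcher::download, 10) would return the page contents keyed by URL, which can then be handed to whatever Lucene indexing code is already in place. The pool size of 10 is only a starting guess and should be tuned against the target servers.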