nutch-user mailing list archives

From Hannes Carl Meyer <hannesc...@googlemail.com>
Subject Re: Can I custom crawl using Nutch?
Date Wed, 04 May 2011 15:37:38 GMT
Hi,

I would rather use the Wikipedia dumps than crawl the site!

You should have a look at JWPL: http://code.google.com/p/jwpl/
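
To illustrate the dump-based approach, here is a minimal sketch in Python (standard library only, not JWPL) that streams a MediaWiki XML export and keeps the titles of pages whose text mentions a term. The function names, the tiny inline sample, and the dump path in the comment are assumptions made for illustration. A plain substring match is only a rough stand-in for "related to a subject".

```python
import bz2
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{ns}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def pages_mentioning(stream, term):
    """Yield titles of <page> elements whose text contains `term`.

    The stream is parsed incrementally, so the whole dump never has
    to fit in memory at once.
    """
    term = term.lower()
    for _event, elem in ET.iterparse(stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = "", ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text or ""
            elif name == "text":
                text = child.text or ""
        if term in text.lower():
            yield title
        elem.clear()  # drop the finished page's children to free memory


# Tiny hand-written sample in the shape of a MediaWiki export (illustrative only).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page><title>Association football</title>
    <revision><text>Football is a team sport.</text></revision></page>
  <page><title>Chess</title>
    <revision><text>Chess is a board game.</text></revision></page>
</mediawiki>"""


def demo():
    return list(pages_mentioning(io.BytesIO(SAMPLE), "Football"))


if __name__ == "__main__":
    print(demo())
    # For a real compressed dump (hypothetical path), something like:
    # with bz2.open("pages-articles.xml.bz2", "rb") as f:
    #     for title in pages_mentioning(f, "Football"):
    #         print(title)
```

On a full dump this is slow but simple; JWPL or a proper index would be the better choice for repeated queries.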

BR

Hannes

On Wed, May 4, 2011 at 5:20 PM, Kelvin <ksxh@yahoo.com.sg> wrote:

> Hello,
>
> I would like to crawl wikipedia using Nutch, but as it is too large, I
> would only like to crawl pages that are related to a particular subject.
>
> For example, I would like to crawl only the Wikipedia pages that contain
> the term "Football". Is this possible using Nutch?
>
> Thank you for your kind help.
>
