nutch-user mailing list archives

From Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
Subject Re: "URLFilterChecker" documentation
Date Fri, 09 Dec 2011 15:11:18 GMT
Hi Remi & Markus,

Yeah, I can replicate this. Good catch, Remi.

lewis@lewis-desktop:~/ASF/trunk/runtime/local$ bin/nutch org.apache.nutch.net.URLFilterChecker http://www.heraldscotland.com-filterName regex-urlfilter.txt
Checking combination of all URLFilters available
^Z
[2]+  Stopped                 bin/nutch org.apache.nutch.net.URLFilterChecker http://www.heraldscotland.com-filterName regex-urlfilter.txt
lewis@lewis-desktop:~/ASF/trunk/runtime/local$ bin/nutch org.apache.nutch.net.URLFilterChecker http://www.heraldscotland.com-filterName regex-urlfilter
Checking combination of all URLFilters available

Both invocations hung after printing that line. I think this needs some
further investigation. Can someone else please confirm before we log it
in Jira?

Thanks for reporting.
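
Before anyone logs it, it may be worth ruling out Markus's point (quoted
below) that the checker simply waits for URLs on stdin. Here is a rough
sketch of how one might feed it through a pipe. The check_urls helper and
the CHECKER variable are names made up for this example, and the
-allCombined argument is my assumption about the tool's usage message, so
please check it against your own Nutch version:

```shell
#!/bin/sh
# Hypothetical helper, not part of Nutch: pipe URLs into a stdin-reading checker.
# CHECKER defaults to the class from this thread; "-allCombined" is an
# assumption about its usage string and should be verified locally.
CHECKER=${CHECKER:-"bin/nutch org.apache.nutch.net.URLFilterChecker -allCombined"}

check_urls() {
  # Print each argument on its own line and feed the lines to the checker.
  # $CHECKER is deliberately unquoted so it splits into command + arguments.
  printf '%s\n' "$@" | $CHECKER
}

# Example call, run from runtime/local:
#   check_urls http://www.google.com http://www.heraldscotland.com
```

If that returns output promptly, the "hang" is just the tool waiting for
input. If it still blocks with a closed pipe, that would support
opening a Jira issue.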

On Fri, Dec 9, 2011 at 12:53 PM, remi tassing <tassingremi@gmail.com> wrote:

> I fed it a URL but it didn't work:
>
> $ bin/nutch org.apache.nutch.net.URLFilterChecker http://www.google.com
> Checking combination of all URLFilters available
>
> Remi
>
> On Fri, Dec 9, 2011 at 2:43 PM, Markus Jelsma <markus.jelsma@openindex.io> wrote:
>
> > It reads from stdin, so you can either type a URL followed by Enter or
> > feed it URLs on stdin using pipes.
> >
> > On Friday 09 December 2011 13:32:41 remi tassing wrote:
> > > Hello guys,
> > >
> > > How do you use "org.apache.nutch.net.URLFilterChecker"? It's not
> > > documented, and it always shows me "Checking combination of all
> > > URLFilters available" and then gets stuck.
> > >
> > > Remi
> >
> > --
> > Markus Jelsma - CTO - Openindex
> >
>
>
>
> --
> Remi Tassing
>



-- 
*Lewis*
