nutch-user mailing list archives

From zo tiger <zo.ti...@hotmail.com>
Subject Re: Help me, No urls to fetch.
Date Mon, 07 Sep 2009 04:27:23 GMT

These are the contents of my hadoop.log file:


2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         HTTP Framework (lib-http)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         Text Parse Plug-in (parse-text)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         Pass-through URL Normalizer (urlnormalizer-pass)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         Regex URL Filter (urlfilter-regex)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         Http Protocol Plug-in (protocol-http)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         XML Response Writer Plug-in (response-xml)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         Regex URL Normalizer (urlnormalizer-regex)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         OPIC Scoring Plug-in (scoring-opic)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         CyberNeko HTML Parser (lib-nekohtml)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         Anchor Indexing Filter (index-anchor)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         JavaScript Parser (parse-js)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         URL Query Filter (query-url)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         Regex URL Filter Framework (lib-regex-filter)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         JSON Response Writer Plug-in (response-json)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository - Registered Extension-Points:
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         Nutch Summarizer (org.apache.nutch.searcher.Summarizer)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         Nutch Protocol (org.apache.nutch.protocol.Protocol)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
2009-09-07 03:32:58,137 INFO  plugin.PluginRepository -         Nutch Field Filter (org.apache.nutch.indexer.field.FieldFilter)
2009-09-07 03:32:58,138 INFO  plugin.PluginRepository -         HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
2009-09-07 03:32:58,138 INFO  plugin.PluginRepository -         Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
2009-09-07 03:32:58,138 INFO  plugin.PluginRepository -         Nutch Search Results Response Writer (org.apache.nutch.searcher.response.ResponseWriter)
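
Since the error asks to check the filter and the seed list, one way to sanity-check a seed URL outside Nutch is a small script like the one below. This is only a minimal sketch, not Nutch's own code: it assumes filter rules written as '+regex' / '-regex' lines where the first rule whose regex is found in the URL decides and a URL matching no rule is rejected. The rule text in HYPOTHETICAL_RULES and the helper names load_rules and accepts are made up for illustration and are not taken from my configuration.

```python
import re


def load_rules(text):
    """Parse '+regex' / '-regex' lines; skip blank lines and '#' comments."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError(f"rule must start with '+' or '-': {line!r}")
        # Store (allow?, compiled regex) in file order.
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Assumed semantics: first rule found in the URL decides; no match rejects."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False


# Hypothetical rule set for illustration only (not the poster's actual file).
HYPOTHETICAL_RULES = r"""
# skip file:, ftp:, and mailto: URLs
-^(file|ftp|mailto):
# accept hosts under apache.org
+^http://([a-z0-9]*\.)*apache.org/
# skip everything else
-.
"""

rules = load_rules(HYPOTHETICAL_RULES)
for seed in ["http://lucene.apache.org", "http://lucene.apache.org/"]:
    print(seed, "->", "accepted" if accepts(rules, seed) else "rejected")
```

With these hypothetical rules the seed without a trailing slash is rejected and the one with a trailing slash is accepted, because the accept pattern requires a "/" after the host. The sketch does not model URL normalizers or any other plugin, so it can only hint at where a seed might be dropped.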


MilleBii wrote:
> 
> Is there more information in logs/hadoop file ?
> 
> What is your plug-in list ?
> 
> 2009/9/2 zo tiger <zo.tiger@hotmail.com>
> 
>>
>> Thank you for your reply.
>>
>> In the urls directory (exactly /nutch/search/urls), there is a file named
>> urllist.txt.
>>
>> Its content is as follows:
>>
>>      http://lucene.apache.org
>>
>> I don't understand why Nutch cannot fetch any URL.
>>
>>
>> Paul Tomblin wrote:
>> >
>> > On Wed, Sep 2, 2009 at 6:36 AM, zo tiger<zo.tiger@hotmail.com> wrote:
>> >>
>> >
>> >> At last I ran the bin/nutch crawl command, but it gives the error
>> >>
>> >> No urls to fetch check your filter and seed list
>> >>
>> >> I am sure there is no problem in the crawl-url filter and the other
>> >> configuration xml files.
>> >>
>> >> Does anyone know of any possible problem?
>> >>
>> >
>> > What's in your url directory?
>> >
>> >
>> > --
>> > http://www.linkedin.com/in/paultomblin
>> >
>> >
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Help-me%2C-No-urls-to-fetch.-tp25255142p25255944.html
>> Sent from the Nutch - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> -MilleBii-
> 
> 

-- 
View this message in context: http://www.nabble.com/Help-me%2C-No-urls-to-fetch.-tp25255142p25324884.html
Sent from the Nutch - User mailing list archive at Nabble.com.

