nutch-dev mailing list archives

From "Zhang JinYan (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (NUTCH-1138) remove LogUtil from trunk and nutch gora
Date Tue, 01 Nov 2011 17:11:33 GMT

    [ https://issues.apache.org/jira/browse/NUTCH-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141334#comment-13141334 ]

Zhang JinYan commented on NUTCH-1138:
-------------------------------------

Applied the patch to branch-1.4 and rebuilt with the command "ant clean build".
Configured the crawl with these seed URLs:
{quote}
http://172.16.123.123/bbs/viewthread.php?tid=12345
http://172.16.123.123/bbs/attachment.php?aid=12345
http://www.jettycn.com/
{quote}

The first two sites are not available.
Ran the crawl with this command (platform: Windows):
{quote}
sh.exe ./bin/nutch crawl seedurl -dir crawldev -solr http://localhost:8983/solr/
{quote}

The crawl completed successfully. A query in the Solr admin returned:
{code:xml}
<result name="response" numFound="320" start="0"></result>
{code}
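
The same count can also be read programmatically. This is a minimal sketch, assuming the response has the XML shape shown above. The helper name `num_found` is hypothetical, and the commented-out request assumes Solr's standard `select` handler under the Solr URL from the crawl command; it is not taken from this report.

```python
import xml.etree.ElementTree as ET


def num_found(xml_text):
    """Return numFound from the <result name="response"> element."""
    root = ET.fromstring(xml_text)
    # iter() also yields the root itself, so this handles both a bare
    # <result> element and one nested inside a <response> wrapper.
    for result in root.iter("result"):
        if result.get("name") == "response":
            return int(result.get("numFound"))
    raise ValueError('no <result name="response"> element found')


sample = '<result name="response" numFound="320" start="0"></result>'
print(num_found(sample))  # prints 320

# Against a live instance (assumed endpoint, adjust as needed):
# from urllib.request import urlopen
# with urlopen("http://localhost:8983/solr/select?q=*:*&rows=0") as resp:
#     print(num_found(resp.read()))
```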

Checked hadoop.log and searched for the word "ERROR": found 3 results, caused by:
{code}
java.net.ConnectException: Connection timed out: connect
{code}

Searching for the word "Exception" finds results like this:
{quote}
2011-11-02 00:39:01,821 INFO  httpclient.HttpMethodDirector - I/O exception (org.apache.commons.httpclient.NoHttpResponseException) caught when processing request: The server www.jettycn.com failed to respond
2011-11-02 00:39:01,821 INFO  httpclient.HttpMethodDirector - Retrying request
{quote}
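
The two log searches can be repeated with a short script. This is a sketch only: `find_lines` is a hypothetical helper, the sample lines are copied from the log excerpt above, and the file path in the comment is an assumption about where hadoop.log lives.

```python
def find_lines(lines, word):
    """Return (line_number, text) pairs for every line containing `word`."""
    return [
        (n, line.rstrip("\n"))
        for n, line in enumerate(lines, start=1)
        if word in line
    ]


# Sample lines from the excerpt above. A real run would read the file,
# e.g. open("logs/hadoop.log") (the path is an assumption).
sample = [
    "2011-11-02 00:39:01,821 INFO  httpclient.HttpMethodDirector - "
    "I/O exception (org.apache.commons.httpclient.NoHttpResponseException) "
    "caught when processing request: The server www.jettycn.com failed to respond\n",
    "2011-11-02 00:39:01,821 INFO  httpclient.HttpMethodDirector - Retrying request\n",
]

for word in ("ERROR", "Exception"):
    hits = find_lines(sample, word)
    print(word, len(hits))
```

The match is case-sensitive, so "I/O exception" alone would not count as a hit for "Exception"; the first sample line matches because of the class name `NoHttpResponseException`.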

So there is no exception related to your patch in hadoop.log.
The patch works fine with branch-1.4 for me.
                
> remove LogUtil from trunk and nutch gora
> ----------------------------------------
>
>                 Key: NUTCH-1138
>                 URL: https://issues.apache.org/jira/browse/NUTCH-1138
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.4, nutchgora
>            Reporter: Lewis John McGibbney
>            Assignee: Lewis John McGibbney
>            Priority: Minor
>             Fix For: nutchgora, 1.5
>
>         Attachments: Document1.txt, NUTCH-1138-trunk-20111023.patch
>
>
> This should move towards the removal of the LogUtil class from both codebases as per comments in NUTCH-1078.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
