nutch-dev mailing list archives

From "Chris Schneider (JIRA)" <>
Subject [jira] Created: (NUTCH-252) Launching a segread/readdb command kills any running nutch commands
Date Fri, 21 Apr 2006 23:31:05 GMT
Launching a segread/readdb command kills any running nutch commands

         Key: NUTCH-252
     Project: Nutch
        Type: Bug

    Versions: 0.8-dev    
 Environment: multi-box installation using DFS (1 jobtracker/namenode master, 10 tasktracker/datanode
    Reporter: Chris Schneider
    Priority: Minor

I use a simple script to conduct a whole-web crawl (generate, fetch, updatedb, and repeat
until the target depth is reached). While this is running, I monitor progress via the jobtracker's
browser-based UI. Sometimes there's a fairly long pause between one mapreduce job completing
and the next one being launched, so I mistakenly assume that the target depth has been reached. I then
launch a segread -list or readdb -stats command to summarize the results. Doing so apparently
kills any active jobs, with no warning in the logs, the console output, or
the jobtracker's UI. The jobs just stop writing to the logs, and any child processes disappear.
Usually, the jobtracker and tasktrackers remain up and respond to subsequent commands.
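
For reference, a crawl loop of the kind described above might look roughly like the sketch below. This is not the reporter's actual script. The paths, the DEPTH value, the NUTCH variable, and the latest_segment helper are all assumptions made for illustration, and by default the sketch only prints the commands it would run rather than invoking Nutch.

```shell
#!/bin/sh
# Hypothetical crawl loop (generate, fetch, updatedb, repeat).
# All names and paths below are illustrative assumptions.
NUTCH="${NUTCH:-echo bin/nutch}"      # default is a dry run that only prints commands
CRAWLDB="${CRAWLDB:-crawl/crawldb}"   # assumed crawldb location
SEGMENTS="${SEGMENTS:-crawl/segments}" # assumed segments directory
DEPTH="${DEPTH:-3}"                   # assumed target depth

# Placeholder: locating the newest segment depends on the filesystem
# (local vs. DFS), so this sketch returns a made-up name per round.
latest_segment() {
  echo "$SEGMENTS/segment-$1"
}

crawl() {
  round=1
  while [ "$round" -le "$DEPTH" ]; do
    $NUTCH generate "$CRAWLDB" "$SEGMENTS" || return 1
    segment=$(latest_segment "$round")
    $NUTCH fetch "$segment" || return 1
    $NUTCH updatedb "$CRAWLDB" "$segment" || return 1
    round=$((round + 1))
  done
  # Explicit end marker, so completion does not have to be inferred
  # from a pause in the jobtracker UI.
  echo "crawl finished: depth $DEPTH reached"
}
```

Calling crawl at the end of such a script runs the loop. Waiting for the final "crawl finished" line before launching segread or readdb would avoid mistaking a pause between jobs for the end of the crawl, though it does not address the underlying bug.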
