mahout-user mailing list archives

From Jeff Eastman <jeast...@Narus.com>
Subject RE: About GBDT support
Date Wed, 23 Mar 2011 17:12:04 GMT
The console logs are not helpful; I need to see the command-line arguments you are using.
In particular, did you use the -cl option when running kmeans?
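
For illustration only, here is a toy sketch of what a point-assignment step is expected to produce: each input vector keeps its ID and is mapped to the nearest final centroid. This is not Mahout's code. The article IDs, the vectors, the centroids, and the function names are all made up, and Euclidean distance is an assumption (the job in question may use a different distance measure).

```python
import math

def nearest_cluster(vector, centroids):
    """Return the index of the centroid closest to `vector` (Euclidean distance)."""
    best_index, best_dist = -1, math.inf
    for index, centroid in enumerate(centroids):
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(vector, centroid)))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

def assign_points(named_vectors, centroids):
    """Map each article ID to the index of its nearest centroid."""
    return {article_id: nearest_cluster(vec, centroids)
            for article_id, vec in named_vectors.items()}

# Hypothetical toy data: article IDs mapped to 2-D vectors.
articles = {
    "article-101": [0.9, 0.1],
    "article-102": [0.8, 0.2],
    "article-203": [0.1, 0.95],
}
# Hypothetical final centroids, as if produced by the k-means iterations.
centroids = [[1.0, 0.0], [0.0, 1.0]]

assignments = assign_points(articles, centroids)
for article_id, cluster in sorted(assignments.items()):
    print(f"{article_id} belongs to cluster {cluster}")
```

The point of the sketch is that the ID travels with the vector: if the vectors handed to the assignment step carry no name, or the step is never run, there is nothing to report per article.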

-----Original Message-----
From: vishnu krishnan [mailto:vgrkrishnan@gmail.com]
Sent: Wednesday, March 23, 2011 10:02 AM
To: user@mahout.apache.org
Cc: Ted Dunning; Bai, Gang
Subject: Re: About GBDT support

We ran k-means news clustering on 53 news articles. We took the news
articles and article IDs from a database, and we get output like this.
What is meant by

0 belongs to cluster 1.0: [].

Is there only one cluster, and where is the article ID we had appended?
How can we know which article belongs to which cluster? Kindly help us
fix this problem.



OUTPUT

init:
deps-module-jar:
deps-ear-jar:
deps-jar:
compile-single:
run-main:
Mar 23, 2011 3:02:38 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Deleting newsClusters
Mar 23, 2011 3:02:38 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
Mar 23, 2011 3:02:38 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
Mar 23, 2011 3:02:39 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:39 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Running job: job_local_0001
Mar 23, 2011 3:02:39 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:39 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0001_m_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:39 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:39 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0001_m_000000_0 is allowed to commit now
Mar 23, 2011 3:02:39 PM
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0001_m_000000_0' to
newsClusters/tokenized-documents
Mar 23, 2011 3:02:39 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:39 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0001_m_000000_0' done.
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: map 100% reduce 0%
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Job complete: job_local_0001
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 5
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=8889540
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=9063087
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=53
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=0
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=53
Mar 23, 2011 3:02:40 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Cannot initialize JVM Metrics with processName=JobTracker, sessionId=
- already initialized
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
Mar 23, 2011 3:02:40 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Running job: job_local_0002
Mar 23, 2011 3:02:40 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: io.sort.mb = 100
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: data buffer = 79691776/99614720
Mar 23, 2011 3:02:40 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: record buffer = 262144/327680
Mar 23, 2011 3:02:40 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Max Ngram size is 2
Mar 23, 2011 3:02:40 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Emit Unitgrams is true
Mar 23, 2011 3:02:41 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
flush
INFO: Starting flush of map output
Mar 23, 2011 3:02:41 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: map 0% reduce 0%
Mar 23, 2011 3:02:41 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
sortAndSpill
INFO: Finished spill 0
Mar 23, 2011 3:02:41 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0002_m_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:41 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:41 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0002_m_000000_0' done.
Mar 23, 2011 3:02:41 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:41 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Merging 1 sorted segments
Mar 23, 2011 3:02:41 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Down to the last merge-pass, with 1 segments left of total size:
1104153 bytes
Mar 23, 2011 3:02:41 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:41 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Min support is 5
Mar 23, 2011 3:02:41 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Emit Unitgrams is true
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0002_r_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0002_r_000000_0 is allowed to commit now
Mar 23, 2011 3:02:42 PM
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0002_r_000000_0' to
newsClusters/wordcount/subgrams
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO: reduce > reduce
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0002_r_000000_0' done.
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: map 100% reduce 100%
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Job complete: job_local_0002
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 14
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: org.apache.mahout.vectorizer.collocations.llr.CollocMapper$Count
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: NGRAM_TOTAL=12244
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: org.apache.mahout.vectorizer.collocations.llr.CollocReducer$Skipped
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: LESS_THAN_MIN_SUPPORT=22811
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=36602035
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=38059703
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input groups=10348
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: Combine output records=30502
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=53
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce shuffle bytes=0
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce output records=795
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=61004
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: Map output bytes=1512720
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: Combine input records=53681
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=53681
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input records=30502
Mar 23, 2011 3:02:42 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Cannot initialize JVM Metrics with processName=JobTracker, sessionId=
- already initialized
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
Mar 23, 2011 3:02:42 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Running job: job_local_0003
Mar 23, 2011 3:02:42 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: io.sort.mb = 100
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: data buffer = 79691776/99614720
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: record buffer = 262144/327680
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
flush
INFO: Starting flush of map output
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
sortAndSpill
INFO: Finished spill 0
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0003_m_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0003_m_000000_0' done.
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Merging 1 sorted segments
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Down to the last merge-pass, with 1 segments left of total size: 17039
bytes
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:42 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: NGram Total is 12244
Mar 23, 2011 3:02:42 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Min LLR value is 1.0
Mar 23, 2011 3:02:42 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Emit Unitgrams is true
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0003_r_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0003_r_000000_0 is allowed to commit now
Mar 23, 2011 3:02:42 PM
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0003_r_000000_0' to
newsClusters/wordcount/ngrams
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO: reduce > reduce
Mar 23, 2011 3:02:42 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0003_r_000000_0' done.
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: map 100% reduce 100%
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Job complete: job_local_0003
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 12
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=55302923
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=55833922
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input groups=707
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: Combine output records=0
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=795
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce shuffle bytes=0
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce output records=707
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=1590
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: Map output bytes=15447
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: Combine input records=0
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=795
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input records=795
Mar 23, 2011 3:02:43 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Cannot initialize JVM Metrics with processName=JobTracker, sessionId=
- already initialized
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
Mar 23, 2011 3:02:43 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Running job: job_local_0004
Mar 23, 2011 3:02:43 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:43 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: io.sort.mb = 100
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: data buffer = 79691776/99614720
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: record buffer = 262144/327680
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
flush
INFO: Starting flush of map output
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
sortAndSpill
INFO: Finished spill 0
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0004_m_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0004_m_000000_0' done.
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Merging 1 sorted segments
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Down to the last merge-pass, with 1 segments left of total size: 89679
bytes
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0004_r_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0004_r_000000_0 is allowed to commit now
Mar 23, 2011 3:02:44 PM
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0004_r_000000_0' to
newsClusters/partial-vectors-0
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO: reduce > reduce
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0004_r_000000_0' done.
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: map 100% reduce 100%
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Job complete: job_local_0004
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 12
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=73176996
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=73805246
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input groups=53
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: Combine output records=0
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=53
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce shuffle bytes=0
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce output records=53
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=106
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: Map output bytes=89466
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: Combine input records=0
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=53
Mar 23, 2011 3:02:44 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input records=53
Mar 23, 2011 3:02:44 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Cannot initialize JVM Metrics with processName=JobTracker, sessionId=
- already initialized
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
Mar 23, 2011 3:02:45 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Running job: job_local_0005
Mar 23, 2011 3:02:45 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: io.sort.mb = 100
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: data buffer = 79691776/99614720
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: record buffer = 262144/327680
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
flush
INFO: Starting flush of map output
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
sortAndSpill
INFO: Finished spill 0
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0005_m_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0005_m_000000_0' done.
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Merging 1 sorted segments
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Down to the last merge-pass, with 1 segments left of total size: 42707
bytes
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0005_r_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0005_r_000000_0 is allowed to commit now
Mar 23, 2011 3:02:45 PM
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0005_r_000000_0' to
newsClusters/tf-vectors
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO: reduce > reduce
Mar 23, 2011 3:02:45 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0005_r_000000_0' done.
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: map 100% reduce 100%
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Job complete: job_local_0005
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 12
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=90946183
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=91678644
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input groups=53
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: Combine output records=0
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=53
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce shuffle bytes=0
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce output records=53
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=106
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: Map output bytes=42496
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: Combine input records=0
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=53
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input records=53
Mar 23, 2011 3:02:46 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Deleting newsClusters/partial-vectors-0
Mar 23, 2011 3:02:46 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Cannot initialize JVM Metrics with processName=JobTracker, sessionId=
- already initialized
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
Mar 23, 2011 3:02:46 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Running job: job_local_0006
Mar 23, 2011 3:02:46 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: io.sort.mb = 100
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: data buffer = 79691776/99614720
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: record buffer = 262144/327680
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
flush
INFO: Starting flush of map output
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
sortAndSpill
INFO: Finished spill 0
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0006_m_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0006_m_000000_0' done.
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Merging 1 sorted segments
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Down to the last merge-pass, with 1 segments left of total size: 9914
bytes
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0006_r_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0006_r_000000_0 is allowed to commit now
Mar 23, 2011 3:02:46 PM
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0006_r_000000_0' to
newsClusters/df-count
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO: reduce > reduce
Mar 23, 2011 3:02:46 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0006_r_000000_0' done.
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: map 100% reduce 100%
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Job complete: job_local_0006
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 12
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=108620888
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=109456559
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input groups=708
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: Combine output records=708
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=53
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce shuffle bytes=0
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce output records=708
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=1416
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: Map output bytes=56220
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: Combine input records=4685
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=4685
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input records=708
Mar 23, 2011 3:02:47 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Cannot initialize JVM Metrics with processName=JobTracker, sessionId=
- already initialized
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
Mar 23, 2011 3:02:47 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Running job: job_local_0007
Mar 23, 2011 3:02:47 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: io.sort.mb = 100
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: data buffer = 79691776/99614720
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: record buffer = 262144/327680
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
flush
INFO: Starting flush of map output
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
sortAndSpill
INFO: Finished spill 0
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0007_m_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0007_m_000000_0' done.
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Merging 1 sorted segments
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Down to the last merge-pass, with 1 segments left of total size: 42707
bytes
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0007_r_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0007_r_000000_0 is allowed to commit now
Mar 23, 2011 3:02:47 PM
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0007_r_000000_0' to
newsClusters/partial-vectors-0
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO: reduce > reduce
Mar 23, 2011 3:02:47 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0007_r_000000_0' done.
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: map 100% reduce 100%
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Job complete: job_local_0007
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 12
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=126340072
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=127288605
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input groups=53
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: Combine output records=0
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=53
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce shuffle bytes=0
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce output records=53
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=106
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: Map output bytes=42496
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: Combine input records=0
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=53
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input records=53
Mar 23, 2011 3:02:48 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Cannot initialize JVM Metrics with processName=JobTracker, sessionId=
- already initialized
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
Mar 23, 2011 3:02:48 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Running job: job_local_0008
Mar 23, 2011 3:02:48 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: io.sort.mb = 100
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: data buffer = 79691776/99614720
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: record buffer = 262144/327680
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
flush
INFO: Starting flush of map output
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
sortAndSpill
INFO: Finished spill 0
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0008_m_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:48 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0008_m_000000_0' done.
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Merging 1 sorted segments
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Down to the last merge-pass, with 1 segments left of total size: 907
bytes
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0008_r_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0008_r_000000_0 is allowed to commit now
Mar 23, 2011 3:02:49 PM
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0008_r_000000_0' to
newsClusters/tfidf-vectors
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO: reduce > reduce
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0008_r_000000_0' done.
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: map 100% reduce 100%
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Job complete: job_local_0008
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 12
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=143935937
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=144993725
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input groups=53
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: Combine output records=0
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=53
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce shuffle bytes=0
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce output records=53
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=106
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: Map output bytes=799
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: Combine input records=0
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=53
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input records=53
Mar 23, 2011 3:02:49 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Deleting newsClusters/partial-vectors-0
Mar 23, 2011 3:02:49 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Build Clusters Input: newsClusters/tfidf-vectors Out:
newsClusters/canopy-centroids Measure:
org.apache.mahout.common.distance.EuclideanDistanceMeasure@903025 t1: 250.0
t2: 120.0
Mar 23, 2011 3:02:49 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Cannot initialize JVM Metrics with processName=JobTracker, sessionId=
- already initialized
Mar 23, 2011 3:02:49 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
Mar 23, 2011 3:02:50 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Running job: job_local_0009
Mar 23, 2011 3:02:50 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: io.sort.mb = 100
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: data buffer = 79691776/99614720
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: record buffer = 262144/327680
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
flush
INFO: Starting flush of map output
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
sortAndSpill
INFO: Finished spill 0
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0009_m_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0009_m_000000_0' done.
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Merging 1 sorted segments
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Down to the last merge-pass, with 1 segments left of total size: 17
bytes
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0009_r_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0009_r_000000_0 is allowed to commit now
Mar 23, 2011 3:02:50 PM
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0009_r_000000_0' to
newsClusters/canopy-centroids/clusters-0
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO: reduce > reduce
Mar 23, 2011 3:02:50 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0009_r_000000_0' done.
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: map 100% reduce 100%
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Job complete: job_local_0009
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 12
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=161475007
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=162703107
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input groups=1
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: Combine output records=0
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=53
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce shuffle bytes=0
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce output records=1
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=2
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: Map output bytes=13
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: Combine input records=0
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=1
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input records=1
Mar 23, 2011 3:02:51 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Input: newsClusters/tfidf-vectors Clusters In:
newsClusters/canopy-centroids/clusters-0 Out: newsClusters/clusters
Distance: org.apache.mahout.common.distance.TanimotoDistanceMeasure
Mar 23, 2011 3:02:51 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: convergence: 0.01 max Iterations: 20 num Reduce Tasks:
org.apache.mahout.math.VectorWritable Input Vectors: {}
Mar 23, 2011 3:02:51 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: K-Means Iteration 1
Mar 23, 2011 3:02:51 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Cannot initialize JVM Metrics with processName=JobTracker, sessionId=
- already initialized
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
Mar 23, 2011 3:02:51 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Running job: job_local_0010
Mar 23, 2011 3:02:51 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: io.sort.mb = 100
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: data buffer = 79691776/99614720
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
<init>
INFO: record buffer = 262144/327680
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
flush
INFO: Starting flush of map output
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.MapTask$MapOutputBuffer
sortAndSpill
INFO: Finished spill 0
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0010_m_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0010_m_000000_0' done.
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Merging 1 sorted segments
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Merger$MergeQueue merge
INFO: Down to the last merge-pass, with 1 segments left of total size: 29
bytes
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0010_r_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0010_r_000000_0 is allowed to commit now
Mar 23, 2011 3:02:51 PM
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0010_r_000000_0' to
newsClusters/clusters/clusters-1
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO: reduce > reduce
Mar 23, 2011 3:02:51 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0010_r_000000_0' done.
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: map 100% reduce 100%
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Job complete: job_local_0010
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 13
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Clustering
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Converged Clusters=1
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=179033398
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=180412134
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input groups=1
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Combine output records=1
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=53
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce shuffle bytes=0
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce output records=1
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=2
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Map output bytes=1325
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Combine input records=53
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=53
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Counters log
INFO: Reduce input records=1
Mar 23, 2011 3:02:52 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Clustering data
Mar 23, 2011 3:02:52 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Running Clustering
Mar 23, 2011 3:02:52 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: Input: newsClusters/tfidf-vectors Clusters In:
newsClusters/clusters/clusters-1 Out: newsClusters/clusters/clusteredPoints
Distance: org.apache.mahout.common.distance.TanimotoDistanceMeasure@1958cc2
Mar 23, 2011 3:02:52 PM org.slf4j.impl.JCLLoggerAdapter info
INFO: convergence: 0.01 Input Vectors: org.apache.mahout.math.VectorWritable
Mar 23, 2011 3:02:52 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Cannot initialize JVM Metrics with processName=JobTracker, sessionId=
- already initialized
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.JobClient
configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications
should implement Tool for the same.
Mar 23, 2011 3:02:52 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Running job: job_local_0011
Mar 23, 2011 3:02:52 PM
org.apache.hadoop.mapreduce.lib.input.FileInputFormat listStatus
INFO: Total input paths to process : 1
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Task done
INFO: Task:attempt_local_0011_m_000000_0 is done. And is in the process of
commiting
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Task commit
INFO: Task attempt_local_0011_m_000000_0 is allowed to commit now
Mar 23, 2011 3:02:52 PM
org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter commitTask
INFO: Saved output of task 'attempt_local_0011_m_000000_0' to
newsClusters/clusters/clusteredPoints
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.LocalJobRunner$Job
statusUpdate
INFO:
Mar 23, 2011 3:02:52 PM org.apache.hadoop.mapred.Task sendDone
INFO: Task 'attempt_local_0011_m_000000_0' done.
Mar 23, 2011 3:02:53 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: map 100% reduce 0%
Mar 23, 2011 3:02:53 PM org.apache.hadoop.mapred.JobClient
monitorAndPrintJob
INFO: Job complete: job_local_0011
Mar 23, 2011 3:02:53 PM org.apache.hadoop.mapred.Counters log
INFO: Counters: 5
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
0 belongs to cluster 1.0: []
Mar 23, 2011 3:02:53 PM org.apache.hadoop.mapred.Counters log
INFO: FileSystemCounters
Mar 23, 2011 3:02:53 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_READ=98289121
Mar 23, 2011 3:02:53 PM org.apache.hadoop.mapred.Counters log
INFO: FILE_BYTES_WRITTEN=99057511
Mar 23, 2011 3:02:53 PM org.apache.hadoop.mapred.Counters log
INFO: Map-Reduce Framework
Mar 23, 2011 3:02:53 PM org.apache.hadoop.mapred.Counters log
INFO: Map input records=53
Mar 23, 2011 3:02:53 PM org.apache.hadoop.mapred.Counters log
INFO: Spilled Records=0
Mar 23, 2011 3:02:53 PM org.apache.hadoop.mapred.Counters log
INFO: Map output records=53
BUILD SUCCESSFUL (total time: 15 seconds)
