mahout-dev mailing list archives

From Jeff Eastman
Subject Odd Behavior
Date Fri, 22 Jul 2011 19:13:47 GMT
I'm running the mean shift canopy driver over a pretty slow VPN connection and it appears to
be resubmitting the job.jar for each iteration. When I run

./bin/mahout org.apache.mahout.clustering.meanshift.MeanShiftCanopyDriver -Dmapred.reduce.tasks=3 \
  -i syntheticControl -o output -ic true -ow -x 10 -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
  -cd 0.0001 -t1 47.6 -t2 1 -cl

... it prints the first iteration message to the transcript immediately, then pauses for a
minute or two while it uploads the jar, then runs the iteration. It then prints the next
iteration message immediately and pauses again, and the same delay repeats on every iteration.
It looks to me like bin/mahout is running the driver locally, and each job it submits is being
run remotely on the cluster, with the jar uploaded again each time. On the fast network in the
office I never noticed this. Is this typical?
