Date: Tue, 9 Aug 2011 13:07:50 -0400
Subject: Is this a bug or a setup issue for using NewsKMeansClustering.java
From: eric skinner
To: dev@mahout.apache.org

Hello,

I am practicing with NewsKMeansClustering.java, an example given in chapter 9 of Mahout in Action. I ran the program against a directory of sequence files, and it failed with the following error:

Exception in thread "main" java.io.FileNotFoundException: File newsClusters/clustersclusteredPoints/part-m-00000 does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
    at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:676)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
    at mia.clustering.ch09.NewsKMeansClustering.main(NewsKMeansClustering.java:76)

For reference, the directory structure generated by the run is:

~/workspaceMahout1/recommender/newsClusters% ls
canopy-centroids  clusters  df-count  dictionary.file-0  frequency.file-0  tfidf-vectors  tf-vectors  tokenized-documents  wordcount

~/workspaceMahout1/recommender/newsClusters/clusters/clusteredPoints% ls
part-m-00000

I then changed the original code from

    new Path(clusterOutput + Cluster.CLUSTERED_POINTS_DIR + "/part-m-00000"), conf);

to

    new Path(clusterOutput + "/clusteredPoints" + "/part-m-00000"), conf);

With this change the program runs without the error above. Is this a bug in the original code, or is there some other hidden issue?
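
The path in the error message, newsClusters/clustersclusteredPoints/part-m-00000, has no separator between "clusters" and "clusteredPoints". That suggests the constant carries no leading slash. The sketch below reproduces the effect with plain strings and java.nio.file.Paths, so it runs without Hadoop. The class name, the value "clusteredPoints" for the constant, and the value "newsClusters/clusters" for clusterOutput are assumptions inferred from the message and the directory listing above, not taken from the Mahout source.

```java
import java.nio.file.Paths;

public class ClusteredPointsPath {
    // Assumed value of Cluster.CLUSTERED_POINTS_DIR, inferred from the
    // error message (no leading slash).
    static final String CLUSTERED_POINTS_DIR = "clusteredPoints";

    // Plain string concatenation, as in the original line of the example:
    // nothing inserts a separator before the directory name.
    static String concatenated(String clusterOutput) {
        return clusterOutput + CLUSTERED_POINTS_DIR + "/part-m-00000";
    }

    // Joining component by component lets the library insert separators.
    // Backslashes are normalized so the result looks the same on Windows.
    static String joined(String clusterOutput) {
        return Paths.get(clusterOutput, CLUSTERED_POINTS_DIR, "part-m-00000")
                .toString()
                .replace('\\', '/');
    }

    public static void main(String[] args) {
        String clusterOutput = "newsClusters/clusters"; // assumed value
        System.out.println(concatenated(clusterOutput));
        System.out.println(joined(clusterOutput));
    }
}
```

In the Hadoop version of the same idea, the two-argument constructor new Path(parent, child) joins the components with a separator, which avoids hard-coding "/clusteredPoints" in place of the constant.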