From: Kris Jack <mrkrisjack@gmail.com>
Date: Thu, 10 Jun 2010 13:28:13 +0100
Subject: java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
To: user@mahout.apache.org

While trying to create a document-document similarity matrix, I am getting the following error:

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
10-Jun-2010 13:25:04 org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
10-Jun-2010 13:25:04 org.apache.hadoop.mapred.JobClient configureCommandLineOptions
WARNING: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10-Jun-2010 13:25:04 org.apache.hadoop.mapred.JobClient configureCommandLineOptions
WARNING: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
10-Jun-2010 13:25:04 org.apache.hadoop.mapred.FileInputFormat listStatus
INFO: Total input paths to process : 1
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Running job: job_local_0001
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.FileInputFormat listStatus
INFO: Total input paths to process : 1
10-Jun-2010 13:25:05 org.apache.hadoop.util.NativeCodeLoader
WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
10-Jun-2010 13:25:05 org.apache.hadoop.io.compress.CodecPool getDecompressor
INFO: Got brand-new decompressor
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.MapTask runOldMapper
INFO: numReduceTasks: 1
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.MapTask$MapOutputBuffer
INFO: io.sort.mb = 100
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.MapTask$MapOutputBuffer
INFO: data buffer = 79691776/99614720
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.MapTask$MapOutputBuffer
INFO: record buffer = 262144/327680
10-Jun-2010 13:25:05 org.apache.hadoop.mapred.LocalJobRunner$Job run
WARNING: job_local_0001
java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.io.IntWritable
        at org.apache.mahout.math.hadoop.TransposeJob$TransposeMapper.map(TransposeJob.java:1)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
10-Jun-2010 13:25:06 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: map 0% reduce 0%
10-Jun-2010 13:25:06 org.apache.hadoop.mapred.JobClient monitorAndPrintJob
INFO: Job complete: job_local_0001
10-Jun-2010 13:25:06 org.apache.hadoop.mapred.Counters log
INFO: Counters: 0
Exception in thread "main" java.lang.RuntimeException: java.io.IOException: Job failed!
        at org.apache.mahout.math.hadoop.DistributedRowMatrix.transpose(DistributedRowMatrix.java:163)
        at org.apache.mahout.math.hadoop.GenSimMatrixLocal.generateMatrix(GenSimMatrixLocal.java:24)
        at org.apache.mahout.math.hadoop.GenSimMatrixLocal.main(GenSimMatrixLocal.java:34)
Caused by: java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
        at org.apache.mahout.math.hadoop.DistributedRowMatrix.transpose(DistributedRowMatrix.java:158)
        ... 2 more

I created a test Solr index with 3 documents and generated a sparse feature matrix from it using Mahout's org.apache.mahout.utils.vectors.lucene.Driver.
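In case it is relevant, a quick check along these lines should show exactly which key and value classes the Driver wrote into that file. This is only a sketch: the path is my local test file and the class name is just for illustration.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;

public class CheckVectorKeyClass {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path("/home/kris/data/mahoutIndexTFIDF.vec");
    // Open the sequence file written by lucene.Driver and report the
    // key/value classes recorded in its header.
    SequenceFile.Reader reader =
        new SequenceFile.Reader(FileSystem.get(conf), path, conf);
    try {
      System.out.println("key class:   " + reader.getKeyClassName());
      System.out.println("value class: " + reader.getValueClassName());
    } finally {
      reader.close();
    }
  }
}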
I then ran the following code, using the sparse feature matrix as input (mahoutIndexTFIDF.vec):

package org.apache.mahout.math.hadoop;

import org.apache.hadoop.mapred.JobConf;

public class GenSimMatrixLocal {

  private void generateMatrix() {
    String inputPath = "/home/kris/data/mahoutIndexTFIDF.vec";
    String tmpPath = "/tmp/matrixMultiplySpace";
    int numDocuments = 3;
    int numTerms = 4;

    // Load the sparse TF-IDF vectors as a distributed row matrix (3 docs x 4 terms).
    DistributedRowMatrix text =
        new DistributedRowMatrix(inputPath, tmpPath, numDocuments, numTerms);
    JobConf conf = new JobConf("similarity job");
    text.configure(conf);

    // Transpose the document-term matrix, then multiply to build the
    // document-document similarity matrix.
    DistributedRowMatrix transpose = text.transpose();
    DistributedRowMatrix similarity = transpose.times(transpose);
    System.out.println("Similarity matrix lives: " + similarity.getRowPath());
  }

  public static void main(String[] args) {
    GenSimMatrixLocal similarity = new GenSimMatrixLocal();
    similarity.generateMatrix();
  }
}

Does anyone see why there is a casting problem between LongWritable and IntWritable? Does the job need to be configured differently?

Thanks,
Kris
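P.S. If the problem is simply that the vector file has LongWritable keys while TransposeJob expects IntWritable ones, would rewriting the file along these lines be a reasonable workaround? Again only a sketch: it assumes the values are VectorWritable, and the class name and output path are made up.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.mahout.math.VectorWritable;

public class RewriteKeysAsInts {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path in = new Path("/home/kris/data/mahoutIndexTFIDF.vec");
    Path out = new Path("/home/kris/data/mahoutIndexTFIDF-int.vec");

    SequenceFile.Reader reader = new SequenceFile.Reader(fs, in, conf);
    SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf, out,
        IntWritable.class, VectorWritable.class);
    LongWritable key = new LongWritable();
    VectorWritable value = new VectorWritable();
    try {
      // Copy every (LongWritable, VectorWritable) pair, narrowing the key to an int.
      while (reader.next(key, value)) {
        writer.append(new IntWritable((int) key.get()), value);
      }
    } finally {
      reader.close();
      writer.close();
    }
  }
}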