mahout-dev mailing list archives

From "Dmitriy Lyubimov (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (MAHOUT-633) Add SequenceFileIterable; put Iterable stuff in one place
Date Wed, 23 Mar 2011 22:41:05 GMT

    [ https://issues.apache.org/jira/browse/MAHOUT-633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13010497#comment-13010497 ]

Dmitriy Lyubimov edited comment on MAHOUT-633 at 3/23/11 10:39 PM:
-------------------------------------------------------------------

If you want to be true to the Hadoop contract, you need to refactor the following to use Hadoop's
ReflectionUtils and pass in the configuration. There are tons of Writables around that are
also Configurable, including one in one of my Mahout branches, which equips VectorWritable with
additional capabilities and controls them by making it Configurable.

{code}
private void instantiateKeyValue() throws IOException {
+    try {
+      key = keyClass.newInstance();
+      if (noValue) {
+        value = null;
+      } else {
+        value = valueClass.newInstance();
+      }
+    } catch (InstantiationException ie) {
+      throw new IOException(ie);
+    } catch (IllegalAccessException iae) {
+      throw new IOException(iae);
+    }
+  }
{code}
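
To illustrate the point, here is a minimal, self-contained sketch of the pattern that Hadoop's ReflectionUtils.newInstance(Class, Configuration) follows: instantiate the class reflectively, then hand it the configuration if it is Configurable. The types Conf, Configurable and ConfiguredValue, and the key "sketch.mode", are hypothetical stand-ins so the example runs without Hadoop on the classpath. They are not the real Hadoop or Mahout classes, and the real utility does more than this.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class ReflectionSketch {

  /** Hypothetical stand-in for org.apache.hadoop.conf.Configuration. */
  public static class Conf {
    private final Map<String, String> props = new HashMap<>();

    public void set(String key, String value) {
      props.put(key, value);
    }

    public String get(String key, String defaultValue) {
      return props.getOrDefault(key, defaultValue);
    }
  }

  /** Hypothetical stand-in for org.apache.hadoop.conf.Configurable. */
  public interface Configurable {
    void setConf(Conf conf);
  }

  /** Hypothetical value type that reads a setting from the configuration. */
  public static class ConfiguredValue implements Configurable {
    public String mode = "unset";

    public ConfiguredValue() {
    }

    @Override
    public void setConf(Conf conf) {
      mode = conf.get("sketch.mode", "default");
    }
  }

  /**
   * Simplified imitation of the ReflectionUtils.newInstance pattern:
   * create the object, then configure it if it is Configurable.
   * A plain keyClass.newInstance() would skip the second step.
   */
  public static <T> T newInstance(Class<T> clazz, Conf conf) throws IOException {
    try {
      T instance = clazz.getDeclaredConstructor().newInstance();
      if (instance instanceof Configurable) {
        ((Configurable) instance).setConf(conf);
      }
      return instance;
    } catch (ReflectiveOperationException e) {
      // Mirrors the patch, which rethrows reflection failures as IOException.
      throw new IOException(e);
    }
  }

  public static void main(String[] args) throws IOException {
    Conf conf = new Conf();
    conf.set("sketch.mode", "sparse");
    ConfiguredValue value = newInstance(ConfiguredValue.class, conf);
    System.out.println(value.mode); // prints "sparse"
  }
}
```

In the patch itself, the equivalent change would presumably be to replace keyClass.newInstance() and valueClass.newInstance() with ReflectionUtils.newInstance(keyClass, conf) and ReflectionUtils.newInstance(valueClass, conf), assuming a Configuration is available at that point in the iterator.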

> Add SequenceFileIterable; put Iterable stuff in one place
> ---------------------------------------------------------
>
>                 Key: MAHOUT-633
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-633
>             Project: Mahout
>          Issue Type: Improvement
>          Components: Classification, Clustering, Collaborative Filtering
>    Affects Versions: 0.4
>            Reporter: Sean Owen
>            Assignee: Sean Owen
>            Priority: Minor
>              Labels: iterable, iterator, sequence-file
>             Fix For: 0.5
>
>         Attachments: MAHOUT-633.patch
>
>
> In another project I have a useful little class, SequenceFileIterable, which simplifies
iterating over a sequence file. It's like FileLineIterable. I'd like to add it, then use it
throughout the code. See patch, which for now merely has the proposed new classes. 
> Well it also moves some other iterator-related classes that seemed to be outside their
rightful home in common.iterator.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
