orc-dev mailing list archives

From "Philipp Blum (JIRA)" <j...@apache.org>
Subject [jira] [Created] (ORC-168) Documentation Writer Example: batch.size++ has a wrong position
Date Sat, 25 Mar 2017 21:51:41 GMT
Philipp Blum created ORC-168:
--------------------------------

             Summary: Documentation Writer Example: batch.size++ has a wrong position
                 Key: ORC-168
                 URL: https://issues.apache.org/jira/browse/ORC-168
             Project: ORC
          Issue Type: Improvement
          Components: documentation, Java
            Reporter: Philipp Blum
            Priority: Critical


There's one small mistake in the Java Core Example. The for loop starts with batch.size++:

for(int r=0; r < 10000; ++r) {
  int row = batch.size++;
  x.vector[row] = r;
  y.vector[row] = r * 3;
  // If the batch is full, write it out and start over.
  if (batch.size == batch.getMaxSize()) {
    writer.addRowBatch(batch);
    batch.reset();
  }
}

If you start with batch.size++, the first index will be 1, so the first entry in the ORC
file will be empty.
The correct version is:

for(int r=0; r < 10000; ++r) {
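  // Read the current index first, so that row is declared before it is used.
  int row = batch.size;
  x.vector[row] = r;
  y.vector[row] = r * 3;
  // Increment the batch size only after the row has been filled.
  batch.size++;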
  // If the batch is full, write it out and start over.
  if (batch.size == batch.getMaxSize()) {
    writer.addRowBatch(batch);
    batch.reset();
  }
}

I already tested it in Scala.
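
One quick way to check which index the first write lands on is to run both orderings against a stand-in for the batch. The sketch below is only an illustration: `FakeBatch` and both method names are hypothetical and not part of the ORC API; they model nothing more than a size counter and one column array.

```java
public class BatchIndexCheck {
    // Hypothetical stand-in for a row batch: a size counter and one column.
    static class FakeBatch {
        int size = 0;
        final long[] vector = new long[4];
    }

    // Index of the first write when batch.size++ opens the loop body.
    static int firstIndexIncrementFirst() {
        FakeBatch batch = new FakeBatch();
        int row = batch.size++;   // post-increment evaluates to the old value of size
        batch.vector[row] = 42;
        return row;
    }

    // Index of the first write when the index is read first and size is incremented last.
    static int firstIndexIncrementLast() {
        FakeBatch batch = new FakeBatch();
        int row = batch.size;
        batch.vector[row] = 42;
        batch.size++;
        return row;
    }

    public static void main(String[] args) {
        System.out.println("increment first: row = " + firstIndexIncrementFirst());
        System.out.println("increment last:  row = " + firstIndexIncrementLast());
    }
}
```

In Java, `batch.size++` evaluates to the value before the increment, so both methods return 0; only the pre-increment form `++batch.size` would make the first index 1.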



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
