hadoop-user mailing list archives

From Jim Twensky <jim.twen...@gmail.com>
Subject Wrapping around BitSet with the Writable interface
Date Sun, 12 May 2013 18:24:48 GMT
I have large java.util.BitSet objects that I want to bitwise-OR using a
MapReduce job. I decided to wrap each object in a class that implements the
Writable interface. Right now I convert each BitSet to a byte array and
serialize that byte array to disk.

Converting them to byte arrays is a bit inefficient, but I could not find a
workaround for writing them directly to the DataOutput. Is there a way to
skip this step and serialize the object directly? Here is what my current
implementation looks like:

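import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.BitSet;

import org.apache.hadoop.io.Writable;

public class BitSetWritable implements Writable {

  private BitSet bs;

  public BitSetWritable() {
    this.bs = new BitSet();
  }

  @Override
  public void write(DataOutput out) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream(bs.size() / 8);
    ObjectOutputStream oos = new ObjectOutputStream(bos);
    oos.writeObject(bs);
    // Close (and thereby flush) before reading the buffer; ObjectOutputStream
    // buffers internally, so toByteArray() could otherwise miss trailing bytes.
    oos.close();
    byte[] bytes = bos.toByteArray();
    // The length prefix tells readFields() how many bytes to read back.
    out.writeInt(bytes.length);
    out.write(bytes);
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    int len = in.readInt();
    byte[] bytes = new byte[len];
    in.readFully(bytes);

    ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes));
    try {
      bs = (BitSet) ois.readObject();
    } catch (ClassNotFoundException e) {
      throw new IOException(e);
    } finally {
      ois.close();
    }
  }

}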
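
For comparison, here is a rough sketch of one possible way to avoid Java object serialization: write the BitSet's backing 64-bit words straight to the DataOutput. It assumes Java 7 or later (for BitSet.toLongArray() and BitSet.valueOf(long[])). BitSetIO is a hypothetical helper name used only so the example runs without Hadoop on the classpath; its two static methods hold what would become the bodies of write() and readFields(). Note that toLongArray() still copies the word array once, so this only removes the ObjectOutputStream overhead, not every copy.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.BitSet;

// Hypothetical helper (not part of Hadoop or of the original post).
public class BitSetIO {

  // Writes the number of 64-bit words, followed by the words themselves.
  public static void write(BitSet bs, DataOutput out) throws IOException {
    long[] words = bs.toLongArray(); // Java 7+; copies the backing array
    out.writeInt(words.length);
    for (long w : words) {
      out.writeLong(w);
    }
  }

  // Reads the word count, then the words, and rebuilds the BitSet.
  public static BitSet read(DataInput in) throws IOException {
    int n = in.readInt();
    long[] words = new long[n];
    for (int i = 0; i < n; i++) {
      words[i] = in.readLong();
    }
    return BitSet.valueOf(words); // Java 7+
  }

  // Round-trip check through an in-memory buffer.
  public static void main(String[] args) throws IOException {
    BitSet original = new BitSet();
    original.set(3);
    original.set(64);
    original.set(1000);

    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    write(original, new DataOutputStream(bos));

    BitSet copy = read(new DataInputStream(
        new ByteArrayInputStream(bos.toByteArray())));
    System.out.println(original.equals(copy)); // prints "true"
  }
}
```

In BitSetWritable, write(DataOutput) would call the first method on the bs field and readFields(DataInput) would assign the result of the second to it.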
