hadoop-common-dev mailing list archives

From eishay <eis...@gmail.com>
Subject Re: [jira] Created: (HADOOP-3788) Add serialization for Protocol Buffers
Date Thu, 13 Nov 2008 20:01:31 GMT

Sorry to interrupt. 
I had a similar problem and used this wrapper class to solve it. The
implementation is a bit tricky, but using the class is simple.

The implementation:
-----------------------------------------------------------------------------
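import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.lang.reflect.Method;

import com.google.protobuf.GeneratedMessage;

/**
 * Manually serializes Protobuf objects.
 * The serialized form starts with an integer SIZE, the length in bytes of
 * the message's class name, followed by SIZE bytes of class name.
 * After that comes a second integer with the length of the serialized
 * protobuf, followed by that many bytes of the serialized protobuf object.
 * A null object is written as the single integer zero.
 * @author esmith
 */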
class ProtobufSerializer<T extends GeneratedMessage> implements Externalizable
{
  /**
   * Object to serialize
   */
  private transient T _proto;
  private transient String _className; 

  public ProtobufSerializer()
  {
  }
  
  public ProtobufSerializer(T proto)
  {
    _proto = proto;
    if(null != _proto)
    {
      _className = _proto.getClass().getName();
    }
  }
  
  public T get()
  {
    return _proto;
  }

  /**
   * If the leading size integer is zero, the object is null
   * @see java.io.Externalizable#readExternal(java.io.ObjectInput)
   */
  @SuppressWarnings("unchecked")
  public void readExternal (ObjectInput in)
      throws IOException, ClassNotFoundException
  {
    int size = in.readInt();
    if(0 == size)
    {
      return;
    }
    byte[] array = new byte[size];
    in.readFully(array, 0, size);
    _className = new String(array);
    size = in.readInt();
    array = new byte[size];
    in.readFully(array, 0, size);
    try
    {
      Class<?> clazz = getClass().getClassLoader().loadClass(_className);
      Method parseMethod = clazz.getMethod("parseFrom", array.getClass());
      _proto = (T)parseMethod.invoke(clazz, array);
    }
    catch (Exception e)
    {
      // Attach the underlying cause to the IOException instead of only printing it
      throw new IOException("could not load or parse class " + _className, e);
    }
  }

  /**
   * If the object is null then the int zero is written to the stream
   * @see java.io.Externalizable#writeExternal(java.io.ObjectOutput)
   */
  public void writeExternal (ObjectOutput out)
      throws IOException
  {
    if(null == _proto)
    {
      out.writeInt(0);
      return;
    }
    out.writeInt(_className.getBytes().length);
    out.write(_className.getBytes());
    // toByteArray() yields the same bytes as writing to a ByteArrayOutputStream
    byte[] array = _proto.toByteArray();
    out.writeInt(array.length);
    out.write(array);
  }
}
-----------------------------------------------------------------------------

Using it:
-----------------------------------------------------------------------------
  /**
   * a wrapper that makes sure the media is java serializable
   */
  private ProtobufSerializer<NewsMediaContent> _mediaHolder;

  /**
   * unwraps the object from the serializer
   * @return object media
   */
  public NewsMediaContent getNewsMediaContent ()
  {
    return null == _mediaHolder ? null : _mediaHolder.get();
  }

  /**
   * encapsulates the media in a serializer 
   * @param media
   */
  public void setNewsMediaContent (NewsMediaContent media)
  {
    _mediaHolder = new ProtobufSerializer<NewsMediaContent>(media);
  }
-----------------------------------------------------------------------------
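
To check the framing without pulling in protobuf, here is a minimal round-trip sketch. It assumes a hypothetical stand-in class, FramedHolder, that writes the same layout as the wrapper above (name length, name bytes, payload length, payload bytes, or a single zero for null) but carries a raw byte[] where the real class carries a protobuf message. The class name, the roundTrip helper and the explicit UTF-8 encoding are my assumptions for illustration, not part of the original wrapper.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical stand-in for ProtobufSerializer: same framing, raw byte[] payload. */
class FramedHolder implements Externalizable {
  private String name;     // plays the role of the protobuf class name
  private byte[] payload;  // plays the role of the serialized protobuf

  /** Externalizable needs a public no-arg constructor for deserialization. */
  public FramedHolder() {
  }

  public FramedHolder(String name, byte[] payload) {
    this.name = name;
    this.payload = payload;
  }

  public String getName() {
    return name;
  }

  public byte[] getPayload() {
    return payload;
  }

  public void writeExternal(ObjectOutput out) throws IOException {
    if (payload == null) {
      out.writeInt(0);  // a lone zero marks a null object
      return;
    }
    byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
    out.writeInt(nameBytes.length);
    out.write(nameBytes);
    out.writeInt(payload.length);
    out.write(payload);
  }

  public void readExternal(ObjectInput in) throws IOException {
    int size = in.readInt();
    if (size == 0) {
      return;  // null object, leave both fields unset
    }
    byte[] nameBytes = new byte[size];
    in.readFully(nameBytes);
    name = new String(nameBytes, StandardCharsets.UTF_8);
    payload = new byte[in.readInt()];
    in.readFully(payload);
  }

  /** Pushes a holder through standard Java serialization and back. */
  static FramedHolder roundTrip(FramedHolder holder)
      throws IOException, ClassNotFoundException {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    ObjectOutputStream out = new ObjectOutputStream(bytes);
    out.writeObject(holder);
    out.close();
    ObjectInputStream in =
        new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
    try {
      return (FramedHolder) in.readObject();
    } finally {
      in.close();
    }
  }

  public static void main(String[] args) throws Exception {
    FramedHolder copy =
        roundTrip(new FramedHolder("example.Type", new byte[] {1, 2, 3}));
    System.out.println(copy.getName() + " " + Arrays.toString(copy.getPayload()));
    // An empty holder is written as a single zero and comes back empty.
    System.out.println(roundTrip(new FramedHolder()).getPayload() == null);
  }
}
```

One thing the sketch makes visible: a zero-length name would be read back as a null object, which is harmless for the real wrapper because a class name is never empty.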

Regards, Eishay



JIRA jira@apache.org wrote:
> 
> Add serialization for Protocol Buffers
> --------------------------------------
> 
>                  Key: HADOOP-3788
>                  URL: https://issues.apache.org/jira/browse/HADOOP-3788
>              Project: Hadoop Core
>           Issue Type: Wish
>           Components: examples, mapred
>             Reporter: Tom White
> 
> 
> Protocol Buffers (http://code.google.com/p/protobuf/) are a way of
> encoding data in a compact binary format. This issue is to write a
> ProtocolBuffersSerialization to support using Protocol Buffers types in
> MapReduce programs, including an example program. This should probably go
> into contrib. 
> 
> -- 
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
> 
> 
> 

-- 
View this message in context: http://www.nabble.com/-jira--Created%3A-%28HADOOP-3788%29-Add-serialization-for-Protocol-Buffers-tp18526920p20488500.html
Sent from the Hadoop core-dev mailing list archive at Nabble.com.

