hadoop-common-user mailing list archives

From Benyi Wang <bewang.t...@gmail.com>
Subject Is it a bug in CombineFileSplit?
Date Tue, 16 Sep 2014 22:37:42 GMT
I use Spark's SerializableWritable to wrap CombineFileSplit so I can pass
the splits around, but I ran into serialization issues. While researching why
my code fails, I found what might be a bug in CombineFileSplit:

CombineFileSplit neither serializes locations in write(DataOutput out) nor
deserializes them in readFields(DataInput in).

When I create a split in CombineFileInputFormat, locations is an empty array
(new String[0]), but after deserialization (default constructor, then
readFields), locations is null.

This leads to a NullPointerException (NPE).
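
To illustrate the round trip described above, here is a minimal sketch that
does not depend on Hadoop. SketchSplit is a hypothetical stand-in, not the
real CombineFileSplit; its field names and wire format are assumptions made
only to show the pattern: write() and readFields() skip locations, and the
no-argument constructor leaves it null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class SplitRoundTrip {

    // Hypothetical stand-in for a Writable-style split (NOT the Hadoop class).
    static class SketchSplit {
        String[] paths;
        String[] locations; // stays null when the no-arg constructor is used

        SketchSplit() {}

        SketchSplit(String[] paths, String[] locations) {
            this.paths = paths;
            this.locations = locations;
        }

        // Writes only the paths; locations is deliberately left out.
        void write(DataOutput out) throws IOException {
            out.writeInt(paths.length);
            for (String p : paths) {
                out.writeUTF(p);
            }
        }

        // Restores only the paths; locations is never assigned here.
        void readFields(DataInput in) throws IOException {
            int n = in.readInt();
            paths = new String[n];
            for (int i = 0; i < n; i++) {
                paths[i] = in.readUTF();
            }
        }
    }

    // Serialize, then rebuild via default constructor + readFields.
    static SketchSplit roundTrip(SketchSplit split) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        split.write(new DataOutputStream(buffer));

        SketchSplit copy = new SketchSplit();
        copy.readFields(new DataInputStream(
                new ByteArrayInputStream(buffer.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws IOException {
        SketchSplit original =
                new SketchSplit(new String[] {"/data/part-0"}, new String[0]);
        SketchSplit copy = roundTrip(original);

        System.out.println("original locations length: " + original.locations.length);
        System.out.println("copy locations is null: " + (copy.locations == null));
        // Dereferencing copy.locations (e.g. copy.locations.length) would throw an NPE.
    }
}
```

In this sketch the original holds an empty array while the copy holds null,
so any caller that dereferences locations on the deserialized copy fails.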
