hadoop-hdfs-user mailing list archives

From "Naganarasimha G R (Naga)" <garlanaganarasi...@huawei.com>
Subject RE: Is it a bug in CombineFileSplit?
Date Wed, 17 Sep 2014 01:35:18 GMT
Hi Wang,
  This seems to be a defect. Are you planning to raise one? If not, I can raise it and fix it.



Huawei Technologies Co., Ltd.
Mobile:  +91 9980040283
Email: naganarasimhagr@huawei.com
Bantian, Longgang District,Shenzhen 518129, P.R.China

From: Benyi Wang [bewang.tech@gmail.com]
Sent: Wednesday, September 17, 2014 06:37
To: user@hadoop.apache.org; common-dev@hadoop.apache.org
Subject: Is it a bug in CombineFileSplit?

I use Spark's SerializableWritable to wrap CombineFileSplit so that I can pass the splits around,
but I ran into serialization issues. While investigating why my code fails, I found what might
be a bug in CombineFileSplit:

CombineFileSplit does not serialize locations in write(DataOutput out), and does not deserialize
them in readFields(DataInput in).

When I create a split in CombineFileInputFormat, locations is an empty array (new String[0]), but
after deserialization (default constructor, then readFields), locations is null.

This leads to an NPE.
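
To illustrate the behaviour described above, here is a minimal sketch that uses only java.io and does not depend on Hadoop. SketchSplit, FixedSketchSplit, and roundTrip are hypothetical names invented for this example; they are not the Hadoop source. SketchSplit mimics the reported pattern (write/readFields skip locations), and FixedSketchSplit shows one possible remedy, assuming that round-tripping the array is acceptable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a Writable-style split; NOT the Hadoop class.
class SketchSplit {
    protected String[] paths;
    protected String[] locations;

    SketchSplit() {}  // default constructor, as used before readFields

    SketchSplit(String[] paths, String[] locations) {
        this.paths = paths;
        this.locations = locations;
    }

    String[] getLocations() { return locations; }

    // Mirrors the reported behaviour: locations is never written.
    void write(DataOutput out) throws IOException {
        out.writeInt(paths.length);
        for (String p : paths) out.writeUTF(p);
    }

    // Mirrors the reported behaviour: locations is never restored.
    void readFields(DataInput in) throws IOException {
        paths = new String[in.readInt()];
        for (int i = 0; i < paths.length; i++) paths[i] = in.readUTF();
    }
}

// One possible remedy (an assumption, not the actual fix): round-trip locations too.
class FixedSketchSplit extends SketchSplit {
    FixedSketchSplit() {}
    FixedSketchSplit(String[] paths, String[] locations) { super(paths, locations); }

    @Override
    void write(DataOutput out) throws IOException {
        super.write(out);
        out.writeInt(locations.length);
        for (String l : locations) out.writeUTF(l);
    }

    @Override
    void readFields(DataInput in) throws IOException {
        super.readFields(in);
        locations = new String[in.readInt()];
        for (int i = 0; i < locations.length; i++) locations[i] = in.readUTF();
    }
}

class CombineSplitSketch {
    // Serializes src into a byte buffer, then fills dst from it via readFields.
    static <T extends SketchSplit> T roundTrip(SketchSplit src, T dst) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            src.write(new DataOutputStream(buf));
            dst.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
            return dst;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] paths = {"/data/part-0"};
        SketchSplit broken = roundTrip(new SketchSplit(paths, new String[0]), new SketchSplit());
        System.out.println(broken.getLocations() == null);   // prints "true"

        FixedSketchSplit fixed =
            roundTrip(new FixedSketchSplit(paths, new String[0]), new FixedSketchSplit());
        System.out.println(fixed.getLocations().length);     // prints "0"
    }
}
```

In the first case, any caller that does something like getLocations().length on the deserialized object would hit the NPE. An alternative remedy under the same assumptions would be to leave the wire format unchanged and have the default constructor or readFields reset locations to an empty array.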
