hadoop-common-user mailing list archives

From "Bhupesh Bansal" <bban...@linkedin.com>
Subject RE: can't read the SequenceFile correctly
Date Fri, 06 Feb 2009 16:52:32 GMT
Hey Tom, 

I also got burned by this. Why does BytesWritable.getBytes() return
non-valid bytes? Or should we add a BytesWritable.getValidBytes() kind of function?
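
A minimal sketch of what such a helper could look like, written against plain arrays so it runs without Hadoop on the classpath. The class name ValidBytes and the method getValidBytes are hypothetical and are not part of the Hadoop API; the only Hadoop calls assumed are the getBytes() and getLength() methods named in this thread.

```java
import java.util.Arrays;

public class ValidBytes {
    /**
     * Hypothetical helper: returns a copy of only the valid prefix of a
     * backing buffer that may be larger than the data it currently holds.
     */
    static byte[] getValidBytes(byte[] buffer, int length) {
        if (length < 0 || length > buffer.length) {
            throw new IllegalArgumentException("length out of range: " + length);
        }
        // Copy just the first `length` bytes; anything after that is stale.
        return Arrays.copyOf(buffer, length);
    }

    public static void main(String[] args) {
        // Backing buffer with capacity 6, of which only 3 bytes are valid.
        byte[] backing = {10, 20, 30, 0, 0, 0};
        byte[] valid = getValidBytes(backing, 3);
        System.out.println("backing=" + backing.length + " valid=" + valid.length);
    }
}
```

With a BytesWritable named value, the call would presumably be getValidBytes(value.getBytes(), value.getLength()).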


Best
Bhupesh 



-----Original Message-----
From: Tom White [mailto:tom@cloudera.com]
Sent: Fri 2/6/2009 2:25 AM
To: core-user@hadoop.apache.org
Subject: Re: can't read the SequenceFile correctly
 
Hi Mark,

Not all the bytes stored in a BytesWritable object are necessarily
valid. Use BytesWritable#getLength() to determine how much of the
buffer returned by BytesWritable#getBytes() to use.
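
A rough illustration of why the backing array and the valid length can disagree when one value object is reused across records. ReusableBuffer below is a simplified, hypothetical stand-in that only grows its array and never shrinks it; it is not Hadoop's BytesWritable, whose exact growth policy may differ.

```java
public class ReusedBufferDemo {
    /** Hypothetical stand-in for a reusable value holder; not Hadoop code. */
    static class ReusableBuffer {
        private byte[] bytes = new byte[0];
        private int length = 0;

        void set(byte[] data) {
            if (data.length > bytes.length) {
                bytes = new byte[data.length]; // grow only, never shrink
            }
            System.arraycopy(data, 0, bytes, 0, data.length);
            length = data.length; // number of valid bytes for this record
        }

        byte[] getBytes() { return bytes; }   // whole backing array
        int getLength() { return length; }    // valid prefix only
    }

    public static void main(String[] args) {
        ReusableBuffer value = new ReusableBuffer();
        int[] recordSizes = {3, 8, 2};
        for (int size : recordSizes) {
            value.set(new byte[size]); // same object reused for every record
            System.out.printf("record=%d getBytes().length=%d getLength()=%d%n",
                    size, value.getBytes().length, value.getLength());
        }
    }
}
```

Once the largest record has been read, the backing array stays at that size, so its length stops changing even though getLength() still tracks each record.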

Tom

On Fri, Feb 6, 2009 at 5:41 AM, Mark Kerzner <markkerzner@gmail.com> wrote:
> Hi,
>
> I have written binary files to a SequenceFile, seemingly successfully, but
> when I read them back with the code below, after the first few reads I get the
> same number of bytes for the different files. What could be going wrong?
>
> Thank you,
> Mark
>
>            reader = new SequenceFile.Reader(fs, path, conf);
>            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
>            Writable value = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
>            long position = reader.getPosition();
>            while (reader.next(key, value)) {
>                String syncSeen = reader.syncSeen() ? "*" : "";
>                byte [] fileBytes = ((BytesWritable) value).getBytes();
>                System.out.printf("[%s%s]\t%s\t%s\n", position, syncSeen, key, fileBytes.length);
>                position = reader.getPosition(); // beginning of next record
>            }
>

