hadoop-common-user mailing list archives

From Michael Segel <michael_se...@hotmail.com>
Subject RE: Binary Input Files
Date Thu, 10 Mar 2011 04:20:18 GMT

No,

Sorry. I meant that we are using HBase and we have binary files with binary records that we
want to store. So we have the same problem....
So your approach of creating a custom InputFormat is the way to go.
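
A minimal sketch of the record-reading half of that approach, in plain Java so it runs without a cluster. The record layout (a 4-byte big-endian length followed by the payload) and the names `LengthPrefixedRecords`, `encode`, and `decode` are assumptions for illustration only; the thread does not describe the actual binary protocol. The loop in `decode` is roughly what a custom RecordReader would do each time it is asked for the next record.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical layout (an assumption, not from this thread):
// each record is a 4-byte big-endian length followed by that many payload bytes.
class LengthPrefixedRecords {

    // Build one binary blob from the payloads, standing in for an input file.
    static byte[] encode(List<byte[]> payloads) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            for (byte[] payload : payloads) {
                out.writeInt(payload.length); // length header
                out.write(payload);           // raw bytes, no text encoding
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Read records until no complete length header remains.
    static List<byte[]> decode(InputStream raw) {
        List<byte[]> records = new ArrayList<>();
        DataInputStream in = new DataInputStream(raw);
        try {
            while (true) {
                int length;
                try {
                    length = in.readInt();
                } catch (EOFException endOfInput) {
                    break; // fewer than 4 bytes left: treat as end of input
                }
                if (length < 0) {
                    throw new IOException("Corrupt record length: " + length);
                }
                byte[] payload = new byte[length];
                in.readFully(payload); // throws EOFException if the record is truncated
                records.add(payload);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] blob = encode(Arrays.asList(new byte[] {1, 2, 3}, new byte[] {42}));
        List<byte[]> records = decode(new ByteArrayInputStream(blob));
        System.out.println("decoded " + records.size() + " records");
    }
}
```

In a real job this logic would sit behind Hadoop's RecordReader, and the InputFormat would have to guarantee that every split begins on a record boundary. With a layout like this one, which has no sync markers, the simplest way to do that is probably to treat each file as unsplittable.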

HTH

-Mike

> Subject: Re: Binary Input Files
> From: maha@umail.ucsb.edu
> Date: Wed, 9 Mar 2011 18:21:13 -0800
> To: common-user@hadoop.apache.org
> 
> 
> So you're suggesting that using HBase would be an alternative to writing my own code?
> By the way, why don't you use binary inputs? Do you think they wouldn't have much effect
> on performance?
> 
> Thanks Mike.
> 
> On Mar 9, 2011, at 5:27 PM, Michael Segel <michael_segel@hotmail.com> wrote:
> 
> > 
> > 
> > Maha,
> > 
> > I haven't tried streaming, but ingesting binary data into HBase means doing
> > exactly what you suggest. (Write your own BinaryInputFormat and define your own record splits.)
> > 
> > HTH
> > 
> > -Mike
> > 
> >> From: maha@umail.ucsb.edu
> >> Subject: Binary Input Files
> >> Date: Wed, 9 Mar 2011 16:20:39 -0800
> >> To: common-user@hadoop.apache.org
> >> 
> >> Hello,
> >> 
> >>       I found my question in the archives: http://www.mail-archive.com/core-user@hadoop.apache.org/msg01750.html
> >> 
> >>   It asks how to use my binary files, which follow my own buffer protocol,
> >>   with the InputFormat.
> >> 
> >>  The answer suggests some base64 conversion, which I think eliminates the
> >>  benefits of using binary files.
> >> 
> >>       If I write my own InputFormat that defines splits based on my binary
> >>       protocol, and a RecordReader that also follows my binary protocol,
> >>       will that interfere with streaming, or is it doable?
> >> 
> >> Thank you,
> >> Maha