hbase-user mailing list archives

From stack <st...@duboce.net>
Subject Re: map reduce range of records from hbase table
Date Thu, 09 Oct 2008 05:10:52 GMT
On Wed, Oct 8, 2008 at 9:01 PM, Jaeyun Noh <metalain@gmail.com> wrote:

> Thx.
>
> BTW, it seems that the output format (subclass of
> org.apache.hadoop.mapred.OutputFormat) of MR job can only be a file. Can we
> define our own file format which hbase clients can access?


No.  You can output to anything, as long as your output class implements
OutputFormat.  To output to hbase, subclass TableReduce or see
TableOutputFormat.
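
As a rough illustration of that point (a sketch, not the Hadoop API): the two
interfaces below are hypothetical stand-ins, loosely modelled on the idea of an
output format handing out record writers, so the example runs without Hadoop
on the classpath. All class and method names here are assumptions made for
illustration. The job code only ever talks to the writer interface, so the
sink can be anything; here it is an in-memory sorted map standing in for a
table.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a record writer: the job only calls these two methods.
interface SketchRecordWriter<K, V> {
    void write(K key, V value);
    void close();
}

// Hypothetical stand-in for an output format: a factory for record writers.
interface SketchOutputFormat<K, V> {
    SketchRecordWriter<K, V> getRecordWriter(String name);
}

// An output format whose sink is an in-memory sorted map rather than a file.
class InMemoryTableOutputFormat implements SketchOutputFormat<String, String> {
    final Map<String, String> table = new TreeMap<>();
    boolean closed = false;

    @Override
    public SketchRecordWriter<String, String> getRecordWriter(String name) {
        return new SketchRecordWriter<String, String>() {
            @Override
            public void write(String row, String value) {
                table.put(row, value); // "row key -> value", no file involved
            }

            @Override
            public void close() {
                closed = true;
            }
        };
    }
}

// Stand-in for the reduce side of a job: it knows only the interfaces above.
class SketchJob {
    static <K, V> void emitAll(SketchOutputFormat<K, V> format, Map<K, V> records) {
        SketchRecordWriter<K, V> writer = format.getRecordWriter("part-00000");
        try {
            for (Map.Entry<K, V> e : records.entrySet()) {
                writer.write(e.getKey(), e.getValue());
            }
        } finally {
            writer.close(); // always release the sink, even on failure
        }
    }
}
```

Swapping InMemoryTableOutputFormat for another implementation changes where
the records land without touching SketchJob, which is the property the reply
above relies on.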


>
>
> My goal is to implement a filter-enabled table scanner that is run by
> multi-process clients using MR. I'm trying to leverage MR since the
> ClientScanner class of HTable accesses each HRegion sequentially and thus
> involves multiple round trips between servers and clients.


I'm not sure I follow.  Perhaps start simple, then see where the bottlenecks
are and optimize there.  Regarding round trips between client and server, what
do you want? A scanner that returns batches rather than a row at a time?
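
To make the batching question concrete, here is a minimal sketch, assuming a
toy server rather than HBase itself. SketchServer and BatchingScanner are
hypothetical names invented for this example; neither is an HBase class. Each
call to fetch() counts as one client/server round trip, and the scanner
buffers a whole batch per call instead of asking for one row at a time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a server holding sorted rows; counts calls as round trips.
class SketchServer {
    private final List<String> rows;
    int roundTrips = 0;

    SketchServer(List<String> rows) {
        this.rows = new ArrayList<>(rows);
    }

    // Returns up to `limit` rows starting at `offset`; one call is one round trip.
    List<String> fetch(int offset, int limit) {
        roundTrips++;
        int end = Math.min(rows.size(), offset + limit);
        if (offset >= end) {
            return new ArrayList<>();
        }
        return new ArrayList<>(rows.subList(offset, end));
    }
}

// Hypothetical client-side scanner that pulls one batch per round trip.
class BatchingScanner {
    private final SketchServer server;
    private final int batchSize;
    private List<String> buffer = new ArrayList<>();
    private int bufferPos = 0;
    private int nextOffset = 0;
    private boolean exhausted = false;

    BatchingScanner(SketchServer server, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        this.server = server;
        this.batchSize = batchSize;
    }

    // Returns the next row, or null once the scan is finished.
    String next() {
        if (bufferPos >= buffer.size()) {
            if (exhausted) {
                return null;
            }
            buffer = server.fetch(nextOffset, batchSize);
            bufferPos = 0;
            nextOffset += buffer.size();
            if (buffer.size() < batchSize) {
                exhausted = true; // a short batch means no rows remain
            }
            if (buffer.isEmpty()) {
                return null;
            }
        }
        return buffer.get(bufferPos++);
    }
}
```

In this toy model, scanning 10 rows with a batch size of 1 costs 11 calls to
fetch() (the last one comes back empty), while a batch size of 4 costs 3. The
trade-off is client memory per batch against the number of round trips.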

St.Ack





>
>
> On Wed, Oct 8, 2008 at 4:30 PM, stack <stack@duboce.net> wrote:
>
> > Jaeyun Noh wrote:
> >
> >> Hi,
> >>
> >> May I ask another question?
> >>
> >> I'm running HBase/Hadoop on a Linux server, and implementing a business
> >> application in Java, which runs on a different Windows machine.
> >> It looks like a MapReduce job runs on a server node. Can I run a
> >> MapReduce job built on the Windows client against the existing Linux
> >> server? How can we get the result produced by the MapReduce job on the
> >> server?
> >>
> >>
> >
> > You should be able to, yes.  Make sure you use the same Java on both
> > machines.  This page might help some:
> > http://wiki.apache.org/hadoop/Hbase/MapReduce.
> > St.Ack
> >
>
