hbase-user mailing list archives

From "Steinmaurer Thomas" <Thomas.Steinmau...@scch.at>
Subject RE: Writing MR-Job: Something like OracleReducer, JDBCReducer ...
Date Mon, 19 Sep 2011 05:35:09 GMT
Your assumption is correct. As final output, we want to have aggregated
data in an Oracle database. We are using both the map and the reduce phase.
The row key looks like this:

<datasource-id>-<device-id>-<timestamp>

We basically want daily aggregated data, that is, measured values
aggregated per datasource-id/device-id. We already have a proof-of-concept
implementation that does exactly that, but as final output the aggregated
data is written into an HBase table again, using a reducer that extends
TableReducer.
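
A minimal sketch of the daily grouping described above, in plain Java
without the Hadoop or HBase APIs. It assumes the <timestamp> part of the
row key is epoch milliseconds and that days are cut in UTC; neither is
stated in this thread. The class and method names are made up for
illustration.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of HBase or Hadoop.
final class DailyAggregation {

    // Maps "<datasource-id>-<device-id>-<timestamp>" to
    // "<datasource-id>-<device-id>-<yyyy-MM-dd>".
    // ASSUMPTION: the text after the last '-' is epoch milliseconds.
    static String dailyKey(String rowKey) {
        int lastDash = rowKey.lastIndexOf('-');
        if (lastDash <= 0 || lastDash == rowKey.length() - 1) {
            throw new IllegalArgumentException("Unexpected row key: " + rowKey);
        }
        String prefix = rowKey.substring(0, lastDash);
        long millis = Long.parseLong(rowKey.substring(lastDash + 1));
        // ASSUMPTION: day boundaries are taken in UTC.
        LocalDate day = Instant.ofEpochMilli(millis).atZone(ZoneOffset.UTC).toLocalDate();
        return prefix + "-" + day;
    }

    // Sums the measured values per daily key, which is roughly what the
    // reduce phase would do for a "sum" aggregate.
    static Map<String, Double> sumPerDay(Map<String, Double> valuesByRowKey) {
        Map<String, Double> totals = new TreeMap<>();
        for (Map.Entry<String, Double> e : valuesByRowKey.entrySet()) {
            totals.merge(dailyKey(e.getKey()), e.getValue(), Double::sum);
        }
        return totals;
    }
}
```

In a real job the mapper would emit the daily key and the reducer would
receive all values for that key; the map-based method here only stands in
for that grouping.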

See also my thread "MR-Job: Exception in DBOutputFormat".
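
For the Oracle side, the plain-JDBC route could look roughly like the
sketch below: build a parameterized INSERT once and send the aggregated
rows as one batch. The table name DAILY_AGG and the columns AGG_KEY and
TOTAL are hypothetical, as is the class itself; only the java.sql calls
are standard. In a real job the connection would probably be opened once
per reduce task rather than per record.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Map;

// Hypothetical helper, not part of Hadoop or HBase.
final class JdbcAggregateWriter {

    // Builds "INSERT INTO <table> (<c1>, <c2>, ...) VALUES (?, ?, ...)".
    static String insertSql(String table, String... columns) {
        StringBuilder sql = new StringBuilder("INSERT INTO ").append(table).append(" (");
        StringBuilder marks = new StringBuilder();
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) {
                sql.append(", ");
                marks.append(", ");
            }
            sql.append(columns[i]);
            marks.append('?');
        }
        return sql.append(") VALUES (").append(marks.toString()).append(')').toString();
    }

    // Writes one row per aggregate using a single JDBC batch.
    // ASSUMPTION: table DAILY_AGG(AGG_KEY, TOTAL) exists in the target schema.
    static void writeAll(Connection conn, Map<String, Double> totals) throws SQLException {
        String sql = insertSql("DAILY_AGG", "AGG_KEY", "TOTAL");
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (Map.Entry<String, Double> e : totals.entrySet()) {
                ps.setString(1, e.getKey());
                ps.setDouble(2, e.getValue());
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}
```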

Thanks again!

Thomas

-----Original Message-----
From: Sonal Goyal [mailto:sonalgoyal4@gmail.com] 
Sent: Freitag, 16. September 2011 18:07
To: user@hbase.apache.org
Subject: Re: Writing MR-Job: Something like OracleReducer, JDBCReducer
...

Hi Thomas,

I just assumed that you are already using reducers. From what I
understood, please correct me if I am mistaken,

You have data in HBase and you are running a MR job to aggregate the
data.
You have the map as well as the reduce phase and, as part of the final
output, you want to send the data to Oracle. Is that correct?

Is there any information you would like to share regarding your flow and
data? How big is your data, how often do you need to aggregate, what do
your mappers emit? Are you already using reducers for aggregations?

Best Regards,
Sonal
Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
Nube Technologies <http://www.nubetech.co>

<http://in.linkedin.com/in/sonalgoyal>





On Fri, Sep 16, 2011 at 2:35 PM, Michel Segel
<michael_segel@hotmail.com> wrote:

> I think you need to get a little bit more information.
> Reducers are expensive.
> When Thomas says that he is aggregating data, what exactly does he
> mean?
> When dealing with HBase, you really don't want to use a reducer.
>
> You may want to run two map jobs and it could be that just dumping the
> output via JDBC makes the most sense.
>
> We are starting to see a lot of questions where the OP isn't providing
> enough information so that the recommendation could be wrong...
>
>
> Sent from a remote device. Please excuse any typos...
>
> Mike Segel
>
> On Sep 16, 2011, at 2:22 AM, Sonal Goyal <sonalgoyal4@gmail.com> wrote:
>
> > There is a DBOutputFormat class in the
> > org.apache.hadoop.mapreduce.lib.db package; you could use that. Or
> > you could write to HDFS and then use something like HIHO[1] to export
> > to the db. I have been working extensively in this area; you can
> > write to me directly if you need any help.
> >
> > 1. https://github.com/sonalgoyal/hiho
> >
> > Best Regards,
> > Sonal
> > Crux: Reporting for HBase <https://github.com/sonalgoyal/crux>
> > Nube Technologies <http://www.nubetech.co>
> >
> > <http://in.linkedin.com/in/sonalgoyal>
> >
> >
> >
> >
> >
> > On Fri, Sep 16, 2011 at 10:55 AM, Steinmaurer Thomas < 
> > Thomas.Steinmaurer@scch.at> wrote:
> >
> >> Hello,
> >>
> >>
> >>
> >> I am writing an MR job to process HBase data and store aggregated
> >> data in Oracle. How would you do that in an MR job?
> >>
> >>
> >>
> >> Currently, for test purposes, we write the result into an HBase table
> >> again by using a TableReducer. Is there something like an
> >> OracleReducer, RelationalReducer, JDBCReducer or whatever? Or
> >> should one simply use plain JDBC code in the reduce step?
> >>
> >>
> >>
> >> Thanks!
> >>
> >>
> >>
> >> Thomas
> >>
> >>
> >>
> >>
>
