hbase-user mailing list archives

From Demai Ni <nid...@gmail.com>
Subject Re: Scan output to file on each regserver node?
Date Thu, 21 Aug 2014 16:13:02 GMT

Thanks for your kind suggestions. 

I am not sure of the exact use case yet; I am just experimenting. The current idea is to
join with data from an MPP database by having an MPP program run on each HBase node,
so that instead of collecting all the data in one place, the join can occur at the region
server level. Actually, a join may not be a good example here. The idea is to access data at
the region server level but still be able to leverage HBase filters.
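
To make the idea a bit more concrete, here is a toy sketch in plain Python. It makes no HBase or MPP calls, and every name in it (regions, mpp_partitions, value_filter, local_join) is made up for illustration. It only shows the shape of what I have in mind: each server filters its own rows first, then joins them against the MPP data that lives on the same node.

```python
from collections import defaultdict

# Hypothetical stand-ins; none of these names come from HBase or an MPP product.
# Each region is (hosting_server, rows), where rows maps row key -> value.
regions = [
    ("rs1", {"a1": 10, "a2": 25}),
    ("rs2", {"b1": 5, "b2": 40}),
    ("rs1", {"c1": 30}),
]

# MPP-side rows, assumed to be partitioned by the same server names.
mpp_partitions = {
    "rs1": {"a2": "x", "c1": "y"},
    "rs2": {"b1": "z", "b2": "w"},
}


def value_filter(value, minimum=20):
    """Stand-in for a server-side filter: keep values >= minimum."""
    return value >= minimum


def local_join(regions, mpp_partitions, predicate):
    """Join each server's filtered rows with that server's MPP partition only."""
    per_server = defaultdict(dict)
    for server, rows in regions:
        local_mpp = mpp_partitions.get(server, {})
        for key, value in rows.items():
            if not predicate(value):
                continue  # dropped by the filter before any join work
            if key in local_mpp:
                per_server[server][key] = (value, local_mpp[key])
    return dict(per_server)


result = local_join(regions, mpp_partitions, value_filter)
print(result)
```

In this sketch nothing crosses from one server's data to another's, which is the property I am after; how to get the same behavior from real region servers is the open question.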

Demai on the run

On Aug 19, 2014, at 7:39 PM, Nick Dimiduk <ndimiduk@gmail.com> wrote:

> A coprocessor is certainly possible. You haven't shared your motivation,
> only a specific implementation, so I cannot assist further.
> On Tue, Aug 19, 2014 at 6:28 PM, Demai Ni <nidmgg@gmail.com> wrote:
>> Nick,
>> Thanks for the quick response. I will definitely look into Hadoop
>> Streaming.
>> What do you think about AggregationClient? It is carried out at the
>> region/region server level; maybe instead of doing a count/min/avg, a method
>> could be used to write the data out to the local file system?
>> Demai on the run
>> On Aug 19, 2014, at 5:04 PM, Nick Dimiduk <ndimiduk@gmail.com> wrote:
>>> This sounds an awful lot like a map-only MR job... With Hadoop Streaming,
>>> you should be able to achieve your goal of piping to an arbitrary process.
>>> On Tue, Aug 19, 2014 at 4:26 PM, Demai Ni <nidmgg@gmail.com> wrote:
>>>> Dear experts,
>>>> I understand that I can do a simple command like:
>>>> echo "scan 'table1'" | hbase shell > myoutput
>>>> The scenario I am thinking of is to:
>>>> 1) output to the local file system (like Linux) instead of HDFS
>>>> 2) have each region server output only its own data to its node's file system
>>>> To elaborate on 2) a bit: basically, this would be like exporting HBase data
>>>> to the local file system without going through the network. And on each node,
>>>> one file will be created.
>>>> Is there a way to achieve it? Actually, the receiving side of 1) doesn't
>>>> have to be a file system; it can be another process that processes the data.
>>>> But let's use the file system to simplify the scenario for now.
>>>> Thanks
>>>> Demai on the run
