hbase-user mailing list archives

From Vikram Singh Chandel <vikramsinghchan...@gmail.com>
Subject Re: How to implement sorting in HBase scans for a particular column
Date Fri, 02 May 2014 20:33:00 GMT
Hi James,
Thanks a lot for the reply. We will give it a try and let you know how we
progress.




On Tue, Apr 29, 2014 at 11:25 PM, James Taylor <jtaylor@salesforce.com> wrote:

> Hi Vikram,
> I see you sent the Phoenix mailing list a question back in December on how to
> use Phoenix 2.1.2 with Hadoop 2 for HBase 0.94. It looks like you were having
> trouble building Phoenix with the hadoop2 profile. In our 3.0/4.0 releases we
> bundle the Phoenix jars pre-built for both hadoop1 and hadoop2, so there's
> nothing you need to do.
>
> Did you have any other issues?
>
> Regarding sorting rows, Apache Phoenix handles this for you when you do an
> ORDER BY:
> CREATE TABLE names(id VARCHAR NOT NULL PRIMARY KEY,
>     name VARCHAR, age INTEGER);
> -- populate the table
> SELECT * FROM names ORDER BY age;
>
> Thanks,
> James
>
>
> On Tue, Apr 29, 2014 at 5:33 AM, Vikram Singh Chandel <
> vikramsinghchandel@gmail.com> wrote:
>
> > Yes, we looked at it, but that was back in November/December 2013, when it
> > had a lot of issues, which is why we decided not to use it. We built our
> > solution design on HBase alone, so we are looking for a better solution.
> >
> > Thanks
> >
> >
> > On Tue, Apr 29, 2014 at 5:46 PM, Ted Yu <yuzhihong@gmail.com> wrote:
> >
> > > Have you looked at Apache Phoenix?
> > >
> > > Cheers
> > >
> > > On Apr 29, 2014, at 2:13 AM, Vikram Singh Chandel <
> > > vikramsinghchandel@gmail.com> wrote:
> > >
> > > > Hi
> > > >
> > > > > We have a requirement in which we have to get the scan result sorted
> > > > > on a particular column.
> > > >
> > > > > e.g. *Get details of authors sorted by their publication count.
> > > > > Limit: 1000*
> > > > >
> > > > > *Row key is an MD5 hash of the author ID*
> > > >
> > > > > Number of records: 8.2 million rows for 3 years of data (sample
> > > > > dataset; the actual dataset covers 30 years).
> > > >
> > > > > We are currently looking into implementing a *comparator* and sorting
> > > > > the values. But for this we first have to store all 8.2 million
> > > > > records in a map/list and then sort them. This approach is neither
> > > > > memory efficient nor time efficient.
> > > >
> > > > > Is there any solution via which this kind of request can be fulfilled
> > > > > in real time?
> > > >
> > > >
> > > >
> > > > --
> > > > *Regards*
> > > >
> > > > *VIKRAM SINGH CHANDEL*
> > > >
> > > > > Please do not print this email unless it is absolutely necessary.
> > > > > Reduce. Reuse. Recycle. Save our planet.
> > >
> >
> >
> >
> >
>



-- 
*Regards*

*VIKRAM SINGH CHANDEL*

Please do not print this email unless it is absolutely necessary. Reduce.
Reuse. Recycle. Save our planet.
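
Editor's note on the original question in this thread (the 1000 authors with the highest publication count out of 8.2 million rows): if the sort has to happen on the client, the full dataset does not need to be held in a map or list. A bounded min-heap keeps only the current top N while the rows stream past. The sketch below is a minimal illustration and not code from the thread; the `Author` class, the `topN` helper, and the sample values are hypothetical, and in a real job the rows would come from an HBase scan rather than an in-memory list. It still reads every row once, so it addresses the memory concern rather than the scan time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class TopAuthors {
    /** Hypothetical row shape: an author id plus its publication count. */
    static final class Author {
        final String id;
        final long publicationCount;

        Author(String id, long publicationCount) {
            this.id = id;
            this.publicationCount = publicationCount;
        }
    }

    /** Orders authors by ascending publication count. */
    static final Comparator<Author> BY_COUNT =
            Comparator.comparingLong(a -> a.publicationCount);

    /**
     * Returns the n authors with the highest publication count, highest first.
     * Memory use is O(n) regardless of how many rows are streamed in.
     */
    static List<Author> topN(Iterable<Author> rows, int n) {
        if (n <= 0) {
            return Collections.emptyList();
        }
        // Min-heap: the head is always the weakest of the current top n.
        PriorityQueue<Author> heap = new PriorityQueue<>(n, BY_COUNT);
        for (Author row : rows) {
            if (heap.size() < n) {
                heap.add(row);
            } else if (row.publicationCount > heap.peek().publicationCount) {
                heap.poll();   // evict the current weakest entry
                heap.add(row); // and keep the stronger row instead
            }
        }
        // The heap is only partially ordered, so sort the survivors once.
        List<Author> result = new ArrayList<>(heap);
        result.sort(BY_COUNT.reversed());
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for scanned rows.
        List<Author> rows = Arrays.asList(
                new Author("a", 5), new Author("b", 12),
                new Author("c", 40), new Author("d", 7));
        for (Author a : topN(rows, 2)) {
            System.out.println(a.id + " " + a.publicationCount);
        }
    }
}
```

With N = 1000 the heap holds at most 1000 entries at any time, and each row costs at most one O(log N) heap update.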
