hbase-user mailing list archives

From John <johnnyenglish...@gmail.com>
Subject Re: RE: Add Columnsize Filter for Scan Operation
Date Fri, 25 Oct 2013 11:45:36 GMT
I tried to build an MR job, but in my case that doesn't work directly. For
example, if I set the batch to 1000 and a row has 5000 columns, I want to
emit something only for rows with more than 2500 columns. BUT since the map
function is executed once per batch of a row, a single map() call can't tell
whether the whole row has more than 2500 columns.

any ideas?


2013/10/25 lars hofhansl <larsh@apache.org>

> We need to finish up HBASE-8369
>
>
>
> ________________________________
>  From: Dhaval Shah <prince_mithibai@yahoo.co.in>
> To: "user@hbase.apache.org" <user@hbase.apache.org>
> Sent: Thursday, October 24, 2013 4:38 PM
> Subject: Re: RE: Add Columnsize Filter for Scan Operation
>
>
> Well that depends on your use case ;)
>
> There are many nuances/code complexities to keep in mind:
> - merging results of various HFiles (each region can have more than one)
> - merging results of WAL
> - applying delete markers
> - what about data which is only in the memory of region servers and nowhere else
> - applying bloom filters for efficiency
> - what about hbase filters?
>
> At some point you would basically be rewriting an HBase region server
> inside your map reduce job, which is not ideal for maintainability.
>
> Do we ever read MySQL data files directly or issue a SQL query? Kind of
> goes back to the same argument ;)
>
> Sent from Yahoo Mail on Android
>
