hbase-user mailing list archives

From Jean-Daniel Cryans <jdcry...@apache.org>
Subject Re: Waiting forever on scanner iterator
Date Tue, 20 Oct 2009 20:58:22 GMT
Scanner pre-fetching is always faster, so something must be wrong with
your region server. Check the logs, top, etc.
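
As a rough illustration of why caching normally helps, and why a slow store can still make the first batch feel stuck, here is a back-of-the-envelope model in plain Java. It is not HBase code: the class, the method names and the 50 ms per-row read time are all assumptions made up for the example.

```java
// Rough model of scanner caching; not HBase code. All names are hypothetical.
final class ScanCachingModel {

    // Approximate number of fetch round-trips needed to stream totalRows rows
    // when each round-trip returns up to `caching` rows.
    static long roundTrips(long totalRows, int caching) {
        return (totalRows + caching - 1) / caching; // ceiling division
    }

    // Approximate wait before the first row reaches the client, assuming the
    // server gathers a full batch before replying.
    static double firstRowDelayMs(int caching, double perRowReadMs) {
        return caching * perRowReadMs;
    }

    public static void main(String[] args) {
        System.out.println(roundTrips(25_000, 100));    // 250
        System.out.println(roundTrips(25_000, 1));      // 25000
        // Assumed 50 ms per row from a slow store: about 5 s before the first row.
        System.out.println(firstRowDelayMs(100, 50.0)); // 5000.0
    }
}
```

Under those assumptions, caching 100 rows cuts the round-trips by a factor of 100, but the first call to the iterator waits for the whole batch.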

WRT row size, it's pretty much a matter of taking the number of bytes
in each column and summing them up (plus some overhead for the keys).
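
A minimal sketch of that sum in plain Java, with no HBase classes involved. The per-cell overhead constant is an assumed placeholder, not a figure from HBase, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Rough row-size estimate; not HBase code. All names are hypothetical.
final class RowSizeEstimate {

    // Assumed placeholder for per-cell bookkeeping (timestamp, lengths, ...).
    // The real overhead depends on the HBase version.
    static final int ASSUMED_OVERHEAD_PER_CELL = 20;

    // Sums, for every column, the value bytes plus the key bytes
    // (row key and column name) plus the assumed overhead.
    static long estimateBytes(byte[] rowKey, Map<String, byte[]> columns) {
        long total = 0;
        for (Map.Entry<String, byte[]> cell : columns.entrySet()) {
            byte[] name = cell.getKey().getBytes(StandardCharsets.UTF_8);
            total += rowKey.length + name.length + cell.getValue().length
                    + ASSUMED_OVERHEAD_PER_CELL;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, byte[]> columns = new LinkedHashMap<>();
        columns.put("Files:path1", "/tmp/a".getBytes(StandardCharsets.UTF_8));
        columns.put("Files:Name", "a.txt".getBytes(StandardCharsets.UTF_8));
        byte[] rowKey = "app1".getBytes(StandardCharsets.UTF_8);
        System.out.println(estimateBytes(rowKey, columns)); // 80
    }
}
```

Multiply the result by the scanner caching value to get a feel for how much one batch holds in memory.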

For that you want filters; check the filter package in the javadoc.
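
If nothing in the filter package fits, a fallback is to do the check on the client: scan with columns that every row has (a scanner restricted to Files:path1 would presumably return only rows that have it) and keep the rows where that column is absent. The sketch below shows just that predicate in plain Java, with nested maps standing in for the scanner results; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Client-side "rows without a column" check; not HBase code.
// All names are hypothetical.
final class MissingColumnScan {

    // rows maps row key -> (column name -> value) and stands in for
    // whatever the scanner returns.
    static List<String> rowsWithout(Map<String, Map<String, byte[]>> rows,
                                    String column) {
        List<String> missing = new ArrayList<>();
        for (Map.Entry<String, Map<String, byte[]>> row : rows.entrySet()) {
            if (!row.getValue().containsKey(column)) {
                missing.add(row.getKey());
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Map<String, Map<String, byte[]>> rows = new TreeMap<>();
        Map<String, byte[]> withPath = new TreeMap<>();
        withPath.put("Files:Name", new byte[] {1});
        withPath.put("Files:path1", new byte[] {2});
        Map<String, byte[]> withoutPath = new TreeMap<>();
        withoutPath.put("Files:Name", new byte[] {3});
        rows.put("row-a", withPath);
        rows.put("row-b", withoutPath);
        System.out.println(rowsWithout(rows, "Files:path1")); // [row-b]
    }
}
```

This moves every row over the wire, so it only makes sense when the table is small or a server-side filter is not available.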

J-D

On Tue, Oct 20, 2009 at 1:52 PM, Ananth T. Sarathy
<ananth.t.sarathy@gmail.com> wrote:
> Ok, but how come
> when I run a similar call (with fewer returned rows, 1000 vs 25k in the
> previous one) it runs through the iterator very quickly?  (See Below)
>
> Also, how do I determine the row size? It's just text data, and really not
> much.
>
> Finally, is there a way to query for rows that do not have a column? (i.e., all
> rows without Files:path1)
>
>        HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
>                "GS_Applications");
>
>        String[] columns = { "Files:path1" };
>        log.info("Getting all Rows with Files");
>        Scanner s = htdmni.getScannerForAllRows(columns);
>        log.info("Got all Rows with Files");
>
>        Iterator<RowResult> iter = s.iterator();
>        out.write("Application_Full_Name,Version,Application_installer_name,"
>                + "Operating System, Application_platform ,"
>                + "Application_sub_category,md5Hash,Sha1Hash,Sha256Hash,"
>                + "filepath,fileName,modified,size,operation\n");
>        out.write("<BR>");
>        while (iter.hasNext())
>        {
>
> Ananth T Sarathy
>
>
> On Tue, Oct 20, 2009 at 4:44 PM, Jean-Daniel Cryans <jdcryans@apache.org> wrote:
>
>> If you have a very slow data source (S3), then it fetches 100 rows
>> before coming back to your client with all of them and that can take a
>> lot of time. Also make sure that 100 of your rows can fit in a region
>> server's memory. How big is each row?
>>
>> J-D
>>
>> On Tue, Oct 20, 2009 at 1:32 PM, Ananth T. Sarathy
>> <ananth.t.sarathy@gmail.com> wrote:
>> > I am running this code where
>> >
>> > getScannerForAllRows(columns) just does return table.getScanner(columns);
>> >
>> > and the table has setScannerCaching(100);
>> >
>> > But it spins forever after getting the iterator. Why would that be? How
>> > can I speed it up?
>> >
>> >        HBaseTableDataManagerImpl htdmni = new HBaseTableDataManagerImpl(
>> >                "GS_Applications");
>> >
>> >        String[] columns = { "Files:Name" };
>> >        log.info("Getting all Rows with Files");
>> >        Scanner s = htdmni.getScannerForAllRows(columns);
>> >        log.info("Got all Rows with Files");
>> >        log.info("Getting Iterator");
>> >
>> >        Iterator<RowResult> iter = s.iterator();
>> >        log.info("Got Iterator");
>> >
>> >        while (iter.hasNext())
>> >        {
>> >            log.info("Getting next Row");
>> >            RowResult rr = iter.next();
>> >
>> >
>> > Ananth T Sarathy
>> >
>>
>
