hbase-user mailing list archives

From Jean-Marc Spaggiari <jean-m...@spaggiari.org>
Subject Re: Scan only talks to a single region server
Date Tue, 17 Jul 2012 15:42:06 GMT
Hi,

I'm not 100% sure, but I think getScanner returns a ResultScanner and
not the results themselves.

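What you need to do is something like:

    ResultScanner scanner = table_work_proposed.getScanner(scan);
    try
    {
        // Fetch up to linesToRead rows per call; an empty array means the scan is finished.
        Result[] results = scanner.next(linesToRead);
        while (results.length > 0)
        {
            for (Result result : results)
            {
                // Do something or nothing
                byte[] row = result.getRow();
            }
            results = scanner.next(linesToRead);
        }
    }
    finally
    {
        // Release the scanner's server-side resources.
        scanner.close();
    }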

In your example, I think you are counting the result scanners, not the rows.
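
To check the batched loop without a cluster, here is a minimal, self-contained sketch. It does not use the HBase client: BatchScanner and InMemoryScanner are hypothetical stand-ins that only mimic the next(int) contract (between zero and nbRows results, with an empty array once the scan is exhausted), and rows are plain strings rather than Result objects.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchedScanSketch {

    /** Hypothetical stand-in for the batched part of a result scanner (not an HBase type). */
    interface BatchScanner extends AutoCloseable {
        /** Returns between zero and nbRows row keys; an empty array means the scan is exhausted. */
        String[] next(int nbRows);

        @Override
        void close();
    }

    /** In-memory scanner over a fixed list of row keys, for illustration only. */
    static final class InMemoryScanner implements BatchScanner {
        private final List<String> rows;
        private int position = 0;
        boolean closed = false;

        InMemoryScanner(List<String> rows) {
            this.rows = rows;
        }

        @Override
        public String[] next(int nbRows) {
            int end = Math.min(position + nbRows, rows.size());
            String[] batch = rows.subList(position, end).toArray(new String[0]);
            position = end;
            return batch;
        }

        @Override
        public void close() {
            closed = true;
        }
    }

    /** Drains the scanner batch by batch and returns the number of rows seen. */
    static long countRows(BatchScanner scanner, int linesToRead) {
        long count = 0;
        try {
            String[] results = scanner.next(linesToRead);
            while (results.length > 0) {
                count += results.length; // count rows, not batches
                results = scanner.next(linesToRead);
            }
        } finally {
            scanner.close();
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add("row-" + i);
        }
        // 10 rows read in batches of 4, 4 and 2
        System.out.println(countRows(new InMemoryScanner(rows), 4)); // prints 10
    }
}
```

The count is incremented by the size of each batch, so the total does not depend on the value chosen for linesToRead.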

JM

2012/7/17, Alex Baranau <alex.baranov.v@gmail.com>:
>> this scan is running
>> inside a map task
>
> How do you create your scan(ner)? Could you paste the code here?
>
> You know that when an HBase table is used as the source for a MapReduce job
> (via the standard configuration), each Map task consumes data from one region
> (among other things, this lets it benefit from data locality). I.e. it creates
> one Map task per region. I wonder if this could be related.
>
> Sorry for obvious check...
>
> Alex Baranau
> ------
> Sematext :: http://blog.sematext.com/ :: Hadoop - HBase - ElasticSearch -
> Solr
>
> On Tue, Jul 17, 2012 at 11:11 AM, Whitney Sorenson
> <wsorenson@hubspot.com> wrote:
>
>> I'm trying to scan across an entire table (using only a specific
>> family or family + qualifier).
>>
>> I've tried various methods but I can only get this scan to touch the
>> first region server. Afterwards, it stops processing. Issuing the same
>> scan in the shell works (returns 50,000 rows) whereas the Scan made
>> from Java only returns ~4000 rows.
>>
>> I've tried adding/removing start/stop rows, using getScanner(family,
>> column) vs getScanner(scan), and restarting the region servers which
>> host the 1st and 2nd regions.
>>
>> The debug output from the scan shows that it knows about locations for
>> each region; however, it calls close after the first region.
>>
>> In the simplest case, the code looks like:
>>
>> ResultScanner rs = table.getScanner(family, qualifier);
>> for (Result r : rs) {
>> // do something
>> }
>>
>> Any ideas or known issues? (0.90.4-cdh3u2 - this scan is running
>> inside a map task)
>>
>> I figure the next step is to walk through the client scanner code
>> locally in a java main but haven't done this yet.
>>
