hbase-user mailing list archives

From lars hofhansl <lhofha...@yahoo.com>
Subject Re: Scan.addFamiliy reduces results
Date Thu, 15 Mar 2012 17:17:56 GMT
Hi haijia,

In that case HBase will still return the data for the columns in families B and C. But if you
had added only family A, then HBase would return only those "rows" for which family A has any columns.
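
To make the two cases concrete, here is a toy in-memory sketch in plain Java. It is not the HBase client API: the class SparseScanSketch and its methods demoTable and scan are made up for illustration, and it models only the "a row appears if any requested family has a cell for it" behaviour described above, nothing else (no filters, timestamps, or qualifiers).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Toy model of a sparse store. Hypothetical names; NOT the HBase client API.
class SparseScanSketch {

    // rowkey -> (family -> value). A family with no cells for a row is simply absent.
    static Map<String, Map<String, String>> demoTable() {
        Map<String, Map<String, String>> table = new TreeMap<>();
        table.put("row1", new TreeMap<>(Map.of("A", "a1", "B", "b1", "C", "c1")));
        table.put("row2", new TreeMap<>(Map.of("B", "b2", "C", "c2"))); // nothing in A
        return table;
    }

    // Returns the keys of rows that have at least one cell in the requested families.
    static List<String> scan(Map<String, Map<String, String>> table, String... families) {
        Set<String> wanted = new HashSet<>(Arrays.asList(families));
        List<String> rows = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> row : table.entrySet()) {
            boolean hasMatch = false;
            for (String family : row.getValue().keySet()) {
                if (wanted.contains(family)) {
                    hasMatch = true;
                    break;
                }
            }
            if (hasMatch) {
                rows.add(row.getKey());
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> table = demoTable();
        System.out.println(scan(table, "A", "B", "C")); // row2 is included via B and C
        System.out.println(scan(table, "A"));           // row2 has nothing in A, so it is absent
    }
}
```

In this sketch row2 shows up when A, B and C are all requested, and drops out when only A is requested.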

-- Lars
________________________________

From: Haijia Zhou <leonster@gmail.com>
To: user@hbase.apache.org; lars hofhansl <lhofhansl@yahoo.com> 
Sent: Thursday, March 15, 2012 10:12 AM
Subject: Re: Scan.addFamiliy reduces results


I have the same confusion. Say I added three column families A, B and C to the scan. Now
if a row has data for column families B and C but no data for A, will it not be returned
by the next() method?
What if the requirement is to get the row data regardless of whether there's data for a specific
column family or not?


On Thu, Mar 15, 2012 at 1:04 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:

>Hi Peter,
>for HBase you have to keep in mind that it is a sparse columnar (or KeyValue) store:
>(rowkey, columnfamily, column, TS) -> value
>
>A scan only returns those KeyValues that match the scan. So when you set families on your
>scan you'll only get those rows for which the scan found any columns.
>
>Makes sense?
>
>-- Lars
>
>
>
>________________________________
> From: Peter Wolf <opus111@gmail.com>
>To: user@hbase.apache.org
>Sent: Thursday, March 15, 2012 9:52 AM
>Subject: Re: Scan.addFamiliy reduces results
>
>
>Thanks Doug,
>
>I had read that, and I just read it again.  But I am missing something...
>
>Why does adding a family reduce the number of results?  Is there an
>implied filter of some form?  Does addFamily add some constraint on
>which rows are returned?
>
>Note that all my rows *ought* to have values in all the families.
>
>Thanks
>Peter
>
>On 3/15/12 12:39 PM, Doug Meil wrote:
>> re:  "However, I am getting different number of results, depending on
>> which families are added"
>>
>> Yes.
>>
>> I'd suggest you read this in the RefGuide.
>>
>> http://hbase.apache.org/book.html#datamodel
>>
>>
>>
>>
>>
>> On 3/15/12 12:08 PM, "Peter Wolf"<opus111@gmail.com>  wrote:
>>
>>> Hi all,
>>>
>>> I am doing a scan on a table with multiple families.  My code looks like
>>> this...
>>>
>>>          Scan scan = new Scan(calculateStartRowKey(a),
>>>                  calculateEndRowKey(b));
>>>
>>>          scan.setCaching(10000);
>>>          Filter filter = new SingleColumnValueFilter(xFamily, xColumn,
>>>                  CompareFilter.CompareOp.EQUAL, Bytes.toBytes(x));
>>>          scan.setFilter(filter);
>>>          scan
>>>                  .addFamily(xFamily)
>>>                  .addFamily(yFamily)
>>>                  .addFamily(zFamily);
>>>
>>>          ResultScanner scanner = hTable.getScanner(scan);
>>>
>>>          Iterator<Result>  it = scanner.iterator();
>>>          int resultCount = 0;
>>>          while (it.hasNext()) {
>>>                Result result = it.next();
>>>
>>>                resultCount++;
>>>          }
>>>
>>> However, I am getting different number of results, depending on which
>>> families are added.  For example these give different result counts
>>>
>>>          scan
>>>                  //.addFamily(xFamily)
>>>                  .addFamily(yFamily)
>>>                  .addFamily(zFamily);
>>> and
>>>          scan
>>>                  .addFamily(xFamily)
>>>                  .addFamily(yFamily)
>>>                  .addFamily(zFamily);
>>>
>>>
>>> There is no error message, and I don't see anything in the Scan
>>> documentation.  Does anyone know what is going on?
>>>
>>> Thanks
>>> Peter
>>>
>>>
>>>
>>
