kudu-user mailing list archives

From John Mora <jhnmora...@gmail.com>
Subject Re: Empy RowResultIterator with RangePartitions
Date Thu, 11 Jul 2019 20:49:53 GMT
Hi Grant,

Thanks so much, I was not aware of that behavior. I have solved my problem
by calling nextRows() in a loop, as you suggested.
I will also take a look at the Jira issue.

Thanks,
John

On Thu, Jul 11, 2019 at 13:20, Grant Henke (<ghenke@cloudera.com>)
wrote:

> I created a Jira to improve the Javadoc for the `nextRows` API here:
> https://issues.apache.org/jira/browse/KUDU-2891
>
> If you are interested in contributing, it would be a very simple
> contribution.
>
> On Thu, Jul 11, 2019 at 1:00 PM Grant Henke <ghenke@cloudera.com> wrote:
>
>> Hi John,
>>
>> If you can use the newly released Kudu 1.10.0 client, the
>> KuduScanner in the Java client is now iterable. Additionally, the
>> KuduScannerIterator automatically makes scanner keep-alive calls to
>> ensure scanners do not time out while iterating. This means you can use
>> a Java for-each loop and all the details are handled:
>>
>> for (RowResult row : scanner) {
>>     ...
>> }
>>
>> If you can't use Kudu 1.10.0, it is expected that `nextRows` may return an
>> empty batch, so you need to keep calling it until `scanner.hasMoreRows()`
>> returns false. This often looks something like:
>>
>> while (scanner.hasMoreRows()) {
>>     for (RowResult result : scanner.nextRows()) {
>>         ...
>>     }
>> }
>>
>>
>> Please keep us updated on your project and progress; it looks very
>> interesting!
>>
>> Thanks,
>> Grant
>>
>> On Thu, Jul 11, 2019 at 11:00 AM John Mora <jhnmora000@gmail.com> wrote:
>>
>>> Hi all.
>>>
>>> I am John Mora, a GSoC student working with the Apache Gora
>>> community to implement a Kudu DataStore for Gora.
>>>
>>> Currently, I am having some issues with KuduScanner. Could you please
>>> give me some ideas about what I am doing wrong?
>>>
>>> I am using the kudu-client for Java [1] and testing my code with
>>> KuduTestHarness [2].
>>>
>>> My code looks like this.
>>>
>>> List<ColumnSchema> columns = new ArrayList<>();
>>> columns.add(new ColumnSchema.ColumnSchemaBuilder("pkurl",
>>> Type.STRING).key(true).build());
>>> columns.add(new ColumnSchema.ColumnSchemaBuilder("content",
>>> Type.BINARY).nullable(true).build());
>>> columns.add(new ColumnSchema.ColumnSchemaBuilder("parsedContent",
>>> Type.STRING).nullable(true).build());
>>>
>>> List<String> keys = new ArrayList<>();
>>> keys.add("pkurl");
>>>
>>> Schema sch = new Schema(columns);
>>> CreateTableOptions cto = new CreateTableOptions();
>>> cto.setRangePartitionColumns(keys);
>>>
>>> PartialRow lowerPar1 = sch.newPartialRow();
>>> PartialRow upperPar1 = sch.newPartialRow();
>>>
>>> upperPar1.addString("pkurl", "http://bar.com/");
>>> cto.addRangePartition(lowerPar1, upperPar1);
>>>
>>> PartialRow lowerPar2 = sch.newPartialRow();
>>> PartialRow upperPar2 = sch.newPartialRow();
>>>
>>> lowerPar2.addString("pkurl", "http://bar.com/");
>>> cto.addRangePartition(lowerPar2, upperPar2);
>>>
>>>
>>> table = client.createTable(kuduMapping.getTableName(), sch, cto);
>>>
>>> // Insert some data using table.newInsert();
>>> // {pkurl:"http://foo.com/1.html", content:[...], parsedContent:[..]}
>>> // {pkurl:"http://baz.com/1.jsp&q=barbaz", content:[...], parsedContent:[..]}
>>> // {pkurl:"http://baz.com/1.jsp&q=barbaz&p=foo", content:[...], parsedContent:[..]}
>>>
>>> // Scanner
>>> KuduScanner.KuduScannerBuilder scannerBuilder =
>>> client.newScannerBuilder(table);
>>> List<String> dbFields = new ArrayList<>();
>>> dbFields.add("pkurl");
>>> dbFields.add("content");
>>> dbFields.add("parsedContent");
>>> scannerBuilder.setProjectedColumnNames(dbFields);
>>> KuduScanner build = scannerBuilder.build();
>>> RowResultIterator resultIt = build.nextRows();
>>> // Actual: RowResultIterator is empty.
>>> // Expected: RowResultIterator has 3 entries.
>>>
>>> I tested the same code with cto.addHashPartitions(keys, 2) instead of
>>> addRangePartition, and it works fine.
>>>
>>> Why do I get an empty result when using addRangePartition?
>>>
>>>
>>>
>>> Cheers,
>>> John
>>>
>>> [1] https://kudu.apache.org/docs/developing.html#_maven_artifacts
>>> [2]
>>> https://kudu.apache.org/docs/developing.html#_jvm_based_integration_testing
>>>
>>
>>
>> --
>> Grant Henke
>> Software Engineer | Cloudera
>> grant@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>>
>
>
> --
> Grant Henke
> Software Engineer | Cloudera
> grant@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke
>
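
A likely reason the first batch is empty here: both range partitions split at "http://bar.com/", and all three inserted keys sort after that value, so the first tablet presumably holds no rows. The sketch below imitates that situation with a hypothetical `BatchScanner` interface and `FakeScanner` class (stand-ins invented for illustration, not part of kudu-client) so that it runs without a cluster. Serving exactly one batch per tablet is also a simplifying assumption. It shows that a single `nextRows()` call sees nothing, while draining until `hasMoreRows()` returns false sees all three rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class DrainScannerSketch {

    /** Hypothetical stand-in for the hasMoreRows()/nextRows() contract; NOT the Kudu API. */
    public interface BatchScanner {
        boolean hasMoreRows();
        List<String> nextRows();
    }

    /** Fake scanner that serves one batch per tablet, in range-partition order. */
    static final class FakeScanner implements BatchScanner {
        private final Iterator<List<String>> tablets;

        FakeScanner(List<List<String>> rowsPerTablet) {
            this.tablets = rowsPerTablet.iterator();
        }

        @Override
        public boolean hasMoreRows() {
            return tablets.hasNext();
        }

        @Override
        public List<String> nextRows() {
            return tablets.next();
        }
    }

    /** Keeps calling nextRows() until hasMoreRows() is false; empty batches are simply skipped. */
    public static List<String> drain(BatchScanner scanner) {
        List<String> all = new ArrayList<>();
        while (scanner.hasMoreRows()) {
            all.addAll(scanner.nextRows());
        }
        return all;
    }

    /** Two tablets: keys below "http://bar.com/" (none), then keys at or above it (all three). */
    public static BatchScanner newScanner() {
        return new FakeScanner(Arrays.asList(
            Collections.<String>emptyList(),
            Arrays.asList(
                "http://baz.com/1.jsp&q=barbaz",
                "http://baz.com/1.jsp&q=barbaz&p=foo",
                "http://foo.com/1.html")));
    }

    public static void main(String[] args) {
        List<String> firstBatch = newScanner().nextRows(); // what a single call sees
        List<String> allRows = drain(newScanner());        // what the loop sees
        System.out.println("first batch: " + firstBatch.size() + " rows"); // first batch: 0 rows
        System.out.println("drained: " + allRows.size() + " rows");        // drained: 3 rows
    }
}
```

With the real client, the same loop shape applies to `KuduScanner`, as in the `while (scanner.hasMoreRows())` snippet quoted above.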
