accumulo-user mailing list archives

From Eric Newton <>
Subject Re: Question about special characters in row key
Date Tue, 24 Apr 2012 12:47:46 GMT
I don't know if this is your problem, but it is *a* problem I found trying
to demonstrate a scan using the shell and special characters:


On Mon, Apr 23, 2012 at 11:28 PM, Steven Troxell wrote:

> So I've dug through how they are ingesting, and found this method:
> /**
>      * Return a scanner pointing at the specified row.
>      *
>      * @param row
>      *            The row we are searching for
>      * @return A scanner pointing at the specified row.
>      * @throws AccumuloException
>      * @throws AccumuloSecurityException
>      * @throws TableNotFoundException
>      */
>     public Scanner getRow(Text row) throws AccumuloException,
>             AccumuloSecurityException, TableNotFoundException {
>         // Create a scanner
>         Scanner scanner = connector
>                 .createScanner(tableName, userAuthorizations);
>         // Find the specified row.
>         scanner.setRange(new Range(row));
>         return scanner;
>     }
> It is generally called along the lines of scanner = getRow(new
> Text("whatever")) and then iterated upon. Is this enough context to
> confirm you may be on the right track here? To set an end key, I would
> think the last line in that method should be more like
> scanner.setRange(new Range(row), new Range(row))
> Am I correct in my thinking here?
> Regarding the shell, I tried both of your suggestions, with no success. I'm
> not sure I see where you were going with the truncation; my suspicion is
> that it's the quote, which is the first character, not the ( causing the problem.
> In any case:
>         scan -b "Journal 1   fails for lack of a closing quote, and when I
> close the quote, I again get the entire set of results.
> Scanning with \x22 leads to a usage error.
> On Mon, Apr 23, 2012 at 10:45 AM, John Vines wrote:
>> Sounds like your software isn't setting end keys. If you create a range
>> with just a start, it will go on ad infinitum until you no longer iterate.
>> This is similar to doing a scan using -b without -e.
>> As for why you can't replicate it in your normal scan, it could either be
>> the key not being what you think it is, or just a problem with the way the
>> shell handles non-alphanumeric characters. One option would be to truncate
>> your scan's start to "Journal 1 and see what you hit first. If you see
>> yourself starting way beyond your "Journal 1 (1940... then we may not be
>> handling quotes well in the shell or your key is not right. At this point,
>> try substituting \x22 for the quotation mark and scanning again.
>> If that still doesn't work, then you may want to dig through your middle
>> project's ingest process to see how it's forming the keys for you.
>> John
>> On Mon, Apr 23, 2012 at 10:20 AM, Steven Troxell wrote:
>>> Hi everyone,
>>> I'm attempting to use a beta project designed to integrate an RDF engine
>>> with Accumulo. There seems to be a bug somewhere in the code that fails to
>>> correctly query Accumulo, which results in failing to limit the results of
>>> the following SPARQL query:
>>> SELECT ?yr
>>> WHERE {
>>>   ?journal rdf:type bench:Journal .
>>>   ?journal dc:title "Journal 1 (1940)"^^xsd:string .
>>>   ?journal dcterms:issued ?yr
>>> }
>>> I get results back ranging from 1940-1966, while the HBase integration
>>> with this particular software correctly returns just 1940. It's fairly
>>> complicated to explain the entire process of how Accumulo scans are spawned
>>> from the above query, but I believe I've narrowed down a possible source of
>>> error on which I'd like further leads:
>>> I suspect the developers may not be handling the quotation marks correctly
>>> when scanning Accumulo. I say this because this is a sample row from the
>>> Accumulo shell:
>>> "Journal 1 (1940)"^^ o:
>>> http://localhost/publications/journals/Journal1/1940
>>> [ROLE1]
>>> From the shell, I have yet to figure out how to successfully scan for
>>> the row key. Just a straight scan -b "Journal 1 (1940)"^^
>>> fails with a usage error, and wrapping the
>>> row key in single quotes seems to return all results, which is what I
>>> suspect is happening in the actual software I'm using, as it explains the
>>> behavior I'm seeing.
>>> I'm guessing, but am not entirely sure, that the developers may have misused the
>>> programmatic scans as well on account of not handling the quotation marks
>>> correctly. Is this reasonable, and can anyone provide further insight?
>>> Thanks,
>>> Steve
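
The point raised in the thread, that a range with only a start key keeps returning rows until iteration stops while a range with both a start and an end key is bounded, can be sketched with a toy model. The block below does not use the Accumulo API: it stands in a java.util.TreeMap for a sorted table, and the class name RangeSketch, its method names, and the extra 1941 and 1966 row keys are all hypothetical, chosen only to mirror the row key format shown in the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

public class RangeSketch {
    // Toy stand-in for a sorted table: row key -> value.
    // The row keys are illustrative; only the 1940 key appears in the thread.
    static NavigableMap<String, String> sampleTable() {
        NavigableMap<String, String> table = new TreeMap<>();
        table.put("\"Journal 1 (1940)\"^^", "1940");
        table.put("\"Journal 1 (1941)\"^^", "1941");
        table.put("\"Journal 1 (1966)\"^^", "1966");
        return table;
    }

    // Start key only: everything from the start key to the end of the table,
    // analogous to a shell scan with -b but without -e.
    static List<String> scanFromStart(NavigableMap<String, String> table, String start) {
        return new ArrayList<>(table.tailMap(start, true).values());
    }

    // Start and end key both set to the same row: only that row comes back.
    static List<String> scanSingleRow(NavigableMap<String, String> table, String row) {
        return new ArrayList<>(table.subMap(row, true, row, true).values());
    }

    public static void main(String[] args) {
        NavigableMap<String, String> table = sampleTable();
        String row = "\"Journal 1 (1940)\"^^";
        System.out.println(scanFromStart(table, row)); // prints [1940, 1941, 1966]
        System.out.println(scanSingleRow(table, row)); // prints [1940]
    }
}
```

In this model the unbounded scan reproduces the 1940-1966 spread described in the original report, while the bounded scan returns only 1940. Whether the software in question actually builds its ranges this way is not established by the thread.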
