accumulo-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Christopher Tubbs (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-1730) ColumnVisibility parse tree nodes do not have correct location offsets for AND and OR nodes
Date Thu, 10 Oct 2013 21:09:43 GMT


Christopher Tubbs commented on ACCUMULO-1730:

[~ctubbsii] wrote on [GitHub|]:
{quote}Out of curiosity, why is this needed? I'm really hesitant to backport this change,
since it's not really a bug, and because of the risks.{quote}

[~jstoneham] wrote on [GitHub|]:
{quote}I was writing a MapReduce job to re-normalize the visibility strings in our database
because the format we want to use has changed. This is easy enough on most visibility strings
- parse them the old way, then re-write them out the new way. The issue I had was dealing
with complex visibility strings with top-level ORs, such as (A&B)|(C&(D|E)). I wanted
to use the parse tree to find the substrings of the top-level visibilities, since we can't
safely just split on the | character. But the node offsets were wrong.

I understand the risk of the backport, and making a call on whether or not to add it the 1.4
baseline based on that. I'm still not sure why it doesn't "count as a bug", though.{quote}

(Since this is our issue tracker, I moved the conversation here.)

I see. That's an interesting use case. I'd be reluctant to recommend relying on this code
for that case. Originally, this code was written as a state machine. It was transformed to
look more like a parse tree, to make it easier to fix some bugs at the time, but that cost
us some efficiency. We addressed that loss by caching the results of visibility expression
parsing in the visibility filter in the iterator stack with an LRUMap, but we've always considered
this code to be "internal", and subject to change back to a much more efficient state machine
or something else, as needed.

The reason why it's technically an improvement rather than a bug, is because the behavior
you're expecting is based on assumptions about the semantics of an inner class intended for
internal use... assumptions that were incorrect. Until you (effectively) requested those narrower
semantics by filing this JIRA issue, that class existed solely to evaluate visibility expressions,
not to provide a parse tree for users. However, it can do both, and that's why [~ecn] added
it as an improvement in 1.6.

If you think this is a "must have", I can backport it to 1.4.5 and 1.5.1, but I'm unwilling
to accept the maintenance costs/risks for doing so if it's a "might be nice". (Perhaps another
commiter would be willing, though.) I did check your updated pull request, but I couldn't
get it to merge cleanly. There might have been some other updates in the 1.4.5-SNAPSHOT and
1.5.1-SNAPSHOT branches that made the merge problematic.

> ColumnVisibility parse tree nodes do not have correct location offsets for AND and OR
> -------------------------------------------------------------------------------------------
>                 Key: ACCUMULO-1730
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 1.4.0, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.5.0
>            Reporter: John Stoneham
>            Assignee: Eric Newton
>            Priority: Trivial
>             Fix For: 1.4.5, 1.5.1, 1.6.0
> Trying to do some transformations on visibility strings and running into issues working
with the parse tree:
> Clojure 1.5.1
> user=> (import [ ColumnVisibility])
> user=> (def vis (ColumnVisibility. "(W)|(U|V)"))
> #'user/vis
> user=> (.getTermStart (first (.getChildren (.getParseTree vis))))
> 1
> user=> (.getTermEnd (first (.getChildren (.getParseTree vis))))
> 2
> user=> (.getTermStart (second (.getChildren (.getParseTree vis))))
> 0
> user=> (.getTermEnd (second (.getChildren (.getParseTree vis))))
> 8
> Shouldn't those last two be 5 and 8?

This message was sent by Atlassian JIRA

View raw message