accumulo-dev mailing list archives

From "Keith Turner (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-164) Add support for wildcards/regexes in locality group setting.
Date Sat, 10 Mar 2012 00:52:57 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226654#comment-13226654 ]

Keith Turner commented on ACCUMULO-164:
---------------------------------------

John commented offline that it may not be possible to determine whether a set of patterns
matches disjoint sets of column families.  I think this may be true for regular expressions.
However, it may be easy to determine this automatically with limited wildcarding.

If only prefix wildcards were allowed, it seems like the following algorithm would check
whether they are disjoint.

{noformat}
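  // Returns true if no prefix is a prefix of another, i.e. the prefix
  // wildcards match disjoint sets.  Uses java.util.{Collections,Comparator,HashSet,Set}.
  boolean isDisjoint(Set<String> prefixes){
     // Work on a copy so the caller's set is not emptied.
     Set<String> remaining = new HashSet<String>(prefixes);
     while(remaining.size() > 1){
       String shortestPrefix = removeShortestString(remaining);
       for(String prefix : remaining){
         if(prefix.startsWith(shortestPrefix)){
           return false;
         }
       }
     }
     return true;
  }

  // Removes and returns the shortest string in the set.
  String removeShortestString(Set<String> strings){
     String shortest = Collections.min(strings, Comparator.comparingInt(String::length));
     strings.remove(shortest);
     return shortest;
  }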
{noformat}

Does this seem correct? For suffixes, startsWith() would be replaced with endsWith().  So
maybe we can handle a set made up entirely of prefix wildcards or entirely of suffix
wildcards.  Can we verify that anything else is disjoint?  I do not think so.
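
As a rough sketch of how the prefix and suffix cases could share one check, the class
below takes the literal parts of the wildcards (for example "foo" for foo*) and applies
the same shortest-first comparison.  The class and method names are hypothetical and are
not part of Accumulo; this only illustrates the idea under those assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not an Accumulo API.
final class WildcardDisjointness {

  // Literal parts of prefix wildcards, e.g. "foo" for foo*.
  static boolean prefixesDisjoint(Set<String> prefixes) {
    return disjoint(prefixes, false);
  }

  // Literal parts of suffix wildcards, e.g. "bar" for *bar.
  static boolean suffixesDisjoint(Set<String> suffixes) {
    return disjoint(suffixes, true);
  }

  private static boolean disjoint(Set<String> literals, boolean suffix) {
    // Sort by length so each literal is only compared against ones at least as long.
    List<String> byLength = new ArrayList<>(literals);
    byLength.sort(Comparator.comparingInt(String::length));
    for (int i = 0; i < byLength.size(); i++) {
      String shorter = byLength.get(i);
      for (int j = i + 1; j < byLength.size(); j++) {
        String longer = byLength.get(j);
        boolean overlaps = suffix ? longer.endsWith(shorter) : longer.startsWith(shorter);
        if (overlaps) {
          return false;
        }
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // prints true: none of these is a prefix of another
    System.out.println(prefixesDisjoint(new HashSet<>(Arrays.asList("meta", "data", "doc"))));
    // prints false: doc* also matches everything docs* matches
    System.out.println(prefixesDisjoint(new HashSet<>(Arrays.asList("doc", "docs"))));
    // prints true: neither suffix ends with the other
    System.out.println(suffixesDisjoint(new HashSet<>(Arrays.asList("_raw", "_idx"))));
  }
}
```

Note that an empty literal (a bare * wildcard) overlaps with every other entry, which the
check reports as not disjoint.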

The following wildcards could match overlapping sets.

{noformat}
  *a*
  *b*
{noformat}

And so could the following.

{noformat}
  foo*
  *bar
{noformat}

So even though the literal parts of the above wildcards are unique, they can still match overlapping
data.  For example, ab matches both *a* and *b*, and foobar matches both foo* and *bar.
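
A small sketch that confirms such overlaps by testing one column family name against
several wildcards.  It assumes that * matches any run of characters, including none, and
translates each wildcard to a java.util.regex pattern; the class and method names are
hypothetical and only for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo for illustration only; not an Accumulo API.
final class WildcardOverlapDemo {

  // Assumption: '*' matches any run of characters, including the empty string.
  static Pattern toPattern(String wildcard) {
    // Quote each literal piece and join the pieces with ".*".
    String regex = Arrays.stream(wildcard.split("\\*", -1))
        .map(Pattern::quote)
        .collect(Collectors.joining(".*"));
    return Pattern.compile(regex);
  }

  static boolean matches(String wildcard, String columnFamily) {
    return toPattern(wildcard).matcher(columnFamily).matches();
  }

  public static void main(String[] args) {
    // "ab" is matched by both *a* and *b*
    System.out.println(matches("*a*", "ab"));     // prints true
    System.out.println(matches("*b*", "ab"));     // prints true
    // "foobar" is matched by both foo* and *bar
    System.out.println(matches("foo*", "foobar")); // prints true
    System.out.println(matches("*bar", "foobar")); // prints true
  }
}
```

Finding one name that two wildcards both match is enough to show they are not disjoint;
failing to find one by trial does not prove that they are.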
 


                
> Add support for wildcards/regexes in locality group setting.
> ------------------------------------------------------------
>
>                 Key: ACCUMULO-164
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-164
>             Project: Accumulo
>          Issue Type: Improvement
>          Components: client, master, tserver
>            Reporter: John Vines
>
> We should look into adding the ability to specify locality group columns using either wildcards
> or regexes. I'm unsure of the feasibility of this, hence the lack of fix date.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
