accumulo-user mailing list archives

From Keith Turner <ke...@deenlo.com>
Subject Re: egrep usage - 1.3.4
Date Mon, 06 Aug 2012 19:41:10 GMT
On Mon, Aug 6, 2012 at 3:13 PM, John Vines <vines@apache.org> wrote:
> Yeah, that was the case I thought of as well. However, I think it would be
> worthwhile to support the improved behavior. Unfortunately, I'm stuck on
> trying to think of a better command for it, since egrep itself is the
> appropriate command and we just have a bit of a misnomer.
>
> I hate this convention, but one option is to introduce egrep2 which is the
> improved behavior, and then put in warning messages informing users that the
> egrep command will be superseded by the egrep2 functionality in the
> following release. Or we could just stick with the two egrep commands in
> perpetuity.

I was mainly thinking of the iterator when thinking of preserving
behavior, because it's used by code.  An option could be added to the
RegExFilter to support find().

If you assume that only people use the egrep command in the shell,
then it may be OK to change its behavior, because a person could adapt.
However, this is probably a poor assumption.  I try to think of the
shell as part of the public API.  Scripts could call the egrep
command, and scripts would not automatically adapt to a change in
behavior.  It would also be hard to use the same script that
calls egrep against both Accumulo 1.4 and 1.5.

Instead of a new command, we could add an option to the egrep command,
like -f.  When the -f option is present it will set the option on the
RegExFilter to use find().
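
To make the difference concrete, here is a rough sketch in plain Java.
It is not the actual RegExFilter or shell code; the class name
RegexSelect and the useFind flag are made up for illustration.  It only
shows how a single boolean option could switch between the current
whole-string behavior (Matcher#matches) and the proposed substring
behavior (Matcher#find).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: RegexSelect and useFind are hypothetical names,
// not part of Accumulo's RegExFilter or the shell's egrep command.
final class RegexSelect {

    // useFind == false: the whole value must match the regex (current behavior).
    // useFind == true:  any substring of the value may match (proposed option).
    static boolean accepts(String regex, String value, boolean useFind) {
        Matcher m = Pattern.compile(regex).matcher(value);
        return useFind ? m.find() : m.matches();
    }

    public static void main(String[] args) {
        // The ".*foo" example from earlier in the thread.
        System.out.println(accepts(".*foo", "cfooa", false)); // false
        System.out.println(accepts(".*foo", "cfooa", true));  // true

        // With find(), the wildcards on both ends are no longer needed.
        System.out.println(accepts("foo", "cfooa", false));   // false
        System.out.println(accepts("foo", "cfooa", true));    // true
    }
}
```

Keeping the default at matches() means existing scripts and iterator
configurations see no change; only callers that opt in get find().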

>
> John
>
>
> On Mon, Aug 6, 2012 at 3:01 PM, Michael Flester <flester@gmail.com> wrote:
>>
>>
>>
>> You are right. I had inadvertently constrained my thinking
>> to patterns of the form match(".*{x}.*") == find(".*{x}.*") == find("{x}")
>> but that isn't everything someone
>> might be using it for.
>>
>>
>>
>> On Mon, Aug 6, 2012 at 9:26 AM, Keith Turner <keith@deenlo.com> wrote:
>>>
>>> I was thinking find() will select everything that match() does and
>>> more.  So it may return data that someone used to the current behavior
>>> is not expecting, which could break existing code that uses it.   For
>>> example ".*foo" would select "cfooa" with find() but not with match().
>>>
>>> On Sun, Aug 5, 2012 at 7:16 PM, Michael Flester <flester@gmail.com>
>>> wrote:
>>> > Keith --
>>> >
>>> > Switching from match to find should be no change for anyone that is
>>> > currently using it.
>>> > All patterns that "match" will equally "find". But new users would be
>>> > able
>>> > to take advantage
>>> > of not adding the wildcards on both ends.
>>> >
>>> > Mike
>>> >
>>> >
>>> > On Tue, Jul 31, 2012 at 11:21 AM, Keith Turner <keith@deenlo.com>
>>> > wrote:
>>> >>
>>> >> On Sun, Jul 29, 2012 at 9:47 PM, Michael Flester <flester@gmail.com>
>>> >> wrote:
>>> >> >
>>> >> >
>>> >> > On Sat, Jul 28, 2012 at 7:57 PM, John Vines <vines@apache.org>
>>> >> > wrote:
>>> >> >>
>>> >> >> And when dealing with java, it does full matches, so adding the .*
>>> >> >> to start and end is necessary.
>>> >> >>
>>> >> >
>>> >> > Java has both Matcher#matches and Matcher#find. The latter would
>>> >> > operate
>>> >> > more
>>> >> > like the egrep(1) command without requiring the wildcards on both
>>> >> > ends.
>>> >>
>>> >> Ah, It should have used the find() call when it was first written.
>>> >> Changing it now would be tricky because people who expect the current
>>> >> behavior could get unexpected results.  I think we are kinda stuck
>>> >> with the current behavior.   Could possibly add an option to use
>>> >> find() instead of match().
>>> >
>>> >
>>
>>
>
