stdcxx-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Travis Vitek <vi...@roguewave.com>
Subject Re: svn commit: r628839 - /stdcxx/trunk/tests/self/0.braceexp.cpp
Date Wed, 20 Feb 2008 01:38:15 GMT



Martin Sebor wrote:
> 
>> 
>> sebor-2 wrote:
>>> +
>>> +    TEST ("foo {1,2} bar", "foo 1 2 bar");
>>> +
>>>
>> 
>> This isn't a brace expansion. It is a literal string, followed by a brace
>> expansion, followed by a literal string. When you run 'echo foo {1,2}
>> bar'
>> in the shell, each of the args are brace expanded individually, so the
>> only
>> thing that is brace expanded is the '{1,2}' and everything else is
>> written
>> literally. I believe this testcase is invalid.
> 
> So rw_brace_expand() only handles a single brace expression? I.e.,
> a pattern like this:
> 
>   pattern  ::= [ <preamble> ] '{' <list> | <seq-exp> '}' [ <postfix>
]
>   list     ::= string [ , <list> ] | string
>   seq-expr ::= <char> .. <char> | <number> .. <number>
> 
> If that's so, what's the definition of <preamble> and <postfix>?
> 
> FWIW, the grammar I suggested here http://tinyurl.com/2rs3he allows
> multiple brace expressions:
> 
>   string     ::= <brace-expr> | [ <chars> ]
>   brace-expr ::= <string> '{' <brace-list> '}' <string> | <string>
>   brace-list ::= <string> ',' <brace-list> | <string>
>   chars      ::= <pcs-char> <string> | <pcs-char>
>   pcs-char   ::= character in the Portable Character Set
> 
> We don't need to follow it but if we choose not to we should document
> what grammar we do follow.
> 

Neither of the definitions you provide above are complete. We implement the
second grammar with the addition of the sequence expression that is shown
above.

The function rw_brace_expand() does brace expansion on whatever string you
pass. There is no special treatment for whitespace [as required by the Bash
Reference Manual]. That is a feature of the shell. The difference is very
significant...

If you pass the string "first {1,2} last" to the shell for expansion, you
will get "first 1 2 last" as output. The shell is brace expanding each of
the three tokens seperately. If you brace expand this as one string, which
you can do by escaping the spaces, you get a different result. You should
get "first 1 last first 2 last".

I think much of the confusion we are having is because we are testing our
rw_brace_expand() function with the bash test suite. Our function only
implements one of the many features of bash, but the test suite verifies
several of the bash features are working together correctly.


Martin Sebor wrote:
> 
>> sebor-2 wrote:
>>> +    // this doesn't work in Bash 3.2
>>> +    // TEST ("{0..10,braces}", "0 1 2 3 4 5 6 7 8 9 10 braces");
>>> +
>>>
>> 
>> I don't know how anyone could expect this to work. The first
>> subexpression
>> of the brace expansion list is '0..10', which itself is not a brace
>> expansion, so it should not be expanded. It should be left as a literal.
>> This happens to be the behavior I see with Bash 3.0.
> 
> Yes, but from the comment above the test case in the braces.test file
> it looks like they plan to make it work. The comments says: # this
> doesn't work yet. I don't think it's an important use case and I have
> no problem with not implementing it (for now).
> 

I wish I could get inside the heads of the guys writing this stuff. By
introducing such a change, they are potentially breaking existing user code,
and there appears to be no net benefit for making such a change. As we've
already established, it is quite easy to stick an additional set of braces
in there.


Martin Sebor wrote:
> 
>> 
>> 
>> sebor-2 wrote:
>>> +    TEST ("{a..A}",
>>> +          "a ` _ ^ ]  [ Z Y X W V U T S R Q P O N M L K J I H G F E D C
>>> B
>>> A");
>>> +    TEST ("{A..a}",
>>> +          "A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [  ] ^ _
>>> `
>>> a");
>>> +
>>>
>> 
>> Interesting. I didn't think it would make sense to allow mixing of lower
>> and
>> uppercase characters in the sequence expression because of the characters
>> between 'Z' and 'a'. Obviously I was wrong. BTW, any idea what happened
>> to
>> ASCII 92? It is the backslash character that should appear between '['
>> and
>> ']'.
> 
> No idea. To me this is one of the most dubious of the features since
> it assumes ASCII. I wouldn't be at all upset if we didn't implement
> it (at least not until we actually need it for something ;-)
> 

Okay, I'll leave it out for now.


Martin Sebor wrote:
> 
>> 
>> 
>> sebor-2 wrote:
>>> +
>>> +    TEST ("0{1..9} {10..20}",
>>> +          "01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19
>>> 20");
>>>
>> 
>> This has the same problem as the first issue I brought up. This is
>> actually
>> two seperate brace expansions, the first is '0{1..9}' and the second is
>> '{10..20}'. This is how the shell handles them, and this is how I handle
>> them.
>> 
>> If they were treated as one brace expansion by the shell, I would expect
>> the
>> postscript '{10..20}' expanded for each prefix/body expansion, much like
>> you
>> would see if you escaped the space.
> 
> I realize those are two brace expressions but I don't understand why
> the function shouldn't be able to handle them as such. That's what
> the expected output assumes, isn't it? I expect this to come up in
> our uses of the feature so if rw_brace_expand() doesn't implement
> it we'll have to implement it somewhere else. Do you see a problem
> with implementing it in rw_brace_expand()?
> 

I could, but I think it would be better to implement something on top of
rw_brace_expand() that expands each of the whitespace seperated tokens
seperately, just like bash does. If rw_brace_expand() becomes a private
implementation function, then so be it.


Martin Sebor wrote:
> 
>> 
>> So, with all that said, I've got a few thoughts.
>> 
>> 1. I don't really like the idea of trying to emulate all behavior of the
>> shell in rw_brace_expand. If we want that, then we should have made a bug
>> entitled 'provide a complete implementation of bash'.
>> 2. I don't feel comfortable trying to maintain compatibility with version
>> 3.2 of bash. It doesn't seem to follow the documented requirements, and I
>> believe that the odd behavior may be difficult to implement. The bash 3.0
>> implementation seems much more sane and that is what I tried to emulate
>> when
>> writing this code.
>> 3. If you, er, we want to do brace expansion exactly like you see within
>> bash, then we should write another function that tokenizes a string on
>> whitespace and does brace expansion on each token. I was expecting the
>> caller of rw_brace_expand() to expect the function to do brace expansion,
>> not complete shell emulation.
> 
> Well, I thought since we were implementing a Bash feature the Bash
> test cases would be useful. If we make a conscious decision to either
> deviate from the Bash behavior or to not implement the features we
> don't expect to use that's okay with me as long as we document what
> our behavior is. One (IMO easy) way to do it is in in the test suite
> in the form of test cases, with an explanation for any differences
> with Bash. I tend to use the test cases for the test driver when I
> need to know how something works.
> 

Unfortunately the bash testcases rely heavily on additional behavior
implemented in bash [environment variable expansion, whitespace collapse,
and more. I understand the motivation, I just want to make sure we all
understand that some of the cases don't work because we're not implementing
bash. We're implementing a single feature of bash.

Travis

-- 
View this message in context: http://www.nabble.com/Re%3A-svn-commit%3A-r628839----stdcxx-trunk-tests-self-0.braceexp.cpp-tp15574776p15580502.html
Sent from the stdcxx-dev mailing list archive at Nabble.com.


Mime
View raw message