ant-user mailing list archives

From George Bills <gbi...@funnelback.com>
Subject Re: containsregex and concat
Date Mon, 27 Nov 2006 05:41:07 GMT
Hrm, it probably isn't right, since advanced regexes are still black
magic to me. The "." was supposed to match any character, including a
newline (with the s flag); the "*" to match zero or more of them; and
the "?" to make the match lazy, i.e. as short as possible (so that I
don't pull in <table>...</table><table>...</table> in one match).

I just tried [^<], but it doesn't seem to work - I think because of such
things as "<table><tr>...</tr></table>": a class that excludes "<" can't
match past the opening bracket of <tr>, so it conflicts. I tried
[.&lt;&gt]*? to make sure that the "regex.body" part was matching the
brackets, but that didn't work either.

Also, <table class="summary"> was wrong - <table class="summary"(.*?)> 
is a little better since the tables can have more than the class 
attribute (in fact, all of them do). But after changing that I'm 
matching the entire document - <html> through to </html>. That might 
just be because I'm using filetokenizer - if I make one match within 
filetokenizer, do I end up getting the entire document? If so, how do I 
get only the matching text?

Regex is now: <table class="summary".*?>.*?</table>
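
One thing that may be worth trying, as an untested sketch: going by the
Ant manual, containsregex returns the original token when no "replace"
attribute is given, and with filetokenizer the token is the entire file,
which would explain getting <html> through to </html>. Wrapping the
table in a capture group, letting ".*" swallow the text on either side,
and setting replace="\1" should leave only the table. The target name
"summary-extract" is made up for illustration; ${report.path} and
${summary.file} are the properties from the original build file.

====================
<target name="summary-extract"> <!-- hypothetical name, illustration only -->
   <concat>
       <header>HEADER</header>
       <fileset dir="${report.path}"
                includes="*.html"
                excludes="${summary.file}" />
       <filterchain>
           <tokenfilter>
               <!-- filetokenizer hands each whole file over as one token -->
               <filetokenizer />
               <!-- The lazy leading .*? and greedy trailing .* make the
                    match span the whole file, so the substitution keeps
                    only group 1, the first summary table. Files with no
                    match are dropped as before. -->
               <containsregex flags="is"
                              pattern=".*?(&lt;table class=&quot;summary&quot;.*?&gt;.*?&lt;/table&gt;).*"
                              replace="\1" />
           </tokenfilter>
       </filterchain>
       <footer>FOOTER</footer>
   </concat>
</target>
====================

Since the one substitution covers the whole file, this would keep only
the first summary table per report; files with several summary tables,
or tables nested inside the summary table, would need something else.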

Thanks for the help, I appreciate it.

Dave Brosius wrote:
> .*?
>
> doesn't seem right to me.
>
> what's that supposed to do?
>
> probably something like [^<]*
>
>
>
> ----- Original Message -----
> From: "George Bills" <gbills@funnelback.com>
> To: <user@ant.apache.org>
> Sent: Sunday, November 26, 2006 11:47 PM
> Subject: containsregex and concat
>
>
>> I've been trying to use a regular expression and the concat task to 
>> pull summary tables (<table class="summary">...</table>) out of a set
>> of test reports. The reports are all HTML files sitting in 
>> ${report.path}. The task works fine up until I start trying to select 
>> output from it with <containsregex>. Is there something wrong with my 
>> regular expression? Is there an easier way to do this? Any help would 
>> be appreciated.
>>
>> The code is:
>> ====================
>> <target name="summary"> <!-- make a report summary -->
>>    <property name="summary.start" value="&lt;table class=&quot;summary&quot;&gt;" />
>>    <property name="summary.body"  value=".*?" /> <!-- enable "s" for newline matches -->
>>    <property name="summary.end"   value="&lt;/table&gt;" />
>>    <property name="summary.regex" value="${summary.start}${summary.body}${summary.end}" />
>>    <echo>${summary.regex}</echo>
>>    <concat>
>>        <header>HEADER</header>
>>        <fileset dir="${report.path}"
>>                 includes="*.html"
>>                 excludes="${summary.file}" />
>>        <filterchain>
>>            <tokenfilter>
>>                <filetokenizer />
>>                <containsregex flags="is"
>>                               pattern="${summary.regex}" />
>>            </tokenfilter>
>>        </filterchain>
>>        <footer>FOOTER</footer>
>>    </concat>
>> </target>
>> ====================
>>
>> The regular expression echoes as:
>> ====================
>> <table class="summary">.*?</table>
>> ====================
>>
>> I've done some testing of the expression at 
>> http://www.fileformat.info/tool/regex.htm, and it seems to work there.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: user-unsubscribe@ant.apache.org
>> For additional commands, e-mail: user-help@ant.apache.org
>>

