Date: Sat, 23 Nov 2019 20:33:00 +0000 (UTC)
From: Raphael von der Grün (Jira)
To: dev@creadur.apache.org
Reply-To: dev@creadur.apache.org
Subject: [jira] [Created] (RAT-265) Wildcard file filter do not work anymore

Raphael von der Grün created RAT-265:
----------------------------------------

             Summary: Wildcard file filter do not work anymore
                 Key: RAT-265
                 URL: https://issues.apache.org/jira/browse/RAT-265
             Project: Apache Rat
          Issue Type: Bug
          Components: cli
    Affects Versions: 0.13, 0.14
            Reporter: Raphael von der Grün

Run the following command in the root of the `rat` repo:

{noformat}
java -jar apache-rat-0.14-20191120.132901-66.jar -e "*.txt" -d apache-rat-core/src/test/resources/violations/bad.txt
{noformat}

This will give the following output on `stderr`:

{noformat}
Will skip given exclusion '*.txt' due to java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
*.txt
^
{noformat}

Furthermore, `bad.txt` will NOT be excluded from the license check.

The error that causes this is thrown in [line 132 of `org.apache.rat.Report.java`|https://github.com/apache/creadur-rat/blob/b271d7ff1bfa1b919fe0ed1e89d8831b30c42750/apache-rat-core/src/main/java/org/apache/rat/Report.java#L132]. The reason is simple: any glob pattern that starts with `*` or `?` is not a valid regex. When line 132 throws, the next two lines are also skipped, so the pattern is not added at all.

Unfortunately, a solution to this problem is not so simple. In `v0.12` the `-e` option always added wildcard filters while `-E` always added regex filters. The documentation still states the same in the latest `v0.14` snapshot. Beginning with `v0.13` the code tries to add any exclude rule as three different filters. I believe this approach is inherently flawed.

Firstly, the `new NameFileFilter(exclusion)` is redundant if we also add `new WildcardFileFilter(exclusion)`. The files matched by the `NameFileFilter` are a subset of those matched by the `WildcardFileFilter`, since any magic character (i.e. `?` or `*`) in `exclusion` also matches itself when used in a `WildcardFileFilter`.

So let's assume we only register the `WildcardFileFilter` and the `RegexFileFilter`. Even if we properly add patterns that are not a valid regex as wildcard filters, there are still patterns where we cannot decide what the user's intention was.
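To make the failure mode concrete, here is a minimal sketch that uses only the JDK and none of Rat's own code. The class and method names (`GlobVsRegexDemo`, `isValidRegex`, `globMatches`) are made up for illustration. The JDK glob matcher is used as a stand-in for a wildcard filter, on the assumption that it behaves the same way for a simple pattern like `*.txt`; its syntax is not identical to that of `WildcardFileFilter`.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical demo class; not part of Apache Rat.
class GlobVsRegexDemo {

    // Returns true if the string compiles as a java.util.regex pattern.
    static boolean isValidRegex(String pattern) {
        try {
            Pattern.compile(pattern);
            return true;
        } catch (PatternSyntaxException e) {
            // A leading '*' or '?' is a quantifier with nothing to repeat.
            return false;
        }
    }

    // Matches a bare file name against a glob using the JDK's glob matcher
    // (assumption: a stand-in for a wildcard filter, not Rat's actual filter).
    static boolean globMatches(String glob, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Paths.get(fileName));
    }

    public static void main(String[] args) {
        System.out.println("'*.txt' valid regex:   " + isValidRegex("*.txt"));
        System.out.println("'?.txt' valid regex:   " + isValidRegex("?.txt"));
        System.out.println("'.*\\.txt' valid regex: " + isValidRegex(".*\\.txt"));
        System.out.println("glob '*.txt' matches bad.txt: " + globMatches("*.txt", "bad.txt"));
    }
}
```

The point of the sketch is that a pattern such as `*.txt` is perfectly good as a wildcard but cannot even be compiled as a regex, so code that compiles every exclusion as a regex first and gives up on failure never registers the wildcard at all.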
Consider the pattern `bi.ini`. Should it be interpreted as a wildcard pattern and match only itself, or should it be interpreted as a regex and also match `bikini`, for example?

My recommendation for a quick patch solution would be to go back to the behavior of `v0.12`.

Beyond that, the nicest solution IMHO would be support for ignore files with the same semantics as `.gitignore` (via `-E`) and support for giving extended shell globs via `-e`.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)