Date: Sat, 23 Nov 2019 20:50:00 +0000 (UTC)
From: "Philipp Ottlinger (Jira)"
To: dev@creadur.apache.org
Reply-To: dev@creadur.apache.org
Subject: [jira] [Updated] (RAT-265) CLI: Certain wildcard file filters do not work anymore


     [ https://issues.apache.org/jira/browse/RAT-265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Philipp Ottlinger updated RAT-265:
----------------------------------
    Summary: CLI: Certain wildcard file filters do not work anymore  (was: Certain wildcard file filters do not work anymore)

> CLI: Certain wildcard file filters do not work anymore
> ------------------------------------------------------
>
>                 Key: RAT-265
>                 URL: https://issues.apache.org/jira/browse/RAT-265
>             Project: Apache Rat
>          Issue Type: Bug
>          Components: cli
>    Affects Versions: 0.13, 0.14
>            Reporter: Raphael von der Grün
>            Priority: Major
>
> Run the following command in the root of the `rat` repo:
> {noformat}
> java -jar apache-rat-0.14-20191120.132901-66.jar -e "*.txt" -d apache-rat-core/src/test/resources/violations/bad.txt{noformat}
> This will give the following output on `stderr`:
> {noformat}
> Will skip given exclusion '*.txt' due to java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
> *.txt
> ^
> {noformat}
> Furthermore, `bad.txt` will NOT be excluded from the license check.
> The error that causes this is thrown in [line 132 of `org.apache.rat.Report.java`|#L132]]. The reason is simple: any glob pattern that starts with `*` or `?` is not a valid regex. When Line 132 throws, the next two lines will also be skipped, so the pattern will not be added at all.
> Unfortunately, a solution to this problem is not so simple. In `v0.12` the `-e` option always added wildcard filters while `-E` always added regex filters. The documentation still states the same in the latest `v0.14` snapshot. Beginning with `v0.13` the code tries to add any exclude rule as three different filters. I believe this approach is inherently flawed.
> Firstly, the `new NameFileFilter(exclusion)` is redundant if we also add `new WildcardFileFilter(exclusion)`. The files matched by the `NameFileFilter` are a subset of those matched by the `WildcardFileFilter` since any magic character (i.e. `?` or `*`) in `exclusion` also matches itself when used in a `WildcardFileFilter`.
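
The regex failure described in the report can be reproduced with the JDK alone, without Rat on the classpath. The following is a minimal sketch; the class name `GlobAsRegexDemo` and the helper `isValidRegex` are illustrative and are not part of the Rat code base.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class GlobAsRegexDemo {

    /** Returns true if the given string compiles as a java.util.regex pattern. */
    static boolean isValidRegex(String pattern) {
        try {
            Pattern.compile(pattern);
            return true;
        } catch (PatternSyntaxException e) {
            // A leading '*' or '?' is a quantifier with nothing to quantify,
            // which the JDK reports as a dangling meta character.
            return false;
        }
    }

    public static void main(String[] args) {
        // Two typical shell globs, a plain file name, and a genuine regex.
        String[] candidates = {"*.txt", "?.txt", "bad.txt", ".*\\.txt"};
        for (String candidate : candidates) {
            System.out.println(candidate + " -> valid regex: " + isValidRegex(candidate));
        }
    }
}
```

Globs with a leading `*` or `?` are rejected at compile time, so any code path that compiles every exclusion as a regex before registering the other filters never gets to register them for such patterns.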
> So let's assume we only register the `WildcardFileFilter` and the `RegexFileFilter`. Even if we properly add patterns as wildcard filters that are not a valid RegEx, there are still patterns where we cannot decide what the user's intention was. Consider the pattern `bi.ini`. Should it be interpreted as a wildcard pattern and match only itself or should it be interpreted as a regex and also match `bikini` for example?
> My recommendation for a quick patch solution would be to go back to the exclusion behavior of `v0.12`.
> Beyond that, the nicest solution IMHO would be support for ignore files with the same semantics as `.gitignore` (via `-E`) and support for giving extended shell globs via `-e`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
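
The `bi.ini` ambiguity raised in the issue can be illustrated with a short sketch. It uses the JDK's `glob:` `PathMatcher` as a stand-in for a wildcard filter; this is an assumption made to keep the example free of dependencies, and it is not the Commons IO `WildcardFileFilter` that Rat uses. The class and method names are hypothetical.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.regex.Pattern;

class AmbiguousPatternDemo {

    /** Interprets the pattern as a glob (JDK stand-in for a wildcard filter). */
    static boolean matchesAsGlob(String pattern, String fileName) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
        return matcher.matches(Paths.get(fileName));
    }

    /** Interprets the pattern as a regex that must match the whole file name. */
    static boolean matchesAsRegex(String pattern, String fileName) {
        return Pattern.matches(pattern, fileName);
    }

    public static void main(String[] args) {
        String pattern = "bi.ini";
        for (String name : new String[] {"bi.ini", "bikini"}) {
            System.out.println(name
                    + " glob=" + matchesAsGlob(pattern, name)
                    + " regex=" + matchesAsRegex(pattern, name));
        }
        // bi.ini glob=true regex=true
        // bikini glob=false regex=true
    }
}
```

The same string excludes a different set of files depending on the interpretation, which is why separate options for globs and regexes, as in `v0.12`, avoid guessing the user's intent.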