yetus-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Peter Vary (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YETUS-488) Checkstyle reports new error if the file still longer than expected
Date Thu, 16 Feb 2017 15:06:41 GMT

    [ https://issues.apache.org/jira/browse/YETUS-488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870107#comment-15870107
] 

Peter Vary commented on YETUS-488:
----------------------------------

Thanks guys for the input!

My impression was, that I was removing the numbers only from the error message text.

Could you please help me where I miss something?
We get the checkstyle output, remove the unchanged files from it, and after that add context
to it by adding the code line to the message:
{code:title=checkstyle_runner}
      # file:linenum:(column:)error    ====>
      # file:linenum:code(:column):error
      pushd "${BASEDIR}" >/dev/null
      while read -r logline; do
        file=$(echo "${logline}" | cut -f1 -d:)
        linenum=$(echo "${logline}" | cut -f2 -d:)
        text=$(echo "${logline}" | cut -f3- -d:)
        codeline=$(head -n "+${linenum}" "${file}" | tail -1 )
        echo "${file}:${linenum}:${codeline}:${text}" >> "${output}"
      done < <(cat "${tmp}.1")
{code}

I think this creates the following change:
{code:title=From}
./beeline/src/java/org/apache/hive/beeline/BeeLine.java:1: warning: File length is 2,372 lines
(max allowed is 2,000).
./beeline/src/java/org/apache/hive/beeline/BeeLine.java:109: warning: Comment matches to-do
format 'TODO:'.
./beeline/src/java/org/apache/hive/beeline/BeeLine.java:124:39: warning: Name 'resourceBundle'
must match pattern '^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$'.
{code}

{code:title=To}
./beeline/src/java/org/apache/hive/beeline/BeeLine.java:1:/**: warning: File length is 2,372
lines (max allowed is 2,000).
./beeline/src/java/org/apache/hive/beeline/BeeLine.java:109: * TODO:: warning: Comment matches
to-do format 'TODO:'.
./beeline/src/java/org/apache/hive/beeline/BeeLine.java:124:  private static final ResourceBundle
resourceBundle =:39: warning: Name 'resourceBundle' must match pattern '^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$'.
{code}

Ad when calculating the diff we remove the filename, and the line number from the files with
the following code:
{code:title=checkstyle_calcdiffs - First step}
  # first, strip filenames:line:
  # this keeps column: in an attempt to increase
  # accuracy in case of multiple, repeated errors
  # since the column number shouldn't change
  # if the line of code hasn't been touched
  cut -f3- -d: "${orig}" > "${tmp}.branch"
  cut -f3- -d: "${new}" > "${tmp}.patch"
{code}

This changes the files like this:
{code:title=From}
./beeline/src/java/org/apache/hive/beeline/BeeLine.java:1:/**: warning: File length is 2,372
lines (max allowed is 2,000).
./beeline/src/java/org/apache/hive/beeline/BeeLine.java:109: * TODO:: warning: Comment matches
to-do format 'TODO:'.
./beeline/src/java/org/apache/hive/beeline/BeeLine.java:124:  private static final ResourceBundle
resourceBundle =:39: warning: Name 'resourceBundle' must match pattern '^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$'.
{code}

{code:title=To}
/**: warning: File length is 2,372 lines (max allowed is 2,000).
 * TODO:: warning: Comment matches to-do format 'TODO:'.
  private static final ResourceBundle resourceBundle =:39: warning: Name 'resourceBundle'
must match pattern '^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$'.
{code}

Note, that after this change the line number is removed - only the position in the file  (line
number) will correspond to the original checkstyle error.

And then calculate the changed line numbers with the following code:
{code:title=checkstyle_calcdiffs - Second step}
  # compare the errors, generating a string of line
  # numbers. Sorry portability: GNU diff makes this too easy
  ${DIFF} --unchanged-line-format="" \
     --old-line-format="" \
     --new-line-format="%dn " \
     "${tmp}.branch" \
     "${tmp}.patch" > "${tmp}.lined"
{code}

And get the original lines using the line numbers:
{code:title=checkstyle_calcdiffs - Third step}
  # now, pull out those lines of the raw output
  # shellcheck disable=SC2013
  for j in $(cat "${tmp}.lined"); do
    # shellcheck disable=SC2086
    head -${j} "${new}" | tail -1
  done
{code}

What I have proposed is removing the numbers from the stripped files like this:
{code:title=checkstyle_calcdiffs - First step - numbers removed}
  cut -f3- -d: "${orig}" | sed -e "s/[0-9,]//g" > "${tmp}.branch"     <-- sed added
here
  cut -f3- -d: "${new}" | sed -e "s/[0-9,]//g" > "${tmp}.patch"     <-- sed added here
{code}

This changes the files like this:
{code:title=From}
./beeline/src/java/org/apache/hive/beeline/BeeLine.java:1:/**: warning: File length is 2,372
lines (max allowed is 2,000).
./beeline/src/java/org/apache/hive/beeline/BeeLine.java:109: * TODO:: warning: Comment matches
to-do format 'TODO:'.
./beeline/src/java/org/apache/hive/beeline/BeeLine.java:124:  private static final ResourceBundle
resourceBundle =:39: warning: Name 'resourceBundle' must match pattern '^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$'.
{code}

{code:title=To}
/**: warning: File length is  lines (max allowed is ).
 * TODO:: warning: Comment matches to-do format 'TODO:'.
  private static final ResourceBundle resourceBundle =:: warning: Name 'resourceBundle' must
match pattern '^[A-Z][A-Z-]*(_[A-Z-]+)*$'.
{code}

This removes the numbers from the error messages, *and* the checkstyle column number (which
I was not aware of before your comments).
I think this patch will not mess with the original (and really complicated) method of calculating
the errors.
What do you think?

Thanks again to spending time to take a look at this stuff!
Peter

> Checkstyle reports new error if the file still longer than expected
> -------------------------------------------------------------------
>
>                 Key: YETUS-488
>                 URL: https://issues.apache.org/jira/browse/YETUS-488
>             Project: Yetus
>          Issue Type: Improvement
>          Components: Test Patch
>    Affects Versions: 0.4.0
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Minor
>         Attachments: YETUS-488.00.patch
>
>
> Given a java source file which is originally longer than the checkstyle defined value
then we get the following warning every time the length changes and we run the checkstyle
plugin:
> {code}
> ./beeline/src/java/org/apache/hive/beeline/BeeLine.java:1: warning: File length is 2,373
lines (max allowed is 2,000).
> {code}
> The problem is, that the checkstyle output is changed:
> {code:title=From}
> ./beeline/src/java/org/apache/hive/beeline/BeeLine.java:1: warning: File length is 2,373
lines (max allowed is 2,000).
> {code}
> {code:title=To}
> ./beeline/src/java/org/apache/hive/beeline/BeeLine.java:1: warning: File length is 2,365
lines (max allowed is 2,000).
> {code}
> Since the {{checkstyle_calcdiffs}} does not find the new line in the original output,
it marks it as a new error, and gives -1 to the result of the checkstyle plugin.
> This is true for every error message where there is a number in the text. Not exhaustive
examples are below:
> - Line length
> - Row length
> - Function length
> - Number of attributes
> - Indentation
> I think we should be reluctant to give -1 without a reason so it would be good to remove
these errors.
> I have yet to find the ideal solution for the issue:
> - An easy patch could be to remove the numbers from files before the diff (note: it does
not effect the final diff-checkstyle-<MODULE>.txt output messages). This would mean
that the warning will give -1 only the first time when the error text is occurred, and will
not give -1 if the numbers are changed, like
> -- File become shorter, but still longer than expected
> -- File become even longer
> -- Indentation changed but still problematic
> - A more sophisticated patch could do this only for specific errors, or even check the
value and give error when the line length grow.
> To be honest, I think the second solution would complicate the code too much, and make
it dependent on the specific checkstyle messages, so I do not like it.
> I am even hesitant about the first one, because the change I introducing with it might
have unwanted consequences.
> Do anyone has better ideas? I would be happy to implement them, if someone points me
to the right direction :D
> Thanks,
> Peter



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Mime
View raw message