hive-dev mailing list archives

From "Marek Sapota (Commented) (JIRA)" <>
Subject [jira] [Commented] (HIVE-1040) use sed rather than diff for masking out noise in diff-based tests
Date Fri, 02 Dec 2011 05:51:40 GMT


Marek Sapota commented on HIVE-1040:

There is a small problem with masking: some of the tests generate lines in random order, and
`-I` completely removes the ignored lines, so

masked
some valid line

and

some valid line
masked

are the same to `diff -I` but will differ when using plain `diff`.  We could do several things
to make it work:
- use `diff -I masked` (only full-line masking then). For some reason it fails, probably because
`diff` reports "some valid line" as the non-matching lines and so doesn't apply the `-I` switch.
The man page says "Ignore changes whose lines all match RE"; does anyone know if this really
means "make a diff, then remove the changes matching RE"?  I expected it to apply `-I` first and
then do the diff.
- remove a line if the whole line was masked (masking inside a line would still be possible, but
it would be hard to tell what was removed from the output file)
- if `diff -I '^masked$'` works it could be a win, but it has the same problem as above

For example, TestNegativeCliDriver create_view_failure2.q has this problem.
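
A minimal sketch of the ordering problem, assuming a made-up noise pattern (`noise <digits>`) and hypothetical file names; it is not taken from the Hive test harness. It masks both files with `sed`, shows that plain `diff` then flags the reordered `masked` line, and tries the "remove fully masked lines" option from the list. The `diff -I` call is included only for comparison: its result depends on which lines `diff` decides to align, so no outcome is assumed for it.

```shell
# Hypothetical reproduction; file names and the noise pattern are illustrative.
tmp=$(mktemp -d)

# Same content, but the noisy line lands in a different position in each file.
printf 'noise 123\nsome valid line\n' > "$tmp/expected.out"
printf 'some valid line\nnoise 456\n' > "$tmp/actual.out"

# Pre-mask with sed: each fully noisy line becomes the literal token "masked".
sed 's/^noise [0-9]*$/masked/' "$tmp/expected.out" > "$tmp/expected.masked"
sed 's/^noise [0-9]*$/masked/' "$tmp/actual.out"   > "$tmp/actual.masked"

# Plain diff on the masked files: the moved "masked" line counts as a change.
plain_status=0
diff "$tmp/expected.masked" "$tmp/actual.masked" > /dev/null || plain_status=$?

# Option from the list: drop lines that were masked in full, then compare.
sed '/^masked$/d' "$tmp/expected.masked" > "$tmp/expected.stripped"
sed '/^masked$/d' "$tmp/actual.masked"   > "$tmp/actual.stripped"
stripped_status=0
diff "$tmp/expected.stripped" "$tmp/actual.stripped" > /dev/null || stripped_status=$?

# For comparison only: diff -I on the unmasked files (outcome not assumed here).
ignore_status=0
diff -I '^noise' "$tmp/expected.out" "$tmp/actual.out" > /dev/null || ignore_status=$?

echo "plain diff on masked files:       $plain_status"
echo "diff after dropping masked lines: $stripped_status"
echo "diff -I on unmasked files:        $ignore_status"
rm -rf "$tmp"
```

Dropping fully masked lines makes the two files compare equal here, at the cost that the stored output no longer shows where a masked line used to be.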

> use sed rather than diff for masking out noise in diff-based tests
> ------------------------------------------------------------------
>                 Key: HIVE-1040
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: Testing Infrastructure
>    Affects Versions: 0.4.1
>            Reporter: John Sichi
>            Assignee: Marek Sapota
>            Priority: Minor
> The current diff -I approach has two problems:  (1) it does not allow resolution finer
> than line-level, so it's impossible to mask out pattern occurrences within a line, and (2)
> it produces unmasked files, so if you run diff on the command line to compare the result .q.out
> with the checked-in file, you see the noise.
> My suggestion is to first run sed to replace noise patterns with an unlikely-to-occur
> string like ZYZZYZVA, and then diff the pre-masked files without using any -I.
> This would require a one-time hit to update all existing .q.out files so that they would
> contain the pre-masked results.
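
The sed-then-diff proposal quoted above could look roughly like the sketch below. The noise pattern (a scratch path `/tmp/hive_<digits>`), the `mask` helper, and the file names are assumptions made for illustration; only the ZYZZYZVA token comes from the issue text. It shows masking inside a line, which `diff -I` cannot do, followed by a plain `diff` of the pre-masked files.

```shell
# Hypothetical sketch; the noise pattern and file names are illustrative only.
tmp=$(mktemp -d)

# Replace every occurrence of the (assumed) noisy scratch path with a fixed token.
mask() {
  sed 's#/tmp/hive_[0-9][0-9]*#ZYZZYZVA#g' "$1"
}

# expected/actual differ only in noise; wrong also differs in a real value.
printf 'location: /tmp/hive_1234/out\nrows: 10\n' > "$tmp/expected.q.out"
printf 'location: /tmp/hive_9876/out\nrows: 10\n' > "$tmp/actual.q.out"
printf 'location: /tmp/hive_5555/out\nrows: 11\n' > "$tmp/wrong.q.out"

mask "$tmp/expected.q.out" > "$tmp/expected.masked"
mask "$tmp/actual.q.out"   > "$tmp/actual.masked"
mask "$tmp/wrong.q.out"    > "$tmp/wrong.masked"

# Plain diff, no -I: noise-only differences disappear after masking...
same_status=0
diff "$tmp/expected.masked" "$tmp/actual.masked" > /dev/null || same_status=$?

# ...while a genuine difference is still reported.
wrong_status=0
diff "$tmp/expected.masked" "$tmp/wrong.masked" > /dev/null || wrong_status=$?

# The masked file keeps the rest of the line, so it stays readable.
masked_first_line=$(head -n 1 "$tmp/expected.masked")

echo "noise-only difference: $same_status"
echo "real difference:       $wrong_status"
echo "masked line:           $masked_first_line"
rm -rf "$tmp"
```

Because the masked files are what gets stored and compared, running `diff` by hand against a checked-in file would show only real differences.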


