beam-commits mailing list archives

From "ASF GitHub Bot (JIRA)" <>
Subject [jira] [Commented] (BEAM-1188) More Verifiers For Python E2E Tests
Date Tue, 10 Jan 2017 07:58:58 GMT


ASF GitHub Bot commented on BEAM-1188:

GitHub user markflyhigh opened a pull request:

    [BEAM-1188] Python File Verifer For E2E Tests

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:
     - [x] Make sure the PR title is formatted like:
       `[BEAM-<Jira issue #>] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
           Travis-CI on your fork and ensure the whole test matrix passes).
     - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
           number, if there is one.
     - [ ] If this contribution is large, please file an Apache
           [Individual Contributor License Agreement](
    Add FileChecksumVerifier to verify E2E test output file(s) locally or on GCS:
     - Refactor TestPipeline for clarity and let it look up an option value by name.
     - Create FileChecksumVerifier, which retries when I/O fails.
     - Add FileChecksumVerifier to the wordcount e2e test.
     - Create test_utils to hold utility methods for testing.
    Tested by running wordcount_it against the Dataflow service.
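
For readers of this archive, the checksum-with-retry idea described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library and local files, not the code from the pull request; the helper names `retry_on_io_error`, `compute_checksum`, and `verify_checksum` are hypothetical, and the choice of SHA-1 over sorted lines is an assumption made for the example.

```python
import glob
import hashlib
import time


def retry_on_io_error(func, attempts=3, delay_secs=0.1):
    """Call func(), retrying on IOError/OSError up to `attempts` times."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except (IOError, OSError):
            if attempt == attempts:
                raise  # Out of retries: surface the original error.
            time.sleep(delay_secs)


def compute_checksum(file_pattern):
    """Return a SHA-1 hex digest over the sorted lines of all matching files.

    Sorting makes the digest independent of shard order. (Hypothetical
    helper; the real verifier may compute its checksum differently.)
    """
    def read_lines():
        paths = glob.glob(file_pattern)
        if not paths:
            # Treat "no files yet" as a retryable I/O problem, since an
            # eventually consistent filesystem may list files late.
            raise IOError('No files match %s' % file_pattern)
        lines = []
        for path in paths:
            with open(path, 'rb') as f:
                lines.extend(f.read().splitlines())
        return lines

    lines = retry_on_io_error(read_lines)
    digest = hashlib.sha1()
    for line in sorted(lines):
        digest.update(line)
    return digest.hexdigest()


def verify_checksum(file_pattern, expected_checksum):
    """Return True if the files matching the pattern hash to the expected value."""
    return compute_checksum(file_pattern) == expected_checksum
```

A test would call something like `verify_checksum('/tmp/output/part-*', expected)` once the pipeline finishes. Reading GCS paths would need a GCS-aware file API in place of `glob` and `open`, which this sketch does not attempt.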

You can merge this pull request into a Git repository by running:

    $ git pull file-checksum-verifier

Alternatively you can review and apply these changes as the patch at:

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1756

commit 16109f1651dd9537464687129d8d0c0f3a6a4a91
Author: Mark Liu <>
Date:   2017-01-10T07:48:42Z

    [BEAM-1188] Python File Verifer For E2E Tests


> More Verifiers For Python E2E Tests
> -----------------------------------
>                 Key: BEAM-1188
>                 URL:
>             Project: Beam
>          Issue Type: Task
>          Components: sdk-py, testing
>            Reporter: Mark Liu
>            Assignee: Mark Liu
> Add more basic verifiers to the e2e tests to verify output data in different storage systems/filesystems:
> 1. File verifier: compute and verify the checksum of file(s) stored on a filesystem (GCS / local fs).
> 2. BigQuery verifier: query a BigQuery table and verify the response content.
> ...
> Also update TestOptions.on_success_matcher to accept a list of matchers instead of a single matcher.
> Note: Retry I/O operations to avoid test flakiness that may come from filesystem inconsistency. This problem has occurred in the Java integration tests.
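
One way the "list of matchers instead of a single matcher" change could look is sketched below. This is only an illustration under stated assumptions: matchers are modeled as plain callables that take a pipeline result and return a boolean, the result is a plain dict, and the helpers `as_matcher_list` and `run_matchers` are hypothetical names, not part of the Beam SDK.

```python
def as_matcher_list(on_success_matcher):
    """Normalize a single matcher, a list of matchers, or None to a list."""
    if on_success_matcher is None:
        return []
    if isinstance(on_success_matcher, (list, tuple)):
        return list(on_success_matcher)
    return [on_success_matcher]


def run_matchers(result, on_success_matcher):
    """Apply every matcher to the result and return the names of those that failed.

    Matchers are assumed to be callables returning True on success; this is
    a simplification made for the example.
    """
    failures = []
    for matcher in as_matcher_list(on_success_matcher):
        if not matcher(result):
            failures.append(getattr(matcher, '__name__', repr(matcher)))
    return failures


# Hypothetical matchers over a dict standing in for a pipeline result.
def state_is_done(result):
    return result['state'] == 'DONE'


def has_output_files(result):
    return result['files'] > 0
```

Accepting either form keeps existing callers that pass a single matcher working, while new tests can combine, for example, a job-state check with a file checksum check.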

This message was sent by Atlassian JIRA
