kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mayuresh Gharat" <gharatmayures...@gmail.com>
Subject Re: Review Request 27534: Patch for KAFKA-1746
Date Tue, 11 Nov 2014 03:17:47 GMT


> On Nov. 8, 2014, 3:01 a.m., Mayuresh Gharat wrote:
> > system_test/utils/kafka_system_test_utils.py, line 2403
> > <https://reviews.apache.org/r/27534/diff/1/?file=747724#file747724line2403>
> >
> >     This causes many system tests (approx 98) to fail
> 
> Ewen Cheslack-Postava wrote:
>     Can you give list of a few that now fail for you? Depending on the tests, they *should*
start failing -- unless the failure is caused by something other than DumpLogSegments returning
a negative return code, the new failures indicate that the tests weren't properly checking
DumpLogSegments output. I originally noticed this because I saw DumpLogSegments not catching
an exception and exiting improperly, but the subsequent checks of stdout didn't properly check
for that (and the exit code is a *much* more reliable test). The system tests seem to never
check subprocess exit statuses, so there are likely many other similar bugs.
>     
>     If it's something else (e.g. some issue with the subprocess wait() call), I can try
to track it down if I know which test to look at.
> 
> Mayuresh Gharat wrote:
>     These are some of the test that are failing. I am pasting the console output for
them :
>     
>     _test_case_name  :  testcase_0008
>     _test_class_name  :  ReplicaBasicTest
>     arg : bounce_broker  :  false
>     arg : broker_type  :  leader
>     arg : message_producing_free_time_sec  :  15
>     arg : num_iteration  :  1
>     arg : num_messages_to_produce_per_producer_call  :  50
>     arg : num_partition  :  3
>     arg : replica_factor  :  3
>     arg : sleep_seconds_between_producer_calls  :  1
>     validation_status  : 
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r1.log
 :  800
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r2.log
 :  800
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r3.log
 :  800
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r1.log
 :  960
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r2.log
 :  960
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r3.log
 :  960
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r1.log
 :  1440
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r2.log
 :  1440
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r3.log
 :  1440
>          Unique messages from consumer on [test_1]  :  3200
>          Unique messages from producer on [test_1]  :  3200
>          Validate for data matched on topic [test_1]  :  PASSED
>          Validate for data matched on topic [test_1] across replicas  :  PASSED
>          Validate for merged log segment checksum in cluster [source]  :  PASSED
>          Validate index log in cluster [source]  :  FAILED
>     
>     ========================================================
>     
>     _test_case_name  :  testcase_0009
>     _test_class_name  :  ReplicaBasicTest
>     arg : bounce_broker  :  false
>     arg : broker_type  :  leader
>     arg : message_producing_free_time_sec  :  15
>     arg : num_iteration  :  1
>     arg : num_messages_to_produce_per_producer_call  :  50
>     arg : num_partition  :  3
>     arg : replica_factor  :  3
>     arg : sleep_seconds_between_producer_calls  :  1
>     validation_status  : 
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r1.log
 :  775
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r2.log
 :  775
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r3.log
 :  775
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r1.log
 :  930
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r2.log
 :  930
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r3.log
 :  930
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r1.log
 :  1395
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r2.log
 :  1395
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r3.log
 :  1395
>          Unique messages from consumer on [test_1]  :  3100
>          Unique messages from producer on [test_1]  :  3100
>          Validate for data matched on topic [test_1]  :  PASSED
>          Validate for data matched on topic [test_1] across replicas  :  PASSED
>          Validate for merged log segment checksum in cluster [source]  :  PASSED
>          Validate index log in cluster [source]  :  FAILED
>     
>     ========================================================
>     
>     _test_case_name  :  testcase_0010
>     _test_class_name  :  ReplicaBasicTest
>     arg : bounce_broker  :  false
>     arg : broker_type  :  leader
>     arg : message_producing_free_time_sec  :  15
>     arg : num_iteration  :  1
>     arg : num_messages_to_produce_per_producer_call  :  50
>     arg : num_partition  :  3
>     arg : replica_factor  :  3
>     arg : sleep_seconds_between_producer_calls  :  1
>     validation_status  : 
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r1.log
 :  800
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r2.log
 :  800
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r3.log
 :  800
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r1.log
 :  960
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r2.log
 :  960
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r3.log
 :  960
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r1.log
 :  1440
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r2.log
 :  1440
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r3.log
 :  1440
>          Unique messages from consumer on [test_1]  :  3200
>          Unique messages from producer on [test_1]  :  3200
>          Validate for data matched on topic [test_1]  :  PASSED
>          Validate for data matched on topic [test_1] across replicas  :  PASSED
>          Validate for merged log segment checksum in cluster [source]  :  PASSED
>          Validate index log in cluster [source]  :  FAILED
>     
>     ========================================================
>     
>     _test_case_name  :  testcase_0011
>     _test_class_name  :  ReplicaBasicTest
>     arg : auto_create_topic  :  true
>     arg : bounce_broker  :  false
>     arg : broker_type  :  leader
>     arg : message_producing_free_time_sec  :  15
>     arg : num_iteration  :  1
>     arg : num_messages_to_produce_per_producer_call  :  50
>     arg : num_partition  :  3
>     arg : replica_factor  :  3
>     arg : sleep_seconds_between_producer_calls  :  1
>     validation_status  : 
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r1.log
 :  775
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r2.log
 :  775
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r3.log
 :  775
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r1.log
 :  930
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r2.log
 :  930
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r3.log
 :  930
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r1.log
 :  1395
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r2.log
 :  1395
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r3.log
 :  1395
>          Unique messages from consumer on [test_1]  :  3100
>          Unique messages from producer on [test_1]  :  3100
>          Validate for data matched on topic [test_1]  :  PASSED
>          Validate for data matched on topic [test_1] across replicas  :  PASSED
>          Validate for merged log segment checksum in cluster [source]  :  PASSED
>          Validate index log in cluster [source]  :  FAILED
>     
>     ========================================================
>     
>     _test_case_name  :  testcase_0021
>     _test_class_name  :  ReplicaBasicTest
>     arg : bounce_broker  :  false
>     arg : broker_type  :  leader
>     arg : message_producing_free_time_sec  :  15
>     arg : num_iteration  :  1
>     arg : num_partition  :  3
>     arg : replica_factor  :  3
>     arg : sleep_seconds_between_producer_calls  :  1
>     validation_status  : 
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r1.log
 :  675
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r2.log
 :  675
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-0_r3.log
 :  675
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r1.log
 :  810
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r2.log
 :  810
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-1_r3.log
 :  810
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r1.log
 :  1215
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r2.log
 :  1215
>          No. of messages from consumer on [test_1] at simple_consumer_test_1-2_r3.log
 :  1215
>          No. of messages from consumer on [test_2] at simple_consumer_test_2-0_r1.log
 :  675
>          No. of messages from consumer on [test_2] at simple_consumer_test_2-0_r2.log
 :  675
>          No. of messages from consumer on [test_2] at simple_consumer_test_2-0_r3.log
 :  675
>          No. of messages from consumer on [test_2] at simple_consumer_test_2-1_r1.log
 :  810
>          No. of messages from consumer on [test_2] at simple_consumer_test_2-1_r2.log
 :  810
>          No. of messages from consumer on [test_2] at simple_consumer_test_2-1_r3.log
 :  810
>          No. of messages from consumer on [test_2] at simple_consumer_test_2-2_r1.log
 :  1215
>          No. of messages from consumer on [test_2] at simple_consumer_test_2-2_r2.log
 :  1215
>          No. of messages from consumer on [test_2] at simple_consumer_test_2-2_r3.log
 :  1215
>          Unique messages from consumer on [test_1]  :  2700
>          Unique messages from consumer on [test_2]  :  2700
>          Unique messages from producer on [test_1]  :  2700
>          Unique messages from producer on [test_2]  :  2700
>          Validate for data matched on topic [test_1]  :  PASSED
>          Validate for data matched on topic [test_1] across replicas  :  PASSED
>          Validate for data matched on topic [test_2]  :  PASSED
>          Validate for data matched on topic [test_2] across replicas  :  PASSED
>          Validate for merged log segment checksum in cluster [source]  :  PASSED
>          Validate index log in cluster [source]  :  FAILED
> 
> Ewen Cheslack-Postava wrote:
>     This looks like it's just a byproduct of this patch correcting the handling of errors
when running DumpLogSegments. You can see from this debug log that DumpLogSegments is throwing
an exception because it's being run on the remote host but uses a path for a log file that's
been synced locally (the /Users/ewencp path is on my Mac, but you can see it's SSHing' to
a Vagrant VM to run the commnand):
>     
>     2014-11-10 12:09:57,665 - DEBUG - executing command [ssh vagrant@worker1 -o 'HostName
127.0.0.1' -o 'Port 2222' -o 'UserKnownHostsFile /dev/null' -o 'StrictHostKeyChecking no'
-o 'PasswordAuthentication no' -o 'IdentityFile /Users/ewencp/.vagrant.d/insecure_private_key'
-o 'IdentitiesOnly yes' -o 'LogLevel FATAL'  '/opt/kafka/bin/kafka-run-class.sh kafka.tools.DumpLogSegments
 --file /Users/ewencp/kafka.git/system_test/replication_testsuite/testcase_0008/logs/broker-3/kafka_server_3_logs/test_1-2/00000000000000001294.index
--verify-index-only 2>&1'] (system_test_utils)
>     2014-11-10 12:09:58,673 - DEBUG - Dumping /Users/ewencp/kafka.git/system_test/replication_testsuite/testcase_0008/logs/broker-3/kafka_server_3_logs/test_1-2/00000000000000001294.index
(kafka_system_test_utils)
>     2014-11-10 12:09:58,673 - DEBUG - Exception in thread "main" java.io.FileNotFoundException:
/Users/ewencp/kafka.git/system_test/replication_testsuite/testcase_0008/logs/broker-3/kafka_server_3_logs/test_1-2/00000000000000001294.log
(No such file or directory) (kafka_system_test_utils)
>     2014-11-10 12:09:58,673 - DEBUG - 	at java.io.FileInputStream.open(Native Method)
(kafka_system_test_utils)
>     2014-11-10 12:09:58,673 - DEBUG - 	at java.io.FileInputStream.<init>(FileInputStream.java:146)
(kafka_system_test_utils)
>     2014-11-10 12:09:58,673 - DEBUG - 	at kafka.utils.Utils$.openChannel(Utils.scala:162)
(kafka_system_test_utils)
>     2014-11-10 12:09:58,673 - DEBUG - 	at kafka.log.FileMessageSet.<init>(FileMessageSet.scala:74)
(kafka_system_test_utils)
>     2014-11-10 12:09:58,673 - DEBUG - 	at kafka.tools.DumpLogSegments$.kafka$tools$DumpLogSegments$$dumpIndex(DumpLogSegments.scala:108)
(kafka_system_test_utils)
>     2014-11-10 12:09:58,673 - DEBUG - 	at kafka.tools.DumpLogSegments$$anonfun$main$1.apply(DumpLogSegments.scala:80)
(kafka_system_test_utils)
>     2014-11-10 12:09:58,674 - DEBUG - 	at kafka.tools.DumpLogSegments$$anonfun$main$1.apply(DumpLogSegments.scala:73)
(kafka_system_test_utils)
>     2014-11-10 12:09:58,674 - DEBUG - 	at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
(kafka_system_test_utils)
>     2014-11-10 12:09:58,674 - DEBUG - 	at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:105)
(kafka_system_test_utils)
>     2014-11-10 12:09:58,674 - DEBUG - 	at kafka.tools.DumpLogSegments$.main(DumpLogSegments.scala:73)
(kafka_system_test_utils)
>     2014-11-10 12:09:58,674 - DEBUG - 	at kafka.tools.DumpLogSegments.main(DumpLogSegments.scala)
(kafka_system_test_utils)
>     
>     I've opened a separate issue KAFKA-1763 for the fix and I'll submit a patch soon.

Cool. Thanks.


- Mayuresh


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27534/#review60460
-----------------------------------------------------------


On Nov. 3, 2014, 7:46 p.m., Ewen Cheslack-Postava wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27534/
> -----------------------------------------------------------
> 
> (Updated Nov. 3, 2014, 7:46 p.m.)
> 
> 
> Review request for kafka.
> 
> 
> Bugs: KAFKA-1746
>     https://issues.apache.org/jira/browse/KAFKA-1746
> 
> 
> Repository: kafka
> 
> 
> Description
> -------
> 
> KAFKA-1746 Make system tests return a useful exit code.
> 
> 
> KAFKA-1746 Check the exit code when running DumpLogSegments to verify data.
> 
> 
> Diffs
> -----
> 
>   system_test/mirror_maker_testsuite/mirror_maker_test.py c0117c64cbb7687ca8fbcec6b5c188eb880300ef

>   system_test/offset_management_testsuite/offset_management_test.py 12b5cd25140e1eb407dd57eef63d9783257688b2

>   system_test/replication_testsuite/replica_basic_test.py 660006cc253bbae3e7cd9f02601f1c1937dd1714

>   system_test/system_test_runner.py ee7aa252333553e8eb0bc046edf968ec99dddb70 
>   system_test/utils/kafka_system_test_utils.py 1093b660ebd0cb5ab6d3731d26f151e1bf717f8a

> 
> Diff: https://reviews.apache.org/r/27534/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Ewen Cheslack-Postava
> 
>


Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message