impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Skye Wanderman-Milne (Code Review)" <>
Subject [Impala-CR](cdh5-trunk) IMPALA-3441: check for malformed Avro data
Date Thu, 02 Jun 2016 23:18:57 GMT
Skye Wanderman-Milne has posted comments on this change.

Change subject: IMPALA-3441: check for malformed Avro data

Patch Set 5:


I used perf top with the --perf_map option and ReadZLong() is where most of the time is spent:

  18.47%  impalad [.] _ZN6impala13ReadWriteUtil9ReadZLongEPPhPlS3_
   9.11%  impalad [.] _ZN6snappy13RawUncompressEPNS_6SourceEPc

Unfortunately I need to add even more error checking to this function (IMPALA-3659), so maybe
we should get this in and work on ReadZLong() perf in that patch.
File be/src/exec/

Line 176:     ReportInvalidValue(TErrorCode::AVRO_INVALID_LENGTH, len);

Line 180:     ReportCorruptData(TErrorCode::AVRO_TRUNCATED_BLOCK);
File be/src/exec/

Line 190:   // TODO: we don't handle invalid values (e.g. overflow)
> There's an ASSERT_DEBUG_DEATH macro that does some magic to assert that it 
Dan pointed out that we need to fix this one way or another or we still may crash on corrupt
files, so I'm gonna skip this for now since I need to fix it ASAP anyway. I'll add a test
case for it when I fix it.
File testdata/workloads/functional-query/queries/DataErrorsTest/avro-errors.test:

Line 7
> Are you intending to fix this in this patch?
Oops, forgot to address this. I'm having a lot of trouble matching these error messages with
the current test verifier logic, so I commented them out for now and meant to file a JIRA
to fix this.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I801a11c496a128e02c564c2a9c44baa5a97be132
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Skye Wanderman-Milne <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message