impala-reviews mailing list archives

From "Bharath Vissapragada (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-5325: Do not update totalHdfsBytes /numHdfsFiles on Catalogd
Date Thu, 25 May 2017 18:09:23 GMT
Bharath Vissapragada has posted comments on this change.

Change subject: IMPALA-5325: Do not update totalHdfsBytes_/numHdfsFiles_ on Catalogd

Patch Set 5: Code-Review-1

I see how this can break the table sample patch tests. We do the following in HdfsTable#getFilesSample():

public Map<Long, List<FileDescriptor>> getFilesSample(List<HdfsPartition> inputParts,
      long percentBytes, long randomSeed) {
    Preconditions.checkState(percentBytes >= 0 && percentBytes <= 100);

    // Conservative max size for Java arrays. The actual maximum varies
    // across JVM versions and sometimes between configurations.
    final long JVM_MAX_ARRAY_SIZE = Integer.MAX_VALUE - 10;
    if (numHdfsFiles_ > JVM_MAX_ARRAY_SIZE) {
      throw new IllegalStateException(String.format(
          "Too many files to generate a table sample. " +
          "Table '%s' has %s files, but a maximum of %s files are supported.",
          getTableName().toString(), numHdfsFiles_, JVM_MAX_ARRAY_SIZE));
    }
    int totalNumFiles = (int) numHdfsFiles_;  // <-----

The marked line breaks only in tests, because ImpaladTestCatalog#getTable() directly returns
the Catalog's version of HdfsTable during tests and doesn't rebuild it from Thrift. So with
this patch the stale numHdfsFiles_ value makes the check above throw an exception.
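To illustrate the pattern the quoted code relies on, here is a minimal standalone sketch (class and method names are hypothetical, not from the Impala source): a long file count must be range-checked before it is narrowed to an int array length, because a raw cast silently truncates.

```java
public class ArraySizeGuard {
    // Conservative cap below Integer.MAX_VALUE, mirroring the
    // JVM_MAX_ARRAY_SIZE constant in the snippet above.
    static final long MAX_ARRAY_SIZE = Integer.MAX_VALUE - 10;

    // Narrow a long count to an int array size, failing loudly on overflow
    // instead of silently truncating via (int) count.
    static int checkedIntCast(long count) {
        if (count < 0 || count > MAX_ARRAY_SIZE) {
            throw new IllegalStateException(
                "Count " + count + " does not fit in a Java array");
        }
        return (int) count;
    }

    public static void main(String[] args) {
        System.out.println(checkedIntCast(42));  // prints 42
        // checkedIntCast(1L << 33) would throw; note that the raw cast
        // (int) (1L << 33) would silently yield 0.
    }
}
```

This is why an accounting bug in numHdfsFiles_ surfaces as an exception here rather than as a wrong sample size: the guard rejects any value outside the valid array range.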

Alex, should we change the approach and fix the accounting instead?


Gerrit-MessageType: comment
Gerrit-Change-Id: I03cc6f9e9b2c03cafb87029ea0802dfdb2745be1
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Bharath Vissapragada <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Bharath Vissapragada <>
Gerrit-Reviewer: Dimitris Tsirogiannis <>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-HasComments: No
