impala-dev mailing list archives

From "Valencia Serrao" <vser...@us.ibm.com>
Subject Re: Fw: Issues with generating testdata for Impala
Date Wed, 08 Jun 2016 12:30:15 GMT
Hi Casey,

Data loading issues on ppc are resolved. I have been able to successfully
complete the data loading on ppc for Impala. The FE tests also ran
successfully, with 545 tests passing and 36 tests skipped.

I also ran the custom cluster tests (tests=41, failures=5, errors=0,
skipped=0). Please find the log attached. (See attached file:
8June_cc_tests.txt)

It would be great if you could share any pointers on these issues.

Regards,
Valencia





From:	Casey Ching <casey@cloudera.com>
To:	Alex Behm <alex.behm@cloudera.com>, Valencia
            Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc:	Valencia Serrao/Austin/Contr/IBM@IBMUS,
            dev@impala.incubator.apache.org, David
            Clissold/Austin/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS
Date:	05/09/2016 10:45 PM
Subject:	Re: Fw: Issues with generating testdata for Impala



Hi Valencia,

Have you tried setting up an x86 environment? That could be useful for
comparing to the ppc environment to see what is/isn’t working and being
able to see what the logs should look like.

If the tpch database isn’t there, that should mean data loading failed and
there should have been an error that caused the data loading to exit early
along with an error message in the logs. Did you see anything like that?
You might want to try only running the data loading step, then verifying
that the tpch database exists afterwards.
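
A minimal sketch of that check, assuming a POSIX-style shell. The helper names `first_error_line` and `has_database` are hypothetical (they are not part of the Impala tree), the error pattern is a crude filter that will also match benign lines, and which log files to scan is left to the caller:

```shell
# Hypothetical helpers for checking a data-loading run; not part of Impala.

# Print the first line that mentions an error or exception in the given log
# files, prefixed with file name and line number. Returns non-zero when no
# such line is found.
first_error_line() {
  grep -H -n -i -E 'error|exception' "$@" | head -n 1 | grep .
}

# Read database names on stdin, one per line, optionally followed by a
# tab-separated comment. Succeeds only if the named database is listed.
has_database() {
  local name="$1"
  awk -F'\t' -v db="$name" '$1 == db { found = 1 } END { exit !found }'
}
```

One way to use it, assuming impala-shell is on the PATH and prints one database per line in its delimited mode: `impala-shell -B -q 'SHOW DATABASES' | has_database tpch && echo "tpch exists"`. If the database is missing, `first_error_line` over the data loading logs shows where to start reading.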

Casey


On May 9, 2016 at 5:27:49 AM, Valencia Serrao (vserrao@us.ibm.com) wrote:


      Hi Alex/Casey,

      I re-ran the fe tests with the testdata you provided, but the result
      is the same as that reported in the earlier mail, with most of the
      failures occurring because the tpch database does not exist.

      Steps followed to test are as follows:
      1. copy the testdata to IMPALA_HOME/testdata/impala-data.
      2. ./buildall.sh -notests -noclean -format -testdata
      3. ./bin/run_all_tests.sh

      We also tried the testdata generation on an Ubuntu x86 ppc machine;
      however, it stops at the same "Invalidate Metadata" step with the
      exception.

      Any pointers on these issues will be helpful.

      Regards,
      Valencia

      Valencia Serrao---05/05/2016 06:47:59 PM---Hi Alex/Casey, I tried to
      run the frontend tests with the data provided. Following is the
      result:

      From: Valencia Serrao/Austin/Contr/IBM
      To: Casey Ching <casey@cloudera.com>
      Cc: Alex Behm <alex.behm@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
      Valencia Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/05/2016 06:47 PM
      Subject: Re: Fw: Issues with generating testdata for Impala




      Hi Alex/Casey,

      I tried to run the frontend tests with the data provided. Following
      is the result:
      Tests run: 545, Failures: 226, Errors: 77, Skipped: 36 [attachment
      "data-load-functional-exhaustive.zip" deleted by Valencia
      Serrao/Austin/Contr/IBM]


      Earlier, the number of "Errors" was 87, so it has now dropped by
      10. However, the "Failures" count is still the same. Most of the
      failures in PlannerTest and AuthorizationTest are related to tpch
      (e.g. Database doesn't exist: tpch).

      With regard to the directory "impala_data", I've observed that it is
      not being accessed/used by any script. Are we missing any
      configuration?

      Kindly guide me on this.

      Regards,
      Valencia



      Valencia Serrao---05/05/2016 02:21:56 PM---Thanks, Casey! I will let
      you know the test status.

      From: Valencia Serrao/Austin/Contr/IBM
      To: Casey Ching <casey@cloudera.com>
      Cc: Alex Behm <alex.behm@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS,
      Valencia Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/05/2016 02:21 PM
      Subject: Re: Fw: Issues with generating testdata for Impala


      Thanks, Casey!

      I will let you know the test status.


      Casey Ching ---05/05/2016 01:09:11 PM---On May 4, 2016 at 11:08:07
      PM, Valencia Serrao (vserrao@us.ibm.com) wrote: Hi Alex,

      From: Casey Ching <casey@cloudera.com>
      To: Alex Behm <alex.behm@cloudera.com>, Valencia
      Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
      Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
      Date: 05/05/2016 01:09 PM
      Subject: Re: Fw: Issues with generating testdata for Impala







      On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com)
      wrote:


              Hi Alex,

              I've placed the individual testdata tars at
              IMPALA_HOME/testdata/impala-data. I've already executed
              steps 1 to 10. I have some questions about steps 11 and 12
              that I want to clarify:

              1) . bin/impala-config.sh
              2) mkdir -p $IMPALA_HOME/testdata/impala-data
              3) pushd $IMPALA_HOME/testdata/impala-data
              4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
              5) tar -xzf tpch.tar.gz
              6) rm tpch.tar.gz
              7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
              8) tar -xzf tpcds.tar.gz
              9) rm tpcds.tar.gz
              10) popd

              11) ./buildall.sh -notests -noclean -format
              -----Here I've removed the -testdata option.
              The reason to do this is to clear the previously generated
              partial schemas.
      I think the -format option is supposed to clear out any old state.
      The -testdata flag is probably needed to generate and load the test
      data.


              12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is
              this step required? Why?
      That is only for docker. It helps to reduce the image size. You
      shouldn’t need to do that or any of the other rm commands.
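
A minimal sketch of the reassembly in steps 4 to 9 above, assuming a POSIX-style shell with `tar` available. The function name `reassemble_and_extract` and its argument order are hypothetical; only the file naming (`<name>0`, `<name>1`, ...) comes from the steps listed:

```shell
# Hypothetical helper, not part of the Impala tree.
# Joins the numbered pieces <src_dir>/<name>0 .. <src_dir>/<name>(count-1)
# into one archive, extracts it into dest_dir, and removes the joined archive.
reassemble_and_extract() {
  local src_dir="$1" name="$2" count="$3" dest_dir="$4"
  local i=0

  # Check that every piece is present before writing anything.
  while [ "$i" -lt "$count" ]; do
    if [ ! -f "$src_dir/$name$i" ]; then
      echo "missing piece: $src_dir/$name$i" >&2
      return 1
    fi
    i=$((i + 1))
  done

  mkdir -p "$dest_dir" || return 1
  : > "$dest_dir/$name" || return 1

  # Concatenate in numeric order; the order matters for a split gzip stream.
  i=0
  while [ "$i" -lt "$count" ]; do
    cat "$src_dir/$name$i" >> "$dest_dir/$name" || return 1
    i=$((i + 1))
  done

  tar -xzf "$dest_dir/$name" -C "$dest_dir" || return 1
  rm "$dest_dir/$name"
}
```

With the file names from the steps above, the calls would be `reassemble_and_extract /tmp tpch.tar.gz 7 "$IMPALA_HOME/testdata/impala-data"` (pieces 0 to 6) and `reassemble_and_extract /tmp tpcds.tar.gz 4 "$IMPALA_HOME/testdata/impala-data"` (pieces 0 to 3). Checking for missing pieces first avoids handing a truncated archive to tar.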


              Could you kindly confirm these steps? If there are any
              corrections, please let me know.

              Regards,
              Valencia



              Valencia Serrao---05/04/2016 04:18:24 PM---Hi Alex/Casey
              Thank you for responding and for sharing the testdata. I'm
              working on using the testda

              From: Valencia Serrao/Austin/Contr/IBM
              To: Alex Behm <alex.behm@cloudera.com>
              Cc: Casey Ching <casey@cloudera.com>,
              dev@impala.incubator.apache.org, Nishidha
              Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
              Jagadale/Austin/Contr/IBM@IBMUS, David
              Clissold/Austin/IBM@IBMUS
              Date: 05/04/2016 04:18 PM
              Subject: Re: Fw: Issues with generating testdata for Impala




              Hi Alex/Casey

              Thank you for responding and for sharing the testdata. I'm
              working on using the testdata to run the fe tests.

              Meanwhile, I've posted the logs onto "Impala Dev" google
              group. Here's the link:
              https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk


              Regards,
              Valencia


              Alex Behm ---05/04/2016 12:52:44 PM---Ahh, thanks Casey. Did
              not know about that. Valencia, Impala's data loading expects
              the files to be

              From: Alex Behm <alex.behm@cloudera.com>
              To: Casey Ching <casey@cloudera.com>
              Cc: dev@impala.incubator.apache.org, Sudarshan
              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
              Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
              Serrao/Austin/Contr/IBM@IBMUS
              Date: 05/04/2016 12:52 PM
              Subject: Re: Fw: Issues with generating testdata for Impala



              Ahh, thanks Casey. Did not know about that.

              Valencia, Impala's data loading expects the files to be
              placed in IMPALA_HOME/testdata/impala-data

              On Tue, May 3, 2016 at 11:21 PM, Casey Ching <
              casey@cloudera.com> wrote:


                  Comment inline below


                  On May 3, 2016 at 11:18:06 PM, Alex Behm (
                  alex.behm@cloudera.com) wrote:


                              Hi Valencia,

                              I'm sorry you are having so much trouble with
                              our setup. Let's see what we
                              can do.

                              There was an infra issue with receiving the
                              logs you sent me. The
                              email/attachment got rejected on our side.
                              Maybe you can upload the logs
                              somewhere so I can grab them?

                              See more responses inline below.

                              On Sat, Apr 30, 2016 at 5:01 AM, Valencia
                              Serrao <vserrao@us.ibm.com> wrote:

                              > Hi Alex,
                              >
                              > I went deeper through the logs and have some
                              > findings and queries:
                              >
                              > 1. At the "Invalidating Metadata" step (as
                              > mentioned in the mail below), I noticed that it
                              > is trying to use Kerberos. Perhaps this is
                              > preventing the testdata generation from
                              > proceeding, as we are not using Kerberos.
                              > How can this be done without involving Kerberos
                              > support?
                              >
                              Kerberos is certainly not needed to build and
                              run tests.

                              >
                              > 2. I executed the fe tests despite the incomplete
                              > testdata generation; the tests started and, as
                              > expected, failed. Many of these failures (null
                              > pointer exceptions in AuthorizationTest) have a
                              > common cause: "tpch database does not exist."
                              > e.g. as shown in
                              > .Impala/cluster_logs/query_tests/test-run-workload.log.

                              >
                              > Does the "tpch" database get created after the
                              > current blocker step "Invalidating Metadata"?
                              >

                              Yes, the TPCH database is created and loaded
                              as part of that first phase.
                              However, the data files are not yet publicly
                              accessible. Let me work on
                              that from my side, and get back to you soon.
                              One way or the other we'll be
                              able to provide you with the data.

                  The data is at
                  https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
                   . The files are split into 50 MB pieces for git. You can
                  put them back together as is done in
                  https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile


                              >
                              > 3. In the fe test console output log,
                              another error shown:
                              > ============================= test session starts ==============================
                              > platform linux2 -- Python 2.7.5 -- py-1.4.30 -- pytest-2.7.2
                              > rootdir: /work/, inifile:
                              > plugins: random, xdist
                              > ERROR: file not found:/work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/

                              >
                              > These are not present/created on my VM. May I
                              > know when these get created?
                              >
                              > 4. Could you also share the total number of fe
                              > tests?
                              >

                              I'll privately send you the console output
                              from a successful FE run.
                              Hopefully that can help.

                              Cheers,

                              Alex

                              >
                              >
                              > Looking forward to your reply.
                              >
                              > Regards,
                              > Valencia
                              >
                              >
                              > Valencia Serrao---04/30/2016 09:05:54 AM---Hi Alex,
                              > I've been able to make some progress on testdata
                              > generation, however, i still face the foll
                              >
                              > From: Valencia Serrao/Austin/Contr/IBM
                              > To: dev@impala.incubator.apache.org, Alex
                              Behm <alex.behm@cloudera.com>
                              > Cc: Sudarshan
                              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                              > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
                              Serrao/Austin/Contr/IBM@IBMUS
                              > Date: 04/30/2016 09:05 AM
                              > Subject: Fw: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              >
                              > Hi Alex,
                              >
                              > I've been able to make some progress on
                              testdata generation, however, i
                              > still face the following issues:
                              >
                              >
                              >
                              *******************************************************************************************************************************************************************

                              > Invalidating Metadata
                              >
                              > (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):
                              > INSERT OVERWRITE TABLE functional_parquet.alltypes partition (year, month)
                              > SELECT id, bool_col, tinyint_col, smallint_col, int_col, bigint_col,
                              > float_col, double_col, date_string_col, string_col, timestamp_col, year,
                              > month
                              > FROM functional.alltypes
                              >
                              > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                              > INNER EXCEPTION: <class 'socket.error'>
                              > MESSAGE: [Errno 104] Connection reset by peer
                              > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line 41: while [ -n "$*" ]
                              > Error in /root/nishidha/Impala/buildall.sh at line 368:
                              > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS} <<< Y
                              >
                              >
                              *************************************************************************************************************************************************************************

                              >
                              > I continued with the fe tests as is. Here is
                              > the complete output log.
                              > [attachment "fe_test_output.zip" deleted by
                              Valencia
                              > Serrao/Austin/Contr/IBM]
                              >
                              > Cluster logs: [attachment "cluster_logs.7z"
                              deleted by Valencia
                              > Serrao/Austin/Contr/IBM]
                              >
                              > Kindly guide me on the same.
                              >
                              > Regards,
                              > Valencia
                              > ----- Forwarded by Valencia
                              Serrao/Austin/Contr/IBM on 04/29/2016 10:57
                              AM
                              > -----
                              >
                              > From: Sudarshan Jagadale/Austin/Contr/IBM
                              > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
                              > Date: 04/29/2016 10:49 AM
                              > Subject: Fw: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              > FYI
                              > Thanks and Regards
                              > Sudarshan Jagadale
                              > Power Open Source Solutions
                              > ----- Forwarded by Sudarshan
                              Jagadale/Austin/Contr/IBM on 04/29/2016 10:48
                              > AM -----
                              >
                              > From: Alex Behm <alex.behm@cloudera.com>
                              > To: dev@impala.incubator.apache.org
                              > Cc: Sudarshan
                              Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                              > Panpaliya/Austin/Contr/IBM@IBMUS
                              > Date: 04/28/2016 09:34 PM
                              > Subject: Re: Issues with generating
                              testdata for Impala
                              > ------------------------------
                              >
                              >
                              >
                              > Hi Valencia,
                              >
                              > sorry I did not get the attachment. Would
                              you be able to tar.gz and attach
                              > the whole cluster_logs directory?
                              >
                              > Alex
                              >
                              > On Thu, Apr 28, 2016 at 6:23 AM, Valencia
                              > Serrao <vserrao@us.ibm.com> wrote:
                              >
                              > Hi Alex,
                              >
                              > I tried building impala again with the following:
                              > HDFS CDH 5.7.0
                              > (http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3)
                              > HBASE CDH 5.7.0 SNAPSHOT
                              > (http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz)
                              > - this required patching in a fix
                              > (https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch)
                              > HIVE CDH 5.8.0 SNAPSHOT
                              >
                              > With the above combination, I'm able to move past
                              > the exception and also have the RegionServer
                              > service up and running. However, it now gives the
                              > error below:
                              >
                              >
                              >
                              ********************************************************************************************************************

                              >
                              > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                              > CREATE EXTERNAL TABLE IF NOT EXISTS functional.decimal_tbl (
                              > d1 DECIMAL,
                              > d2 DECIMAL(10, 0),
                              > d3 DECIMAL(20, 10),
                              > d4 DECIMAL(38, 38),
                              > d5 DECIMAL(10, 5))
                              > PARTITIONED BY (d6 DECIMAL(9, 0))
                              > ROW FORMAT delimited fields terminated by ','
                              > STORED AS TEXTFILE
                              > LOCATION '/test-warehouse/decimal_tbl'
                              >
                              > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                              > USE functional
                              >
                              > (load-functional-query-exhaustive-impala-generated-text-none-none.sql):
                              > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION(d6=1)
                              >
                              > Data Loading from Impala failed with error: ImpalaBeeswaxException:
                              > INNER EXCEPTION: <class 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
                              > MESSAGE:
                              > Error: null
                              >
                              >
                              ******************************************************************************************************************

                              >
                              > Here is the complete log for the same.
                              > (See attached file: data-load-functional-exhaustive.log)
                              >
                              > It would be great if you could guide me on this
                              > issue, so I could proceed with the fe tests.
                              >
                              > I am still awaiting a link to the source code of
                              > HDFS CDH 5.8.0.
                              >
                              > Regards,
                              > Valencia
                              >
                              >
                              >
                              >



