impala-dev mailing list archives

From "Nishidha Panpaliya" <nishi...@us.ibm.com>
Subject Re: Fw: Issues with generating testdata for Impala
Date Fri, 06 May 2016 12:38:44 GMT
Hello,

Today, we tried building Impala on Ubuntu 15.04 x86_64 using Impala's
toolchain. Unfortunately, test data generation failed there as well.

I think we are missing some setup step, which would explain why we see the
issue on both platforms (x86 as well as ppc). It would be great if you could
provide us any document with build and test instructions, just to verify our
setup.

Anup will be sending the latest logs on x86.

Thanks,
Nishidha



From:	Valencia Serrao/Austin/Contr/IBM
To:	Casey Ching <casey@cloudera.com>
Cc:	Alex Behm <alex.behm@cloudera.com>,
            dev@impala.incubator.apache.org, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, David
            Clissold/Austin/IBM@IBMUS, Valencia
            Serrao/Austin/Contr/IBM@IBMUS
Date:	05/05/2016 06:47 PM
Subject:	Re: Fw: Issues with generating testdata for Impala


Hi Alex/Casey,

I tried to run the frontend tests with the data provided. Following is the
result:
	Tests run: 545, Failures: 226, Errors: 77, Skipped: 36    (See
attached file: data-load-functional-exhaustive.zip)


Earlier, the number of "Errors" was 87, so it has now dropped by 10.
However, the "Failures" count is still the same. Most of the Failures in
PlannerTest and AuthorizationTest are related to tpch (e.g. Database
doesn't exist: tpch).
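
To compare runs, the summary line above can be broken down with a small shell helper. This is only a sketch: summarize_tests is a hypothetical name, and it assumes the summary keeps the "Tests run: N, Failures: N, Errors: N, Skipped: N" layout shown above.

```shell
# Hypothetical helper (not part of Impala): derive the passed count from a
# summary line of the form
#   Tests run: N, Failures: N, Errors: N, Skipped: N
summarize_tests() {
  # Split on ':' and ','; the numbers land in fields 2, 4, 6 and 8.
  # Adding 0 forces numeric conversion and drops any trailing text.
  printf '%s\n' "$1" | awk -F'[:,]' '{
    run = $2 + 0; fail = $4 + 0; err = $6 + 0; skip = $8 + 0
    printf "passed=%d failed=%d errors=%d skipped=%d\n", run - fail - err - skip, fail, err, skip
  }'
}
```

For the run reported here, summarize_tests "Tests run: 545, Failures: 226, Errors: 77, Skipped: 36" prints passed=206 failed=226 errors=77 skipped=36.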

With regard to the directory "impala_data", I've observed that it is not
being accessed or used by any script. Are we missing any configuration?

Kindly guide me on this.

Regards,
Valencia





From:	Valencia Serrao/Austin/Contr/IBM
To:	Casey Ching <casey@cloudera.com>
Cc:	Alex Behm <alex.behm@cloudera.com>,
            dev@impala.incubator.apache.org, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
            Jagadale/Austin/Contr/IBM@IBMUS, David
            Clissold/Austin/IBM@IBMUS, Valencia
            Serrao/Austin/Contr/IBM@IBMUS
Date:	05/05/2016 02:21 PM
Subject:	Re: Fw: Issues with generating testdata for Impala


Thanks, Casey!

I will let you know the test status.




From:	Casey Ching <casey@cloudera.com>
To:	Alex Behm <alex.behm@cloudera.com>, Valencia
            Serrao/Austin/Contr/IBM@IBMUS, dev@impala.incubator.apache.org
Cc:	Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
            Panpaliya/Austin/Contr/IBM@IBMUS,
            dev@impala.incubator.apache.org
Date:	05/05/2016 01:09 PM
Subject:	Re: Fw: Issues with generating testdata for Impala








On May 4, 2016 at 11:08:07 PM, Valencia Serrao (vserrao@us.ibm.com) wrote:


      Hi Alex,

      I've placed the individual testdata tars at
      IMPALA_HOME/testdata/impala-data. I've already executed steps 1 to 10.
      I have some questions about steps 11 and 12 that I want to clarify:

      1) . bin/impala-config.sh
      2) mkdir -p $IMPALA_HOME/testdata/impala-data
      3) pushd $IMPALA_HOME/testdata/impala-data
      4) cat /tmp/tpch.tar.gz{0..6} > tpch.tar.gz
      5) tar -xzf tpch.tar.gz
      6) rm tpch.tar.gz
      7) cat /tmp/tpcds.tar.gz{0..3} > tpcds.tar.gz
      8) tar -xzf tpcds.tar.gz
      9) rm tpcds.tar.gz
      10) popd

      11) ./buildall.sh -notests -noclean -format
      -----Here I've removed the -testdata option.
      The reason to do this is to clear the previously generated partial
      schemas.


I think the -format option is supposed to clear out any old state. The
-testdata flag is probably needed to generate and load the test data.




      12) sudo rm -rf $IMPALA_HOME/testdata/impala-data ---- Is this step
      required? Why?


That is only for docker. It helps to reduce the image size. You shouldn’t
need to do that or any of the other rm commands.
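
For reference, steps 4 to 9 above can be sketched as one helper that skips the rm commands by streaming the pieces straight into tar. This is only a sketch under assumptions: reassemble_and_extract is a hypothetical name, not something in the Impala tree, and it assumes the pieces carry a numeric suffix starting at 0, as in the cat commands above.

```shell
# Hypothetical helper (not part of Impala): concatenate split tarball pieces
# named <prefix>0 .. <prefix><last> and unpack them into <dest>.
reassemble_and_extract() {
  prefix="$1"   # e.g. /tmp/tpch.tar.gz
  last="$2"     # index of the last piece, e.g. 6
  dest="$3"     # e.g. $IMPALA_HOME/testdata/impala-data

  # Refuse to start if any piece is missing or empty, so that a partial
  # download does not end in a confusing tar error.
  i=0
  while [ "$i" -le "$last" ]; do
    if [ ! -s "${prefix}${i}" ]; then
      echo "missing or empty piece: ${prefix}${i}" >&2
      return 1
    fi
    i=$((i + 1))
  done

  mkdir -p "$dest" || return 1

  # Emit the pieces in numeric order and let tar read the joined stream;
  # no intermediate archive is written, so there is nothing to rm afterwards.
  i=0
  while [ "$i" -le "$last" ]; do
    cat "${prefix}${i}"
    i=$((i + 1))
  done | tar -xzf - -C "$dest"
}
```

With the pieces from the quoted steps, the calls would look like reassemble_and_extract /tmp/tpch.tar.gz 6 "$IMPALA_HOME/testdata/impala-data" and reassemble_and_extract /tmp/tpcds.tar.gz 3 "$IMPALA_HOME/testdata/impala-data".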




      Could you kindly confirm these steps? If there are any corrections,
      please let me know.

      Regards,
      Valencia




      From: Valencia Serrao/Austin/Contr/IBM
      To: Alex Behm <alex.behm@cloudera.com>
      Cc: Casey Ching <casey@cloudera.com>,
      dev@impala.incubator.apache.org, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, David Clissold/Austin/IBM@IBMUS
      Date: 05/04/2016 04:18 PM
      Subject: Re: Fw: Issues with generating testdata for Impala




      Hi Alex/Casey

      Thank you for responding and for sharing the testdata. I'm working on
      using the testdata to run the fe tests.

      Meanwhile, I've posted the logs onto "Impala Dev" google group.
      Here's the link:
      https://groups.google.com/a/cloudera.org/forum/#!topic/impala-dev/zy05cHNrACk


      Regards,
      Valencia



      From: Alex Behm <alex.behm@cloudera.com>
      To: Casey Ching <casey@cloudera.com>
      Cc: dev@impala.incubator.apache.org, Sudarshan
      Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
      Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
      Serrao/Austin/Contr/IBM@IBMUS
      Date: 05/04/2016 12:52 PM
      Subject: Re: Fw: Issues with generating testdata for Impala



      Ahh, thanks Casey. Did not know about that.

      Valencia, Impala's data loading expects the files to be placed in
      IMPALA_HOME/testdata/impala-data

      On Tue, May 3, 2016 at 11:21 PM, Casey Ching <casey@cloudera.com>
      wrote:
          Comment inline below



          On May 3, 2016 at 11:18:06 PM, Alex Behm (alex.behm@cloudera.com)
          wrote:


                  Hi Valencia,

                  I'm sorry you are having so much trouble with our setup.
                  Let's see what we
                  can do.

                  There was an infra issue with receiving the logs you sent
                  me. The
                  email/attachment got rejected on our side. Maybe you can
                  upload the logs
                  somewhere so I can grab them?

                  See more responses inline below.

                  On Sat, Apr 30, 2016 at 5:01 AM, Valencia Serrao <
                  vserrao@us.ibm.com> wrote:

                  > Hi Alex,
                  >
                  > I was going deeper through the logs. I have some
                  > findings and queries:
                  >
                  > 1. At the "Invalidating Metadata" step (as mentioned in
                  below mail), i
                  > noticed that, it is trying to use kerberos. Perhaps,
                  this is preventing the
                  > testdata generation from proceeding, as we are not
                  using Kerberos.
                  > I need to know how this can be done without involving
                  Kerberos support ?
                  >
                  Kerberos is certainly not needed to build and run tests.

                  >
                  > 2. I had executed the fe tests despite the incomplete
                  > testdata generation; the tests started and, as expected,
                  > failed. Many of these (null pointer exceptions in
                  > AuthorizationTests) have a common cause: "tpch database
                  > does not exist."
                  > e.g. as shown
                  in .Impala/cluster_logs/query_tests/test-run-workload.log.

                  >
                  > Does the "tpch" database get created after the current
                  > blocker step "Invalidating Metadata"?
                  >

                  Yes, the TPCH database is created and loaded as part of
                  that first phase.
                  However, the data files are not yet publicly accessible.
                  Let me work on
                  that from my side, and get back to you soon. One way or
                  the other we'll be
                  able to provide you with the data.

          The data is at
          https://github.com/cloudera/Impala-docker-hub/tree/master/prereqs/container_root/tmp
           . The files are split into 50 MB pieces for git. You can put
          them back together as is done in
          https://github.com/cloudera/Impala-docker-hub/blob/master/complete/Dockerfile

                  >
                  > 3. In the fe test console output log, another error
                  shown:
                  > ============================= test session starts
                  > ==============================
                  > platform linux2 -- Python 2.7.5 -- py-1.4.30 --
                  pytest-2.7.2
                  > rootdir: /work/, inifile:
                  > plugins: random, xdist
                  > ERROR: file not found: /work/Impala/../Impala-auxiliary-tests/tests/aux_custom_cluster_tests/

                  >
                  > These are not present/created on my vm. May i know when
                  these get created ?
                  >
                  > 4. Could you also share the total number of fe tests ?
                  >

                  I'll privately send you the console output from a
                  successful FE run.
                  Hopefully that can help.

                  Cheers,

                  Alex

                  >
                  >
                  > Looking forward to your reply.
                  >
                  > Regards,
                  > Valencia
                  >
                  >
                  >
                  > From: Valencia Serrao/Austin/Contr/IBM
                  > To: dev@impala.incubator.apache.org, Alex Behm <
                  alex.behm@cloudera.com>
                  > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                  > Panpaliya/Austin/Contr/IBM@IBMUS, Valencia
                  Serrao/Austin/Contr/IBM@IBMUS
                  > Date: 04/30/2016 09:05 AM
                  > Subject: Fw: Issues with generating testdata for Impala
                  > ------------------------------
                  >
                  >
                  >
                  > Hi Alex,
                  >
                  > I've been able to make some progress on testdata
                  generation, however, i
                  > still face the following issues:
                  >
                  >
                  >
                  *******************************************************************************************************************************************************************

                  > Invalidating Metadata
                  >
                  >
                  (load-functional-query-exhaustive-impala-load-generated-parquet-none-none.sql):

                  > INSERT OVERWRITE TABLE functional_parquet.alltypes
                  partition (year, month)
                  > SELECT id, bool_col, tinyint_col, smallint_col,
                  int_col, bigint_col,
                  > float_col, double_col, date_string_col, string_col,
                  timestamp_col, year,
                  > month
                  > FROM functional.alltypes
                  >
                  > Data Loading from Impala failed with error:
                  ImpalaBeeswaxException:
                  > INNER EXCEPTION: <class 'socket.error'>
                  > MESSAGE: [Errno 104] Connection reset by peer
                  > Error in /root/nishidha/Impala/testdata/bin/create-load-data.sh at line 41: while [ -n "$*" ]
                  > Error in /root/nishidha/Impala/buildall.sh at line 368:
                  > ${IMPALA_HOME}/testdata/bin/create-load-data.sh ${CREATE_LOAD_DATA_ARGS} <<< Y
                  >
                  >
                  *************************************************************************************************************************************************************************

                  >
                  > i continued with fe tests as is. Here is the complete
                  output log.
                  > [attachment "fe_test_output.zip" deleted by Valencia
                  > Serrao/Austin/Contr/IBM]
                  >
                  > Cluster logs: [attachment "cluster_logs.7z" deleted by
                  Valencia
                  > Serrao/Austin/Contr/IBM]
                  >
                  > Kindly guide me on the same.
                  >
                  > Regards,
                  > Valencia
                  > ----- Forwarded by Valencia Serrao/Austin/Contr/IBM on
                  04/29/2016 10:57 AM
                  > -----
                  >
                  > From: Sudarshan Jagadale/Austin/Contr/IBM
                  > To: Valencia Serrao/Austin/Contr/IBM@IBMUS
                  > Date: 04/29/2016 10:49 AM
                  > Subject: Fw: Issues with generating testdata for Impala
                  > ------------------------------
                  >
                  >
                  > FYI
                  > Thanks and Regards
                  > Sudarshan Jagadale
                  > Power Open Source Solutions
                  > ----- Forwarded by Sudarshan Jagadale/Austin/Contr/IBM
                  on 04/29/2016 10:48
                  > AM -----
                  >
                  > From: Alex Behm <alex.behm@cloudera.com>
                  > To: dev@impala.incubator.apache.org
                  > Cc: Sudarshan Jagadale/Austin/Contr/IBM@IBMUS, Nishidha
                  > Panpaliya/Austin/Contr/IBM@IBMUS
                  > Date: 04/28/2016 09:34 PM
                  > Subject: Re: Issues with generating testdata for Impala
                  > ------------------------------
                  >
                  >
                  >
                  > Hi Valencia,
                  >
                  > sorry I did not get the attachment. Would you be able
                  to tar.gz and attach
                  > the whole cluster_logs directory?
                  >
                  > Alex
                  >
                  > On Thu, Apr 28, 2016 at 6:23 AM, Valencia Serrao
                  > <vserrao@us.ibm.com> wrote:
                  >
                  > Hi Alex,
                  >
                  > I tried building impala again with the following:
                  > HDFS CDH 5.7.0 (
                  > http://www.cloudera.com/documentation/enterprise/release-notes/topics/cdh_vd_cdh_package_tarball_57.html#topic_3
                  > )
                  > HBASE CDH 5.7.0 SNAPSHOT (
                  > http://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.7.0.tar.gz
                  > )
                  > - this required to patch in a fix (
                  > https://issues.apache.org/jira/secure/attachment/12792536/HBASE-15322-branch-1.2.patch
                  > )
                  > HIVE CDH 5.8.0 SNAPSHOT
                  >
                  > With the above combination, i'm able to move past the
                  exception and
                  > also have the RegionServer service up and running.
                  However, it now gives
                  > error as below:
                  >
                  >
                  >
                  ********************************************************************************************************************

                  >
                  (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

                  > CREATE EXTERNAL TABLE IF NOT EXISTS
                  functional.decimal_tbl (
                  > d1 DECIMAL,
                  > d2 DECIMAL(10, 0),
                  > d3 DECIMAL(20, 10),
                  > d4 DECIMAL(38, 38),
                  > d5 DECIMAL(10, 5))
                  > PARTITIONED BY (d6 DECIMAL(9, 0))
                  > ROW FORMAT delimited fields terminated by ','
                  > STORED AS TEXTFILE
                  > LOCATION '/test-warehouse/decimal_tbl'
                  >
                  >
                  (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

                  > USE functional
                  >
                  >
                  (load-functional-query-exhaustive-impala-generated-text-none-none.sql):

                  > ALTER TABLE decimal_tbl ADD IF NOT EXISTS PARTITION
                  (d6=1)
                  >
                  > Data Loading from Impala failed with error:
                  ImpalaBeeswaxException:
                  > INNER EXCEPTION: <class
                  > 'impala._thrift_gen.beeswax.ttypes.BeeswaxException'>
                  > MESSAGE:
                  > Error: null
                  >
                  >
                  ******************************************************************************************************************

                  >
                  > Here is the complete log for the same. (See attached
                  > file: data-load-functional-exhaustive.log)
                  >
                  > It would be great if you could guide me on this issue,
                  > so I could proceed with the fe tests.
                  >
                  > Still awaiting a link to the source code of HDFS CDH
                  > 5.8.0
                  >
                  > Regards,
                  > Valencia
                  >
                  >
                  >
                  >

