spark-issues mailing list archives

From "Jen-Ming Chung (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SPARK-22019) JavaBean int type property
Date Fri, 15 Sep 2017 11:30:00 GMT

    [ https://issues.apache.org/jira/browse/SPARK-22019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16167734#comment-16167734 ]

Jen-Ming Chung edited comment on SPARK-22019 at 9/15/17 11:29 AM:
------------------------------------------------------------------

The alternative is to give an explicit schema instead of inferring it, so that you don't need to
change your POJO class in the above test case.

{code}
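// Needs org.apache.spark.sql.types.DataTypes and org.apache.spark.sql.types.StructType
StructType schema = new StructType()
    .add("id", DataTypes.IntegerType)   // read id as int to match setId(int)
    .add("str", DataTypes.StringType);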
Dataset<SampleData> df = spark.read().schema(schema).json(stringdataset).as(
    org.apache.spark.sql.Encoders.bean(SampleData.class));
{code}
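
As background on why the explicit schema helps, here is a minimal sketch in plain Java (no Spark involved). It assumes what the generated code in the report shows: the inferred column is read with getLong and then passed to setId(int). The class name SetterNarrowingDemo and the helper trySetId are hypothetical and only serve the illustration; the nested SampleData is a cut-down stand-in for the bean in the report.

```java
import java.lang.reflect.Method;

public class SetterNarrowingDemo {

    // Cut-down stand-in for the SampleData bean from the report.
    public static class SampleData {
        private int id;
        public int getId() { return id; }
        public void setId(int id) { this.id = id; }
    }

    // Invokes setId(int) reflectively and reports whether the argument was accepted.
    public static boolean trySetId(SampleData bean, Object value) throws Exception {
        Method setter = SampleData.class.getMethod("setId", int.class);
        try {
            setter.invoke(bean, value);
            return true;
        } catch (IllegalArgumentException e) {
            // Thrown when the value cannot be converted to int without narrowing.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        SampleData bean = new SampleData();
        long inferred = 1L; // the type the inferred schema hands to the setter

        // bean.setId(inferred); // would not compile: long is not implicitly narrowed to int

        System.out.println(trySetId(bean, inferred));       // boxed as Long, prints false
        System.out.println(trySetId(bean, (int) inferred)); // boxed as Integer, prints true
    }
}
```

With the explicit schema the id column is already an int, so no narrowing is needed when the setter is called.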



was (Author: jmchung):
The alternative is giving the explicit schema instead inferring, means you don't need to change
your pojo class.

{code}
StructType schema = new StructType()
    .add("id", IntegerType)
    .add("str", StringType);
Dataset<SampleData> df = spark.read().schema(schema).json(stringdataset).as(
    org.apache.spark.sql.Encoders.bean(SampleData.class));
{code}


> JavaBean int type property 
> ---------------------------
>
>                 Key: SPARK-22019
>                 URL: https://issues.apache.org/jira/browse/SPARK-22019
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: taiho choi
>
> When the type of SampleData's id is int, the following code generates errors.
> When it is long, it works.
>  
> {code:java}
>     @Test
>     public void testDataSet2() {
>         ArrayList<String> arr= new ArrayList();
>         arr.add("{\"str\": \"everyone\", \"id\": 1}");
>         arr.add("{\"str\": \"Hello\", \"id\": 1}");
>         //1.read array and change to string dataset.
>         JavaRDD<String> data = sc.parallelize(arr);
>         Dataset<String> stringdataset = sqc.createDataset(data.rdd(), Encoders.STRING());
>         stringdataset.show(); //PASS
>         //2. convert string dataset to sampledata dataset
>         Dataset<SampleData> df = sqc.read().json(stringdataset).as(Encoders.bean(SampleData.class));
>         df.show();//PASS
>         df.printSchema();//PASS
>         Dataset<SampleDataFlat> fad = df.flatMap(SampleDataFlat::flatMap, Encoders.bean(SampleDataFlat.class));
>         fad.show(); //ERROR
>         fad.printSchema();
>     }
>     public static class SampleData implements Serializable {
>         public String getStr() {
>             return str;
>         }
>         public void setStr(String str) {
>             this.str = str;
>         }
>         public int getId() {
>             return id;
>         }
>         public void setId(int id) {
>             this.id = id;
>         }
>         String str;
>         int id;
>     }
>     public static class SampleDataFlat {
>         String str;
>         public String getStr() {
>             return str;
>         }
>         public void setStr(String str) {
>             this.str = str;
>         }
>         public SampleDataFlat(String str, long id) {
>             this.str = str;
>         }
>         public static Iterator<SampleDataFlat> flatMap(SampleData data) {
>             ArrayList<SampleDataFlat> arr = new ArrayList<>();
>             arr.add(new SampleDataFlat(data.getStr(), data.getId()));
>             arr.add(new SampleDataFlat(data.getStr(), data.getId()+1));
>             arr.add(new SampleDataFlat(data.getStr(), data.getId()+2));
>             return arr.iterator();
>         }
>     }
> {code}
> ==Error message==
> Caused by: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 38, Column 16:
> failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 38, Column 16:
> No applicable constructor/method found for actual parameters "long";
> candidates are: "public void SparkUnitTest$SampleData.setId(int)"
> /* 024 */   public java.lang.Object apply(java.lang.Object _i) {
> /* 025 */     InternalRow i = (InternalRow) _i;
> /* 026 */
> /* 027 */     final SparkUnitTest$SampleData value1 = false ? null : new SparkUnitTest$SampleData();
> /* 028 */     this.javaBean = value1;
> /* 029 */     if (!false) {
> /* 030 */
> /* 031 */
> /* 032 */       boolean isNull3 = i.isNullAt(0);
> /* 033 */       long value3 = isNull3 ? -1L : (i.getLong(0));
> /* 034 */
> /* 035 */       if (isNull3) {
> /* 036 */         throw new NullPointerException(((java.lang.String) references[0]));
> /* 037 */       }
> /* 038 */       javaBean.setId(value3);



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


