sqoop-dev mailing list archives

From "Qian Xu (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (SQOOP-1395) Use random generated class name for SqoopRecord
Date Mon, 01 Sep 2014 09:06:20 GMT

    [ https://issues.apache.org/jira/browse/SQOOP-1395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14117221#comment-14117221 ]

Qian Xu edited comment on SQOOP-1395 at 9/1/14 9:06 AM:
--------------------------------------------------------

[~jarcec] There are actually two places that use reflection to look up a class.

When a Kite Dataset is created, an Avro schema must be provided. In that schema, the record
type is actually the table name. Kite then tries to verify the schema. The writerSchema is the
Avro schema, but the readerSchema comes from the entity class descended from SqoopRecord.

{code}
  // DataModelUtil.java

  public static <E> DatumReader<E> getDatumReaderForType(Class<E> type, Schema writerSchema) {
    Schema readerSchema = getReaderSchema(type, writerSchema);
    GenericData dataModel = getDataModelForType(type);
    // ...
  }
{code}

When exporting Parquet files back to an RDBMS, {{AvroIndexedRecordConverter}} instantiates a
class based on the avroSchema. If the record type name matches our entity class name, the
converter will pick up our entity class instead of creating a {{GenericData.Record}}.
{code}
  // AvroIndexedRecordConverter

  public AvroIndexedRecordConverter(ParentValueContainer parent, GroupType parquetSchema,
      Schema avroSchema) {
    this.specificClass = SpecificData.get().getClass(avroSchema);
    // ...
  }

  public void start() {
    // Should do the right thing whether it is generic or specific
    this.currentRecord = (T) ((this.specificClass == null) ?
            new GenericData.Record(avroSchema) :
            SpecificData.newInstance(specificClass, avroSchema));
  }
{code}
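
To make the collision concrete, here is a minimal plain-Java sketch of the same "specific class if found, otherwise generic" pattern. It does not use Avro or Parquet at all: {{ClassLookupSketch}}, {{Employees}}, {{findClass}} and {{newRecord}} are hypothetical names made up for illustration, and a {{HashMap}} stands in for the generic record.

```java
import java.util.HashMap;

/** Illustration only: a plain-Java stand-in for a name-based class lookup. */
final class ClassLookupSketch {

    /** Hypothetical generated entity class, named after a table called "Employees". */
    static class Employees { }

    /** Returns the class with the given name, or null when no such class can be loaded. */
    static Class<?> findClass(String recordTypeName) {
        try {
            return Class.forName(recordTypeName);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    /** Creates an instance of the matching class if one exists, else a generic map. */
    static Object newRecord(String recordTypeName) {
        Class<?> specificClass = findClass(recordTypeName);
        if (specificClass == null) {
            // No class with that name: fall back to the generic representation.
            return new HashMap<String, Object>();
        }
        try {
            // A class with that name exists, so it is instantiated whether or not
            // the caller intended to get that particular class.
            return specificClass.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + recordTypeName, e);
        }
    }

    public static void main(String[] args) {
        // The record type name happens to equal the name of an existing class.
        System.out.println(newRecord(Employees.class.getName()).getClass().getSimpleName());
        // No class has this name, so the generic fallback is used.
        System.out.println(newRecord("no.such.RecordType").getClass().getSimpleName());
    }
}
```

The point of the sketch is only that the outcome depends on whether some class with the same name is loadable, which is why a table-derived class name can be picked up unintentionally.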


> Use random generated class name for SqoopRecord
> -----------------------------------------------
>
>                 Key: SQOOP-1395
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1395
>             Project: Sqoop
>          Issue Type: Sub-task
>          Components: tools
>            Reporter: Qian Xu
>            Assignee: Qian Xu
>            Priority: Minor
>
> Sqoop generates an entity class to hold the values of each database record for MapReduce.
> The class extends the abstract class SqoopRecord. By default, the name of the class is
> the table name.
> When exporting records as Parquet files, the internal logic attempts to instantiate
> another entity class, or to create it on demand. Unfortunately, the target class has the
> same name as the one Sqoop generated.
> This JIRA proposes using a randomly generated class name to avoid the potential problem.
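
One possible way to produce such a name is sketched below. This is not taken from the Sqoop code base: the {{SqoopRecord_}} prefix, the {{RandomClassNameSketch}} class and its methods are assumptions made up for illustration. The only idea shown is to append a random UUID to a fixed prefix so that the result is a valid Java identifier that is very unlikely to equal a table name.

```java
import java.util.UUID;

/** Illustration only: one way to build a random, valid Java class name. */
final class RandomClassNameSketch {

    /** Hypothetical prefix; starting with a letter keeps the name a valid identifier. */
    static final String PREFIX = "SqoopRecord_";

    /** Returns the prefix followed by the 32 hex digits of a random UUID. */
    static String randomClassName() {
        // UUID.toString() contains hyphens, which are not allowed in identifiers.
        return PREFIX + UUID.randomUUID().toString().replace("-", "");
    }

    /** Checks that every character is legal in a Java identifier. */
    static boolean isValidIdentifier(String name) {
        if (name.isEmpty() || !Character.isJavaIdentifierStart(name.charAt(0))) {
            return false;
        }
        for (int i = 1; i < name.length(); i++) {
            if (!Character.isJavaIdentifierPart(name.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String name = randomClassName();
        System.out.println(name + " valid=" + isValidIdentifier(name));
    }
}
```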



