From issues-return-179698-archive-asf-public=cust-asf.ponee.io@hive.apache.org Thu Feb 20 18:44:02 2020
Mailing-List: contact issues-help@hive.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@hive.apache.org
Delivered-To: mailing list issues@hive.apache.org
Date: Thu, 20 Feb 2020 18:44:00 +0000 (UTC)
From: "ASF GitHub Bot (Jira)"
To: issues@hive.apache.org
Subject: [jira] [Work logged] (HIVE-21218) KafkaSerDe doesn't support topics created via Confluent Avro serializer
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

     [ https://issues.apache.org/jira/browse/HIVE-21218?focusedWorklogId=390164&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-390164 ]

ASF GitHub Bot logged work on HIVE-21218:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Feb/20 18:43
            Start Date: 20/Feb/20 18:43
    Worklog Time Spent: 10m
      Work Description: cricket007 commented on pull request #526: HIVE-21218: KafkaSerDe doesn't support topics created via Confluent
URL: https://github.com/apache/hive/pull/526#discussion_r255559248

 ##########
 File path: kafka-handler/src/java/org/apache/hadoop/hive/kafka/KafkaSerDe.java
 ##########
 @@ -133,12 +134,24 @@
        Preconditions.checkArgument(!schemaFromProperty.isEmpty(), "Avro Schema is empty Can not go further");
        Schema schema = AvroSerdeUtils.getSchemaFor(schemaFromProperty);
        LOG.debug("Building Avro Reader with schema {}", schemaFromProperty);
 -      bytesConverter = new AvroBytesConverter(schema);
 +      bytesConverter = getByteConverterForAvroDelegate(schema, tbl);
      } else {
        bytesConverter = new BytesWritableConverter();
      }
    }

 +  BytesConverter getByteConverterForAvroDelegate(Schema schema, Properties tbl) {
 +    String avroByteConverterType = tbl.getProperty(AvroSerdeUtils.AvroTableProperties.AVRO_SERDE_TYPE
 +                                                                         .getPropName(), "none");
 +    int avroSkipBytes = Integer.getInteger(tbl.getProperty(AvroSerdeUtils.AvroTableProperties.AVRO_SERDE_SKIP_BYTES
 +                                                                         .getPropName(), "5"));
 +    switch ( avroByteConverterType ) {
 +      case "confluent" : return new AvroSkipBytesConverter(schema, 5);
 +      case "skip" : return new AvroSkipBytesConverter(schema, avroSkipBytes);
 +      default : return new AvroBytesConverter(schema);

 Review comment:
   Would it be better if this were an enum rather than a string comparison?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
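
For illustration only, the enum suggested in the review comment might look roughly like the sketch below. It is not the code from the pull request. AvroConverterType, ConverterSelection, bytesToSkip and the two property keys are hypothetical names chosen for this sketch; the real keys come from AvroSerdeUtils.AvroTableProperties, whose string values are not shown in the diff. To stay self-contained, the sketch returns the number of bytes to skip instead of constructing the Hive converter classes. It also parses the skip count with Integer.parseInt, because Integer.getInteger(String) looks up a JVM system property with that name rather than parsing the string.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical enum standing in for the string switch shown in the diff. */
enum AvroConverterType {
  NONE, CONFLUENT, SKIP;

  /** Maps a table-property value to a constant; missing or unknown values fall back to NONE. */
  static AvroConverterType fromProperty(String value) {
    if (value == null) {
      return NONE;
    }
    try {
      return valueOf(value.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      return NONE;
    }
  }
}

final class ConverterSelection {
  // Hypothetical property keys, assumed for this sketch only.
  static final String TYPE_KEY = "avro.serde.type";
  static final String SKIP_KEY = "avro.serde.skip.bytes";
  // The diff hard-codes 5 bytes for the "confluent" case.
  static final int CONFLUENT_HEADER_BYTES = 5;

  /** Returns how many leading bytes would be skipped before Avro decoding. */
  static int bytesToSkip(Properties tbl) {
    switch (AvroConverterType.fromProperty(tbl.getProperty(TYPE_KEY))) {
      case CONFLUENT:
        return CONFLUENT_HEADER_BYTES;
      case SKIP:
        // parseInt parses the property value itself; the default "5" mirrors the diff.
        return Integer.parseInt(tbl.getProperty(SKIP_KEY, "5").trim());
      default:
        // Corresponds to the plain AvroBytesConverter branch: nothing is skipped.
        return 0;
    }
  }

  public static void main(String[] args) {
    Properties tbl = new Properties();
    tbl.setProperty(TYPE_KEY, "skip");
    tbl.setProperty(SKIP_KEY, "7");
    System.out.println(bytesToSkip(tbl));
  }
}
```

One design question the sketch leaves open is whether an unknown type should fall back silently, as the default branch in the diff does, or be rejected with an error.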

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 390164)
    Time Spent: 4h  (was: 3h 50m)

> KafkaSerDe doesn't support topics created via Confluent Avro serializer
> -----------------------------------------------------------------------
>
>                 Key: HIVE-21218
>                 URL: https://issues.apache.org/jira/browse/HIVE-21218
>             Project: Hive
>          Issue Type: Bug
>          Components: kafka integration, Serializers/Deserializers
>    Affects Versions: 3.1.1
>            Reporter: Milan Baran
>            Assignee: Milan Baran
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-21218.2.patch, HIVE-21218.patch
>
>          Time Spent: 4h
>  Remaining Estimate: 0h
>
> According to [Google groups|https://groups.google.com/forum/#!topic/confluent-platform/JYhlXN0u9_A] the Confluent Avro serializer uses a proprietary format for the Kafka value: <magic byte><4 bytes of schema ID><Avro-encoded payload>.
> This format causes no problem for the Confluent Kafka deserializer, which respects the format. For the Hive Kafka handler, however, it is a problem to deserialize the Kafka value correctly, because Hive uses a custom deserializer from bytes to objects and ignores the Kafka consumer ser/deser classes provided via table property.
> It would be nice to support the Confluent format with the magic byte.
> It would also be great to support Schema Registry.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
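
As a rough illustration of the framing described in the issue, the sketch below splits a Kafka value into its 5-byte header and the remaining Avro bytes. It assumes the layout documented for the Confluent wire format: a magic byte of 0x0, then a 4-byte big-endian schema ID, then the Avro payload. ConfluentFrame and its members are hypothetical names for this sketch and are not part of Hive or of the patch. Decoding the payload against a schema is left out.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical holder for a Kafka value split into schema ID and Avro payload. */
final class ConfluentFrame {
  // Assumed per the Confluent wire format: magic byte 0x0, then a 4-byte schema ID.
  static final byte MAGIC_BYTE = 0x0;
  static final int HEADER_LENGTH = 5;

  final int schemaId;
  final byte[] avroPayload;

  private ConfluentFrame(int schemaId, byte[] avroPayload) {
    this.schemaId = schemaId;
    this.avroPayload = avroPayload;
  }

  /** Validates the header and returns the schema ID plus the bytes that follow it. */
  static ConfluentFrame parse(byte[] value) {
    if (value == null || value.length < HEADER_LENGTH) {
      throw new IllegalArgumentException("Value is shorter than the 5-byte header");
    }
    // ByteBuffer reads multi-byte values in big-endian order by default.
    ByteBuffer buffer = ByteBuffer.wrap(value);
    byte magic = buffer.get();
    if (magic != MAGIC_BYTE) {
      throw new IllegalArgumentException("Unexpected magic byte: " + magic);
    }
    int schemaId = buffer.getInt();
    byte[] payload = Arrays.copyOfRange(value, HEADER_LENGTH, value.length);
    return new ConfluentFrame(schemaId, payload);
  }

  public static void main(String[] args) {
    // Magic byte 0, schema ID 42, followed by two payload bytes.
    byte[] value = {0, 0, 0, 0, 42, 2, 4};
    ConfluentFrame frame = parse(value);
    System.out.println("schemaId=" + frame.schemaId + " payloadBytes=" + frame.avroPayload.length);
  }
}
```

Skipping a fixed 5 bytes, as the "confluent" case in the patch does, discards the schema ID. Keeping it, as in this sketch, is what a later Schema Registry lookup would need.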