Date: Fri, 17 Feb 2012 21:36:57 +0000 (UTC)
From: "Doug Cutting (Commented) (JIRA)"
To: dev@avro.apache.org
Reply-To: dev@avro.apache.org
Message-ID: <2094529290.52219.1329514617232.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (AVRO-672) Convert JSON Text Input to Avro Tool

    [ https://issues.apache.org/jira/browse/AVRO-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210586#comment-13210586 ]

Doug Cutting commented on AVRO-672:
-----------------------------------

Leith,
is the tool that Ron provided here the one you need?  If so, then we can probably resuscitate this patch and get it committed.  If not, is there a specific tool you need (e.g., CSV or TSV)?

Thanks!

> Convert JSON Text Input to Avro Tool
> ------------------------------------
>
>                 Key: AVRO-672
>                 URL: https://issues.apache.org/jira/browse/AVRO-672
>             Project: Avro
>          Issue Type: New Feature
>          Components: java
>            Reporter: Ron Bodkin
>         Attachments: AVRO-672.patch, AVRO-672.patch
>
>
> The attached patch allows reading a JSON-formatted text file in, converting to a conforming Avro text file, emitting one record per line, e.g., it can read this input file:
> {"intval":12}
> {"intval":-73,"strval":"hello, there!!"}
> with this schema:
> { "type":"record", "name":"TestRecord", "fields": [ {"name":"intval","type":"int"}, {"name":"strval","type":["string", "null"]}]}
> returning valid Avro. This is different than the DataFileWriteTool, which would read in the following internal encoding:
> {"intval":12,"strval":null}
> {"intval":-73,"strval":{"string":"hello, there!!"}}
> In general, the internal encodings used by Avro aren't natural when reading in JSON text that appears in the wild. Likewise, this utility allows changing invalid Avro identifier characters into an underscore, again to tolerate JSON that wasn't designed to be readable by Avro.
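
For illustration only, the difference between the "natural" JSON and Avro's internal JSON encoding described in the quoted issue can be sketched in a few lines of Python. This is not the code in AVRO-672.patch (which targets the java component) and it does not use the Avro library. The helper names `sanitize_name`, `matches` and `to_avro_json` are hypothetical, only primitive union branches are handled, and treating a missing field as null is an assumption taken from the example in the description.

```python
import json
import re

# Schema copied from the issue description.
SCHEMA = json.loads("""
{"type": "record", "name": "TestRecord", "fields": [
  {"name": "intval", "type": "int"},
  {"name": "strval", "type": ["string", "null"]}]}
""")


def sanitize_name(name):
    """Replace characters that are invalid in an Avro name with '_' (hypothetical helper)."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Avro names must start with a letter or underscore.
    return cleaned if re.match(r"[A-Za-z_]", cleaned) else "_" + cleaned


def matches(avro_type, value):
    """Return True if a parsed JSON value fits a primitive Avro type name."""
    if avro_type == "null":
        return value is None
    if avro_type == "boolean":
        return isinstance(value, bool)
    if avro_type in ("int", "long"):
        return isinstance(value, int) and not isinstance(value, bool)
    if avro_type in ("float", "double"):
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if avro_type in ("string", "bytes"):
        return isinstance(value, str)
    return False  # complex types are out of scope for this sketch


def to_avro_json(record, schema):
    """Convert one natural JSON record into Avro's JSON encoding for `schema`."""
    cleaned = {sanitize_name(key): value for key, value in record.items()}
    out = {}
    for field in schema["fields"]:
        name, ftype = field["name"], field["type"]
        value = cleaned.get(name)  # assumption: a missing field means null
        if isinstance(ftype, list):
            # Union: pick the first branch that fits, then wrap non-null values.
            branch = next((t for t in ftype if matches(t, value)), None)
            if branch is None:
                raise ValueError("no union branch of %r fits %r" % (name, value))
            out[name] = None if branch == "null" else {branch: value}
        elif matches(ftype, value):
            out[name] = value
        else:
            raise ValueError("field %r: %r is not a valid %s" % (name, value, ftype))
    return out


if __name__ == "__main__":
    for line in ('{"intval":12}', '{"intval":-73,"strval":"hello, there!!"}'):
        print(json.dumps(to_avro_json(json.loads(line), SCHEMA), separators=(",", ":")))
```

Run against the two input lines from the description, this prints the same two lines the description gives as the internal encoding. A real tool would still need to cover nested records, arrays, maps and ambiguous unions, which this sketch deliberately leaves out.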