drill-dev mailing list archives

From Julien Le Dem <jul...@dremio.com>
Subject Re: How to get started with a new format conversion and representation
Date Fri, 28 Aug 2015 17:45:21 GMT
Let me know how it goes.

On Fri, Aug 28, 2015 at 10:44 AM, Julien Le Dem <julien@dremio.com> wrote:

> Hi Edmon,
> I would start by picking one of Avro, Thrift, or Protobuf to describe a
> schema for this data:
> http://avro.apache.org/docs/current/#schemas
> https://developers.google.com/protocol-buffers/
> http://thrift.apache.org/docs/idl
>
> From there you can write to Parquet using the appropriate integration:
>
> https://github.com/apache/parquet-mr/blob/master/parquet-avro/src/test/java/org/apache/parquet/avro/TestSpecificReadWrite.java
>
> https://github.com/apache/parquet-mr/blob/master/parquet-protobuf/src/test/java/org/apache/parquet/proto/ProtoInputOutputFormatTest.java
>
> https://github.com/apache/parquet-mr/blob/master/parquet-thrift/src/test/java/org/apache/parquet/hadoop/thrift/TestInputOutputFormat.java
>
> Julien
>
> On Thu, Aug 27, 2015 at 7:23 PM, Edmon Begoli <ebegoli@gmail.com> wrote:
>
>> This might be more of a question for Parquet folks here than Drill-ers,
>> but nevertheless:
>>
>> I would like to convert EDI HL7 v.2 messages into a Parquet
>> representation and make them amenable to Drill querying.
>> (Here is a sample claim message 837p in HL7 representation (page 8):
>> http://www.vitahealth.org/Modules/ShowDocument2.aspx?documentid=545 )
>>
>> This is a lengthy topic which I could discuss in detail, but for now I
>> would just like to know where and how to get started.
>>
>> Thank you,
>> Edmon
>>
>
>
>
> --
> Julien
>



-- 
Julien
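
To make the first step suggested above (describe a schema, then write through an integration) more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration and not something from the thread: the `Hl7Segment` schema, the `parse_message` helper, and the sample message are all hypothetical. It relies only on the standard HL7 v2 delimiters (carriage return between segments, `|` between fields, `^` between components) and ignores repetitions, subcomponents, and escape sequences.

```python
import json

# Hypothetical Avro schema (an assumption, not from the thread):
# one record per HL7 v2 segment, each field held as a list of components.
SEGMENT_SCHEMA = {
    "type": "record",
    "name": "Hl7Segment",
    "namespace": "example.hl7",
    "fields": [
        {"name": "segment_id", "type": "string"},
        {
            "name": "fields",
            "type": {
                "type": "array",
                "items": {"type": "array", "items": "string"},
            },
        },
    ],
}


def parse_message(raw, field_sep="|", component_sep="^"):
    """Split an HL7 v2 message into dicts shaped like SEGMENT_SCHEMA."""
    records = []
    # Segments end with a carriage return; tolerate plain newlines as well.
    normalized = raw.replace("\r\n", "\r").replace("\n", "\r")
    for line in normalized.split("\r"):
        if not line.strip():
            continue  # skip empty trailing segments
        segment_id, *fields = line.split(field_sep)
        parsed = []
        for index, field in enumerate(fields):
            if segment_id == "MSH" and index == 0:
                # The first MSH field lists the encoding characters
                # themselves, so keep it verbatim instead of splitting it.
                parsed.append([field])
            else:
                parsed.append(field.split(component_sep))
        records.append({"segment_id": segment_id, "fields": parsed})
    return records


# Made-up sample message, for illustration only.
SAMPLE = "MSH|^~\\&|SENDER|FAC\rPID|1||12345^^^MRN||DOE^JOHN\r"

records = parse_message(SAMPLE)
schema_json = json.dumps(SEGMENT_SCHEMA, indent=2)
```

The `schema_json` string could be saved as an `.avsc` file, and records of this shape could then be handed to an Avro-based Parquet writer such as the parquet-avro integration linked in the quoted reply. A generic "segment plus fields" layout is only one possible design; a schema with named, typed fields per segment type would likely be easier to query from Drill, at the cost of more modeling work up front.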
