camel-users mailing list archives

From Andrew Thorburn <nzi...@gmail.com>
Subject Re: Handling complex, multi-record, single-line pipe-separated text?
Date Wed, 20 Nov 2013 03:38:26 GMT
I have, though it appears they don't support it out of the box.
Unfortunate, but unsurprising.

Thanks,

- Andrew

On Thu, Nov 14, 2013 at 11:28 PM, Claus Ibsen <claus.ibsen@gmail.com> wrote:
> Hi
>
> Have you been in touch with the BeanIO project? Maybe this is something
> they would like to support out of the box in BeanIO.
>
> Otherwise you may need to write your own code to format the data accordingly.
>
>
>
> On Thu, Nov 14, 2013 at 3:05 AM, Andrew Thorburn <nzipsi@gmail.com> wrote:
>> First up, I'm currently building a proxy, effectively, for some
>> interfaces that I have to work with. On one side is a set of web
>> services that my application will be calling, so that I can
>> standardise on *something*. On the other side is a set of IBM MQ
>> queues that - mostly - require fixed-length records. So what I am
>> doing is sending a SOAP message to ServiceMix, transforming that into
>> a POJO, then transforming that POJO into a flat file via BeanIO, which
>> in turn gets sent out to MQ. That might seem a little inefficient, but
>> it beats writing thousands of lines of XSL to transform the XML into a
>> flat file.
>>
>> One format in particular, however, doesn't seem to be achievable with
>> any of the data formats available in Camel, as far as I can tell, but
>> I would appreciate some advice on this front. Bindy cannot - it isn't
>> nearly comprehensive enough. BeanIO *almost* can, if only the records
>> were on separate lines. But they're not - they're all on the same
>> line. Neither Flatpack nor Smooks seem to handle this format either,
>> from what I've read.
>>
>> The basic format is like a CSV file except with pipes " | " instead of
>> commas. However, despite being only a single line, there are multiple
>> records in that line, and several of the records have a variable
>> number of repetitions.
>>
>> Now, given that I only need to *generate* this format, not parse it, I
>> could probably generate it with BeanIO and do some sort of
>> post-processing on it to strip out the newlines or something similar
>> (the response is significantly simpler and contains no repeating
>> records - parsing that is not a problem). However, I would like to
>> know if there is anything out there which would support this format
>> properly, should I find it necessary to parse it in the future, and to
>> avoid hacking in something now which I will later regret.
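
The post-processing idea above could look roughly like the following
minimal sketch. It assumes Java 8 or later, that BeanIO has already
written one pipe-delimited record per line, and that no field contains a
line break. The class and method names (SingleLineJoiner, toSingleLine)
are hypothetical and are not part of BeanIO or Camel.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical helper: collapses line-per-record output into one line.
final class SingleLineJoiner {
    private SingleLineJoiner() {}

    /** Joins line-per-record text into a single pipe-delimited line. */
    static String toSingleLine(String multiLine) {
        return Arrays.stream(multiLine.split("\\R"))   // split on any line terminator
                .filter(line -> !line.isEmpty())        // ignore blank lines
                .collect(Collectors.joining("|"));      // the pipe replaces each line break
    }
}
```

Because the pipe is supplied by the join, each record line must not end
with an extra delimiter unless its last field really is empty. In a route
this would presumably run as a processing step after the BeanIO marshal
step; that wiring is not shown here.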
>>
>> For example, if we take the following sample (trimmed down
>> significantly for brevity):
>>
>> COM|US|CORP|FIELD1|DATE1|TIME1|DATE2|DATE3|TYPE|1|A|ABC123|DEF456||SURNAME|FIRSTNAME|MIDDLENAME|GENDER|DOB|ADDRESS1|ADDRESS2|ADDRESS3|ADDRESS4|ADDRESS5|A|ABC123|DEF456||SURNAME|FIRSTNAME|MIDDLENAME|GENDER|DOB|ADDRESS1|ADDRESS2|ADDRESS3|ADDRESS4|ADDRESS5|A|ABC123|DEF456||SURNAME|FIRSTNAME|MIDDLENAME|GENDER|DOB|ADDRESS1|ADDRESS2|ADDRESS3|ADDRESS4|ADDRESS5|S|1|STUFF||CODE||ESTATUS||G|ID|||SURNAME|FIRSTNAME|MIDDLENAME|GENDER|DOB|ADDRESS1|ADDRESS2|ADDRESS3|G|ID|||SURNAME|FIRSTNAME|MIDDLENAME|GENDER|DOB|ADDRESS1|ADDRESS2|ADDRESS3|G|ID|||SURNAME|FIRSTNAME|MIDDLENAME|GENDER|DOB|ADDRESS1|ADDRESS2|ADDRESS3|I|...|R|...|R|...|R|...|Y|...|Y|...|Y|...
>>
>> Breaking that down, the first set of columns,
>> COM|US|CORP|FIELD1|DATE1|TIME1|DATE2|DATE3|TYPE|1, is effectively a
>> header. This appears exactly once and is not an issue.
>>
>> The next set of columns,
>> A|ABC123|DEF456||SURNAME|FIRSTNAME|MIDDLENAME|GENDER|DOB|ADDRESS1|ADDRESS2|ADDRESS3|ADDRESS4|ADDRESS5,
>> represents a single record that can appear 1 to 99 times in the line.
>> This is where I start seeing problems. In my POJO, I would like to
>> represent this as a list of "A" records, and then have the data format
>> generate one record for each list item, but without adding a
>> line-break afterwards. If I were to parse this, it would need to know
>> that after the first record, the "COM" record, it should look at the
>> first character to see what the type of the record is - in this case
>> it is an "A" record, and there must be at least one record, and there
>> may be up to 99 records. In the example, I have repeated it three
>> times. Note that while BeanIO could, in theory, handle this as a
>> repeating segment, I have other repeating segments following this one.
>>
>> The next set of columns, S|1|STUFF||CODE||ESTATUS||, is repeated
>> exactly once, and the type of record is "S", identified by the first
>> character.
>>
>> The next set, G|ID|||SURNAME|FIRSTNAME|MIDDLENAME|GENDER|DOB|ADDRESS1|ADDRESS2|ADDRESS3,
>> is similar to the "A" record, but can appear 0 to 99 times. I have
>> included it three times in this example.
>>
>> The next set, I|..., is similar to "S" in that it only appears once.
>>
>> The next set, R|..., can appear 0 to 99 times.
>>
>> The next set, Y|..., can appear 0 to 99 times.
>>
>> There are other repeating segments too - that's just a small part of
>> the whole record.
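
A hand-written generator for the layout described above might be sketched
as follows. This is only an illustration under stated assumptions: each
record is modelled as a plain list of field strings whose first element is
the type code, empty fields are empty strings rather than null, and only
the COM, A, S and G records are covered because the I, R and Y layouts are
elided in the sample. SingleLineWriter and its methods are hypothetical
names, not an existing API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical writer for the header + A (1..99) + S (once) + G (0..99) part.
final class SingleLineWriter {
    private SingleLineWriter() {}

    /** Appends the records to 'out' after checking the repetition range. */
    static void append(List<String> out, String type,
                       List<List<String>> records, int min, int max) {
        if (records.size() < min || records.size() > max) {
            throw new IllegalArgumentException(type + " records: expected "
                    + min + ".." + max + ", got " + records.size());
        }
        for (List<String> fields : records) {
            if (fields.isEmpty() || !type.equals(fields.get(0))) {
                throw new IllegalArgumentException("Not a " + type + " record: " + fields);
            }
            out.addAll(fields);   // no line break is added between records
        }
    }

    /** Writes header, A, S and G records as one pipe-delimited line. */
    static String write(List<String> header, List<List<String>> a,
                        List<String> s, List<List<String>> g) {
        List<String> out = new ArrayList<>(header);
        append(out, "A", a, 1, 99);
        append(out, "S", Collections.singletonList(s), 1, 1);
        append(out, "G", g, 0, 99);
        return String.join("|", out);
    }
}
```

The repetition limits (1 to 99 for A, exactly one S, 0 to 99 for G) come
from the description above; the remaining segments would be appended the
same way once their layouts are known.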
>>
>> I hope this makes sense - it seems like this is a particularly unusual
>> record format to have to deal with, so it is perhaps unsurprising that
>> I can't find a tool that will handle it.
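
Should parsing ever become necessary, one possible approach is to split
the line on pipes and consume a fixed number of fields per record type,
using the first field of each record as the type code. The sketch below
assumes the field counts visible in the trimmed sample (10 for COM, 14 for
A, 8 for S, 12 for G); the real counts are longer, and the I, R and Y
layouts are elided above, so they are deliberately left out.
SingleLineParser is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical parser: fixed field count per record type.
final class SingleLineParser {
    // Field counts (type code included), read off the trimmed sample.
    // These are assumptions for illustration only.
    private static final Map<String, Integer> WIDTHS = new LinkedHashMap<>();
    static {
        WIDTHS.put("COM", 10);
        WIDTHS.put("A", 14);
        WIDTHS.put("S", 8);
        WIDTHS.put("G", 12);
    }

    private SingleLineParser() {}

    /** Splits one line into records; each record is its list of fields. */
    static List<List<String>> parse(String line) {
        // Limit -1 keeps trailing empty fields instead of dropping them.
        List<String> tokens = Arrays.asList(line.split("\\|", -1));
        List<List<String>> records = new ArrayList<>();
        int pos = 0;
        while (pos < tokens.size()) {
            String type = tokens.get(pos);
            Integer width = WIDTHS.get(type);
            if (width == null) {
                throw new IllegalArgumentException(
                        "Unknown record type '" + type + "' at field " + pos);
            }
            if (pos + width > tokens.size()) {
                throw new IllegalArgumentException(
                        "Truncated " + type + " record at field " + pos);
            }
            records.add(new ArrayList<>(tokens.subList(pos, pos + width)));
            pos += width;   // jump straight to the next record's type code
        }
        return records;
    }
}
```

Since the parser advances by a fixed width, a field value that happens to
equal a type code (for example a GENDER of "A") is not mistaken for the
start of a record. Checking record order and the 1 to 99 or 0 to 99
repetition limits would be a separate step and is not shown.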
>
>
>
> --
> Claus Ibsen
> -----------------
> Red Hat, Inc.
> Email: cibsen@redhat.com
> Twitter: davsclaus
> Blog: http://davsclaus.com
> Author of Camel in Action: http://www.manning.com/ibsen
