crunch-dev mailing list archives

From "Champion,Mac" <Mac.Champ...@Cerner.com>
Subject Enhancement to CSV input format?
Date Tue, 05 May 2015 14:30:17 GMT
Some users of the CSV Input Format at Cerner ran into problems with client CSV files that
contained stray, unescaped double quotes inside fields (ostensibly representing inches).
Some bureaucratic hurdles kept us from getting those files reliably cleaned up, so we brainstormed
a way to make the CSV Input Format ignore the stray quotes and pass them forward to be handled
by whatever parsing step comes later. We are implementing this in our copy of the input format,
and it seems to be working so far.
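
As a rough illustration of the kind of leniency described above, here is a minimal Java sketch.
It is not the actual Cerner change and not Crunch's CSVInputFormat code; the class name
LenientCsvSplitter, the method split, and the rules for deciding which quotes are "stray" are all
assumptions made for this example. The assumed rules: a quote opens a quoted field only at the
start of a field, a quote closes it only when followed by a comma or the end of the line, and any
other quote is kept as literal text for a later parsing step to deal with.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Apache Crunch.
final class LenientCsvSplitter {

    // Splits one line on commas, keeping stray double quotes as literal text.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        boolean atFieldStart = true;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    boolean atEnd = i + 1 == line.length();
                    char next = atEnd ? '\0' : line.charAt(i + 1);
                    if (!atEnd && next == '"') {
                        field.append('"');   // "" is an escaped quote
                        i++;
                    } else if (atEnd || next == ',') {
                        inQuotes = false;    // genuine closing quote
                    } else {
                        field.append('"');   // assumed stray quote: keep it
                    }
                } else {
                    field.append(c);         // commas inside quotes are data
                }
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
                atFieldStart = true;
                continue;
            } else if (c == '"' && atFieldStart) {
                inQuotes = true;             // opening quote of a quoted field
            } else {
                field.append(c);             // includes stray quotes mid-field
            }
            atFieldStart = false;
        }
        fields.add(field.toString());
        return fields;
    }
}
```

Under these assumed rules, the line `5" pipe,"a,b",x` splits into the three fields `5" pipe`,
`a,b`, and `x`, and `"12" monitor, black",ok` splits into `12" monitor, black` and `ok`. The
heuristic is ambiguous by nature: a stray quote that happens to be followed by a comma inside a
quoted field would still be read as a closing quote.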

My question is: should we log a JIRA for this and submit our work to Crunch as well? It’s
handy in our case, but the files are truly malformed and do not follow the CSV standards.
Should the CSVInputFormat have configurable options for handling malformed files and passing
bad records forward, or is the current behavior (blow up and give some info about where the
bad records start) the way it should behave?

Thanks for your input,
Mac
