hawq-dev mailing list archives

From tzolov <...@git.apache.org>
Subject [GitHub] incubator-hawq pull request: HAWQ-178: Add JSON plugin support in ...
Date Mon, 01 Feb 2016 10:23:13 GMT
Github user tzolov commented on the pull request:

    https://github.com/apache/incubator-hawq/pull/302#issuecomment-177895864
  
    @hornn, @GodenYao
    The `pxf-json` code was implemented by @adamjshook. In this PR I have merely ported it to
the HAWQ PXF project structure. My plan was to port the existing code first and then improve it
if needed.
    But the excellent comments above made me review the code, and I found a significant issue
with the JsonRecordReader, namely its support for multiline JSON objects (also called Pretty Print,
or PP). The current implementation will not work when a JSON document spans multiple HDFS splits!

    So I will remove the multiline-JSON code from the PR, leaving in only the LineRecordReader
version (i.e. assuming one JSON object per line).
    I will also open a discussion on the dev mailing list about how PXF should handle documents
that span multiple splits.
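
To illustrate why the one-object-per-line case is safe across splits while the multiline case is not, here is a rough sketch in Python. It is not the `pxf-json` code and not the Hadoop API: it simulates, over an in-memory byte string, the boundary convention that line-oriented readers such as Hadoop's LineRecordReader follow (a reader that does not start at offset 0 discards its first line, and every reader finishes any line that begins at or before its end offset). The names `read_split` and `is_json` are hypothetical.

```python
import json


def read_split(data: bytes, start: int, end: int) -> list[bytes]:
    """Return the newline-delimited records owned by the split [start, end)."""
    pos = start
    if start != 0:
        # Discard the first (possibly partial) line: the reader of the
        # previous split reads past its own end to finish that line.
        newline = data.find(b"\n", start)
        if newline == -1:
            return []
        pos = newline + 1

    records = []
    # A split owns every line that begins at or before its end offset,
    # even if the line's bytes run into the next split.
    while pos <= end and pos < len(data):
        newline = data.find(b"\n", pos)
        stop = len(data) if newline == -1 else newline
        records.append(data[pos:stop])
        pos = stop + 1
    return records


def is_json(record: bytes) -> bool:
    """True if the record parses as a complete JSON value on its own."""
    try:
        json.loads(record)
        return True
    except ValueError:
        return False


# One JSON object per line: any cut point yields each object exactly once.
one_per_line = b'{"a":1}\n{"b":2}\n{"c":3}\n'
first = read_split(one_per_line, 0, 10)
second = read_split(one_per_line, 10, len(one_per_line))

# Pretty-printed JSON: the records are line fragments, not documents.
pretty = b'{\n  "a": 1\n}\n'
fragments = read_split(pretty, 0, len(pretty))
```

With the single-line input, `first` and `second` together contain every object once, and each record parses on its own. With the pretty-printed input, none of the records in `fragments` is valid JSON by itself, so a reader for that format needs its own rule for locating document boundaries near a split edge.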



