hive-user mailing list archives

From "Subramanian, Sanjay (HQP)" <sanjay.subraman...@roberthalf.com>
Subject Re: Load CSV files with embedded map and arrays to Hive
Date Fri, 22 Aug 2014 20:58:09 GMT
Hey Sushant

I looked at the CSV file. The rows are not JSON.

From my limited understanding of what you are looking for, my suggestion would be to
have the input as rows of JSON instead.

For example, the first row of your data set would become:

{"employee":{"name":"John Doe","salary":100000,"subordinates":[{"name":"Mary Smith"},{"name":"Todd Jones"}],"deductions":[{"type":"Federal Taxes","amount":0.2},{"type":"State Taxes","amount":0.05},{"type":"Insurance","amount":0.1}],"address":{"street":"1 Michigan Ave.","city":"Chicago","state":"IL","zip":60600}}}
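
A minimal Python sketch of producing such a row. It assumes the record has already been parsed into a nested dict; the layout of the original CSV is not shown in this thread, so the parsing step is left out. The helper name to_json_row is hypothetical. The point is only that each record ends up as compact JSON on a single line, since Hive's text input is read line by line.

```python
import json

# Hypothetical in-memory record; in practice it would be built from a parsed CSV row.
employee = {
    "employee": {
        "name": "John Doe",
        "salary": 100000,
        "subordinates": [{"name": "Mary Smith"}, {"name": "Todd Jones"}],
        "deductions": [
            {"type": "Federal Taxes", "amount": 0.2},
            {"type": "State Taxes", "amount": 0.05},
            {"type": "Insurance", "amount": 0.1},
        ],
        "address": {
            "street": "1 Michigan Ave.",
            "city": "Chicago",
            "state": "IL",
            "zip": 60600,
        },
    }
}


def to_json_row(record):
    """Serialize one record as compact JSON on a single line."""
    return json.dumps(record, separators=(",", ":"))


row = to_json_row(employee)
print(row)
```

Writing one such row per record, each terminated by a newline, gives a file in the "rows of JSON" shape described above.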


If you are running Ubuntu or a Mac,
you can pretty-print the JSON with the following command:

cat your_json_file   | python -m json.tool | less
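
By default, json.tool reads its input as a single JSON document, so it may reject a file that holds one JSON object per line. A small sketch of a per-line alternative follows; the function name pretty_rows and the sample rows are made up for illustration.

```python
import json


def pretty_rows(lines):
    """Yield an indented rendering of every non-blank JSON row."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.dumps(json.loads(line), indent=2)


# Hypothetical sample input: two JSON rows separated by a blank line.
sample = ['{"name":"John Doe","salary":100000}', "", '{"name":"Mary Smith"}']
blocks = list(pretty_rows(sample))
print("\n".join(blocks))
```

For a real file, an open file object can be passed in place of the sample list, since iterating over it yields one line at a time.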

Once you have the data in this format, I would recommend using the JSON SerDe to define the Hive table:
https://github.com/rcongiu/Hive-JSON-Serde

You can see some JSON SerDe examples here:
http://bigdatalatte.wordpress.com/2014/08/21/denormalizing-json-arrays-in-hive/
http://thornydev.blogspot.com/2013/07/querying-json-records-via-hive.html

The example row, pretty-printed:

{
  "employee": {
    "name": "John Doe",
    "salary": 100000,
    "subordinates": [
      {
        "name": "Mary Smith"
      },
      {
        "name": "Todd Jones"
      }
    ],
    "deductions": [
      {
        "type": "Federal Taxes",
        "amount": 0.2
      },
      {
        "type": "State Taxes",
        "amount": 0.05
      },
      {
        "type": "Insurance",
        "amount": 0.1
      }
    ],
    "address": {
      "street": "1 Michigan Ave.",
      "city": "Chicago",
      "state": "IL",
      "zip": 60600
    }
  }
}


Thanks

Warm Regards

sanjay


From: Nitin Pawar <nitinpawar432@gmail.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Wednesday, August 20, 2014 at 11:34 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: Re: Load CSV files with embedded map and arrays to Hive

Hey sorry .. got stuck with work.
I will take a look today


On Wed, Aug 20, 2014 at 5:43 PM, Sushant Prusty <sushant.p@gmx.com> wrote:
Hi Nitin,
I hope you have received the dataset. If you have any further requirements, please feel free
to contact me. I will appreciate your help.

Regards,
Sushant
On Tuesday 19 August 2014 02:33 PM, Nitin Pawar wrote:
Can you give an example of your dataset?


On Tue, Aug 19, 2014 at 2:31 PM, Sushant Prusty <sushant.p@gmx.com> wrote:
Please let me know how I can load a CSV file with embedded map and array data into Hive.

Regards,
Sushant



--
Nitin Pawar


--
Warm regards,

Sushant Prusty



--
Nitin Pawar
