spark-issues mailing list archives

From "AIT OUFKIR (JIRA)" <j...@apache.org>
Subject [jira] [Created] (SPARK-23495) Creating a json file using a dataframe creates an issue
Date Fri, 23 Feb 2018 14:00:00 GMT
AIT OUFKIR created SPARK-23495:
----------------------------------

             Summary: Creating a json file using a dataframe creates an issue
                 Key: SPARK-23495
                 URL: https://issues.apache.org/jira/browse/SPARK-23495
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 2.1.0
            Reporter: AIT OUFKIR
             Fix For: 2.1.0


The issue occurs when writing a DataFrame to a JSON file (see the code below).

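# Sample values; sqlContext is the SQLContext provided by the pyspark shell.
catis = ["CAT1","CAT2"]
constis = ["CONST1","CONST2","CONST3"]
datis = ["DAT1","DATE2","DATE3"]
dictis = {'A':1, 'B':2}
dummis = ['dum1','dumm2','dumm3']
fifis = {'fifi1':1, 'fifi2':2, 'fifi3':3}
khikhis = ['khikhi1','khikhi12','khikhi3','khikhi4']

# Build one record from the values above and create a DataFrame from it.
metadata_dump = dict(cati=catis, consti=constis, dati=datis, dicti=dictis, khikhi=khikhis,
                     dummi=dummis, fifi=fifis)
md = sqlContext.createDataFrame([metadata_dump]).collect()
# Re-create the DataFrame from the collected rows with an explicit list of column names.
metadata = sqlContext.createDataFrame(md, ['cati', 'consti', 'dati', 'dicti', 'khikhi', 'dummi',
                                           'fifi'])

# Write the DataFrame out as JSON, replacing any existing output.
metadata_path = "/mypath"
metadata.write.mode('overwrite').json(metadata_path)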

This gives the following result:

{"cati":["CAT1","CAT2"]
,"consti":["CONST1","CONST2","CONST3"]
,"dati":["DAT1","DATE2","DATE3"]
,"dicti":{"A":1,"B":2}
,"khikhi":["dum1","dumm2","dumm3"]
,"dummi":{"fifi2":2,"fifi3":3,"fifi1":1}
,"fifi":["khikhi1","khikhi12","khikhi3","khikhi4"]}

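This is wrong: the values are swapped among the last three keys. "khikhi" holds the dummis list, "dummi" holds the fifis dict, and "fifi" holds the khikhis list.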

 

When I move the fifis dict so that it is no longer the last entry of the metadata_dump dict,
I get the correct result:

 {
"cati":["CAT1","CAT2"]
,"consti":["CONST1","CONST2","CONST3"]
,"dati":["DAT1","DATE2","DATE3"]
,"dicti":{"A":1,"B":2}
,"dummi":["dum1","dumm2","dumm3"]
,"fifi":{"fifi2":2,"fifi3":3,"fifi1":1}
,"khikhi":["khikhi1","khikhi12","khikhi3","khikhi4"]
}
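
One possible explanation for the two results above, offered as a sketch and not a confirmed diagnosis: the correct output lists its keys in alphabetical order, which suggests that the first createDataFrame call orders the fields by sorted key, and that the second call then applies the given list of names by position. The plain-Python model below needs no Spark. Both of those behaviours are assumptions about PySpark 2.1, and the names inferred_names, row_values, requested_names, mislabelled and relabelled are hypothetical.

```python
# Plain-Python model of the suspected behaviour; it does not call Spark.
metadata_dump = dict(
    cati=["CAT1", "CAT2"],
    consti=["CONST1", "CONST2", "CONST3"],
    dati=["DAT1", "DATE2", "DATE3"],
    dicti={"A": 1, "B": 2},
    khikhi=["khikhi1", "khikhi12", "khikhi3", "khikhi4"],
    dummi=["dum1", "dumm2", "dumm3"],
    fifi={"fifi1": 1, "fifi2": 2, "fifi3": 3},
)

# Assumption: schema inference on a dict orders the fields by sorted key,
# so the collected row carries its values in alphabetical key order.
inferred_names = sorted(metadata_dump)
row_values = [metadata_dump[name] for name in inferred_names]

# Assumption: a list of column names is applied to the row by position.
requested_names = ["cati", "consti", "dati", "dicti", "khikhi", "dummi", "fifi"]
mislabelled = dict(zip(requested_names, row_values))

# Reusing the inferred order keeps every value under its own key.
relabelled = dict(zip(inferred_names, row_values))

print(mislabelled["khikhi"])  # the dummis list, as in the wrong output
print(relabelled == metadata_dump)
```

If this model matches what Spark does, passing the column names in alphabetical order, or omitting the name list in the second call, would avoid the mismatch.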

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


