spark-user mailing list archives

From Ewan Leith
Subject spark-csv package - output to filename.csv?
Date Thu, 03 Sep 2015 15:04:39 GMT
Using the spark-csv package or outputting to text files, you end up with files named:


rather than a more user-friendly "test.csv", even if there's only 1 part file.

We can merge the files using the Hadoop merge command with something like this code from

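import java.net.URI
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}
import org.apache.spark.SparkContext

// Merge every part file under srcPath into the single file dstPath.
def merge(sc: SparkContext, srcPath: String, dstPath: String): Unit = {
    val srcFileSystem = FileSystem.get(new URI(srcPath), sc.hadoopConfiguration)
    val dstFileSystem = FileSystem.get(new URI(dstPath), sc.hadoopConfiguration)
    // Remove any existing destination before merging.
    dstFileSystem.delete(new Path(dstPath), true)
    // deleteSource = true removes srcPath afterwards; the final null means
    // no extra string is appended after each merged file.
    FileUtil.copyMerge(srcFileSystem, new Path(srcPath), dstFileSystem, new Path(dstPath),
        true, sc.hadoopConfiguration, null)
}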


but does anyone know a way without dropping down to Hadoop.fs code?
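
For context, here is a minimal sketch of the whole flow being described: write a DataFrame with spark-csv into a temporary directory, then merge the part files into one "test.csv". It assumes Spark 1.4 or later with the spark-csv package on the classpath, and Hadoop 2.x, where FileUtil.copyMerge is available. The object name, the sample data and the "test.csv.tmp" directory are hypothetical, chosen only for illustration.

```scala
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Hypothetical driver object, for illustration only.
object CsvToSingleFile {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("csv-merge-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Small sample DataFrame (made-up data).
    val df = sqlContext.createDataFrame(Seq((1, "a"), (2, "b"))).toDF("id", "label")

    val tmpDir = "test.csv.tmp" // hypothetical temporary output directory
    val target = "test.csv"     // the single file we actually want

    // spark-csv writes a directory of part files, even after coalesce(1).
    df.coalesce(1)
      .write
      .format("com.databricks.spark.csv")
      .option("header", "true")
      .mode("overwrite")
      .save(tmpDir)

    // Merge the part files into one file and delete the temporary directory.
    val fs = FileSystem.get(sc.hadoopConfiguration)
    fs.delete(new Path(target), true)
    FileUtil.copyMerge(fs, new Path(tmpDir), fs, new Path(target),
      true, sc.hadoopConfiguration, null)

    sc.stop()
  }
}
```

With more than one part file and the header option enabled, each part carries its own header row, so the merged file would repeat it. The coalesce(1) call in the sketch avoids that at the cost of writing through a single task.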

