hadoop-mapreduce-user mailing list archives

From Stanley Shi <s...@pivotal.io>
Subject Re: Appending to HDFS file
Date Thu, 28 Aug 2014 06:18:16 GMT
You should not use this method:
FSDataOutputStream fp = fs.create(pt, true)

Here's the Javadoc for this "create" method:

  /**
   * Create an FSDataOutputStream at the indicated Path.
   * @param f the file to create
   * @param overwrite if a file with this name already exists, then if true,
   *   the file will be overwritten, and if false an exception will be thrown.
   */
  public FSDataOutputStream create(Path f, boolean overwrite)
      throws IOException {
    return create(f, overwrite,
                  getConf().getInt("io.file.buffer.size", 4096),
                  getDefaultReplication(f),
                  getDefaultBlockSize(f));
  }
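
Because overwrite is true, each `fs.create(pt, true)` in the loop truncates the
file, so only the last write survives. The fix on HDFS is `fs.append(pt)` (with
`dfs.support.append` enabled). As a minimal local-filesystem sketch of the same
overwrite-vs-append semantics, using plain `java.io` rather than the HDFS API
(the class and method names here are illustrative, not from the thread):

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;

public class AppendDemo {
    // Open the file n times, write one line per open, and return how many
    // lines the file holds afterwards. With append=false each open truncates,
    // mirroring fs.create(pt, true); with append=true every line is kept.
    static int writeLines(File f, boolean append, int n) throws IOException {
        for (int i = 0; i < n; i++) {
            try (FileWriter w = new FileWriter(f, append)) {
                w.write("line " + i + "\n");
            }
        }
        return Files.readAllLines(f.toPath()).size();
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("append-demo", ".txt");
        System.out.println(writeLines(f, false, 3)); // overwrite: prints 1
        System.out.println(writeLines(f, true, 3));  // append: prints 4
        f.delete();
    }
}
```

On HDFS the analogous change is replacing `fs.create(pt, true)` with
`fs.append(pt)` inside the loop, or better still, opening the output stream
once before the loop and closing it after.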


On Wed, Aug 27, 2014 at 2:12 PM, rab ra <rabmdu@gmail.com> wrote:

>
> hello
>
> Here is the code snippet I use to append:
>
> def outFile = "${outputFile}.txt"
>
> Path pt = new Path("${hdfsName}/${dir}/${outFile}")
>
> def fs = org.apache.hadoop.fs.FileSystem.get(configuration);
>
> FSDataOutputStream fp = fs.create(pt, true)
>
> fp << "${key} ${value}\n"
> On 27 Aug 2014 09:46, "Stanley Shi" <sshi@pivotal.io> wrote:
>
>> Would you please paste the code in the loop?
>>
>>
>> On Sat, Aug 23, 2014 at 2:47 PM, rab ra <rabmdu@gmail.com> wrote:
>>
>>> Hi
>>>
>>> By default, it is true in Hadoop 2.4.1. Nevertheless, I have set it to
>>> true explicitly in hdfs-site.xml. Still, I am not able to achieve append.
>>>
>>> Regards
>>> On 23 Aug 2014 11:20, "Jagat Singh" <jagatsingh@gmail.com> wrote:
>>>
>>>> What is the value of dfs.support.append in hdfs-site.xml?
>>>>
>>>>
>>>> https://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
>>>>
>>>>
>>>>
>>>>
>>>> On Sat, Aug 23, 2014 at 1:41 AM, rab ra <rabmdu@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am currently using Hadoop 2.4.1. I am running a MR job using the
>>>>> hadoop streaming utility.
>>>>>
>>>>> The executable needs to write a large amount of information to a file.
>>>>> However, this write is not done in a single attempt; the file needs to be
>>>>> appended with streams of information as they are generated.
>>>>>
>>>>> In the code, inside a loop, I open a file in HDFS and append some
>>>>> information. This is not working and I see only the last write.
>>>>>
>>>>> How do I accomplish an append operation in Hadoop? Can anyone share a
>>>>> pointer?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> regards
>>>>> Bala
>>>>>
>>>>
>>>>
>>
>>
>> --
>> Regards,
>> *Stanley Shi,*
>>
>>


-- 
Regards,
*Stanley Shi,*
