hadoop-mapreduce-user mailing list archives

From Azuryy Yu <azury...@gmail.com>
Subject Re: Missing records from HDFS
Date Fri, 22 Nov 2013 11:19:33 GMT
I do think this is caused by your RecordReader. Can you paste your code
here, along with a small sample of the data?

Feel free to use pastebin if you prefer.
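
For reference, one common way a custom RecordReader can lose records is at split boundaries. Line-oriented readers such as Hadoop's LineRecordReader follow a convention: a reader that starts at a non-zero offset discards its first (possibly partial) line, and every reader keeps going until it has consumed the line that begins at or before its split end. A reader that does the first half but stops at the split end drops one record per boundary. The sketch below is a plain-Java simulation of that idea only; it is not Hadoop code and not the reader from this thread, and the class and method names (SplitBoundaryDemo, countRecords, countRecordsNaive) are hypothetical. Whether this is what happens in the job above is not confirmed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone simulation of split handling; no Hadoop classes are used.
final class SplitBoundaryDemo {

    // Returns the index just past the next '\n' at or after pos (or data.length).
    static int nextLineStart(byte[] data, int pos) {
        while (pos < data.length && data[pos] != '\n') {
            pos++;
        }
        return Math.min(pos + 1, data.length);
    }

    // Conventional rule: skip the first line unless start == 0, then read
    // every line that begins at or before 'end', even if it runs past 'end'.
    static int countRecords(byte[] data, int start, int end) {
        int pos = start;
        if (start != 0) {
            pos = nextLineStart(data, pos); // that line belongs to the previous split
        }
        int count = 0;
        while (pos < data.length && pos <= end) {
            pos = nextLineStart(data, pos);
            count++;
        }
        return count;
    }

    // Faulty rule: same skip, but a line that runs past 'end' is abandoned.
    static int countRecordsNaive(byte[] data, int start, int end) {
        int pos = start;
        if (start != 0) {
            pos = nextLineStart(data, pos);
        }
        int count = 0;
        while (pos < data.length) {
            int next = nextLineStart(data, pos);
            if (next > end) {
                break; // the next split skips this line too, so it is lost
            }
            pos = next;
            count++;
        }
        return count;
    }

    // Sums the per-split counts for fixed-size splits over the whole input.
    static int total(byte[] data, int splitSize, boolean naive) {
        int sum = 0;
        for (int start = 0; start < data.length; start += splitSize) {
            int end = Math.min(start + splitSize, data.length);
            sum += naive ? countRecordsNaive(data, start, end)
                         : countRecords(data, start, end);
        }
        return sum;
    }

    // Builds n newline-terminated records: "rec0\n", "rec1\n", ...
    static byte[] sampleData(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append("rec").append(i).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] data = sampleData(10);                 // 10 records, 50 bytes
        System.out.println(total(data, 16, false));   // prints 10
        System.out.println(total(data, 16, true));    // prints 7
        System.out.println(total(data, 50, true));    // prints 10 (single split)
    }
}
```

With a single split covering the whole input, both variants return every record, so a fault of this kind only shows up when a file spans more than one split. Comparing the Map Input Records counter against the number of splits per file is one way to check whether a reader behaves like the faulty variant.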


On Fri, Nov 22, 2013 at 7:16 PM, ZORAIDA HIDALGO SANCHEZ <zoraida@tid.es> wrote:

>  One more thing:
>
>  if we split the files, then all the records are processed. Each file is
> 70.5 MB.
>
>  Thanks,
>
>  Zoraida.-
>
>   From: zoraida <zoraida@tid.es>
> Date: Friday, 22 November 2013 08:59
>
> To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Subject: Re: Missing records from HDFS
>
>   Thanks for your response Azuryy.
>
>  My Hadoop version: 2.0.0-cdh4.3.0
> InputFormat: a custom class that extends FileInputFormat (CSV input
> format)
> These files are all in the same directory, as separate files.
> My input path is configured through Oozie, using the property
> mapred.input.dir.
>
>
>  The same code and input run fine on Hadoop 2.0.0-cdh4.2.1, with no
> records discarded.
>
>  Thanks.
>
>   From: Azuryy Yu <azuryyyu@gmail.com>
> Reply-To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Date: Thursday, 21 November 2013 07:31
> To: "user@hadoop.apache.org" <user@hadoop.apache.org>
> Subject: Re: Missing records from HDFS
>
>   What's your Hadoop version, and which InputFormat are you using?
>
>  Are these files under one directory, or are there lots of subdirectories? How
> did you configure the input path in your main?
>
>
>
> On Thu, Nov 21, 2013 at 12:25 AM, ZORAIDA HIDALGO SANCHEZ <zoraida@tid.es> wrote:
>
>>  Hi all,
>>
>>  My job is not reading all the input records. In the input directory I
>> have a set of files containing a total of 6000000 records, but only 5997000
>> are processed. The Map Input Records counter says 5997000.
>> I have tried downloading the files with getmerge to check how many
>> records it would return, and the correct number (6000000) is returned.
>>
>>  Do you have any suggestions?
>>
>>  Thanks.
>>
>> ------------------------------
>>
>> This message is intended exclusively for its addressee. We only send and
>> receive email on the basis of the terms set out at:
>> http://www.tid.es/ES/PAGINAS/disclaimer.aspx
>>
>
