hadoop-user mailing list archives

From Shahab Yunus <shahab.yu...@gmail.com>
Subject Re: Number of records in an HDFS file
Date Mon, 13 May 2013 18:27:25 GMT
Not terribly efficient, but off the top of my head: GROUP ALL and then do a
COUNT (or COUNT(*)). You can implement this as a follow-up script, or add it
to the existing script so that it runs once the file has been generated.
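
For plain-text output, the wc idea raised further down the thread can also be
done without copying the file to local disk, by streaming it out of HDFS and
counting newlines as the bytes go by. What follows is only a minimal sketch
under stated assumptions: the `hadoop` command is on the PATH, every record
ends with a newline, and the function names and paths are made up for
illustration.

```python
import subprocess
from typing import BinaryIO


def count_records(stream: BinaryIO, chunk_size: int = 1 << 20) -> int:
    """Count newline characters in a binary stream, the way `wc -l` does."""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        total += chunk.count(b"\n")
    return total


def count_hdfs_records(hdfs_glob: str) -> int:
    """Stream files out of HDFS with `hadoop fs -cat` and count their lines.

    hdfs_glob is a hypothetical path such as "/data/out/part-*"; the pattern
    is expanded by the hadoop command itself, not by a local shell.
    """
    proc = subprocess.Popen(
        ["hadoop", "fs", "-cat", hdfs_glob],
        stdout=subprocess.PIPE,
    )
    try:
        total = count_records(proc.stdout)
    finally:
        proc.stdout.close()
    if proc.wait() != 0:
        raise RuntimeError("hadoop fs -cat failed for " + hdfs_glob)
    return total


def write_count(hdfs_glob: str, out_file: str) -> None:
    """Store the record count in a local text file (out_file is hypothetical)."""
    count = count_hdfs_records(hdfs_glob)
    with open(out_file, "w") as fh:
        fh.write(str(count) + "\n")
```

Calling write_count("/data/out/part-*", "count.txt") from a daily scheduled
job (cron, for example) after the Pig script finishes would cover the "no
manual intervention" part. The data still crosses the network once, so for
very large outputs the job counters or a GROUP ALL / COUNT inside Pig may be
the better fit.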

Regards,
Shahab


On Mon, May 13, 2013 at 2:16 PM, Mix Nin <pig.mixed@gmail.com> wrote:

> OK, let me modify my requirement. I should have specified this at the
> beginning.
>
> I need to get the count of records in an HDFS file created by a Pig script
> and then store the count in a text file. This should be done automatically
> on a daily basis, without manual intervention.
>
>
> On Mon, May 13, 2013 at 11:13 AM, Rahul Bhattacharjee <
> rahul.rec.dgp@gmail.com> wrote:
>
>> How about the second approach: get the application/job ID that Pig
>> creates and submits to the cluster, and then find the job output counter
>> for that job from the JT.
>>
>> Thanks,
>> Rahul
>>
>>
>> On Mon, May 13, 2013 at 11:37 PM, Mix Nin <pig.mixed@gmail.com> wrote:
>>
>>> It is a text file.
>>>
>>> If we want to use wc, we need to copy the file from HDFS first and then
>>> run wc, and this may take time. Is there a way to do it without copying
>>> the file from HDFS to a local directory?
>>>
>>> Thanks
>>>
>>>
>>> On Mon, May 13, 2013 at 11:04 AM, Rahul Bhattacharjee <
>>> rahul.rec.dgp@gmail.com> wrote:
>>>
>>>> A few pointers.
>>>>
>>>> What kind of files are we talking about? For text you can use wc; for
>>>> Avro data files you can use avro-tools.
>>>>
>>>> Or find the job that Pig generates and get the counters for that job
>>>> from the JT of your Hadoop cluster.
>>>>
>>>> Thanks,
>>>>  Rahul
>>>>
>>>>
>>>> On Mon, May 13, 2013 at 11:21 PM, Mix Nin <pig.mixed@gmail.com> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> What is the best way to get the count of records in an HDFS file
>>>>> generated by a Pig script?
>>>>>
>>>>> Thanks
>>>>>
>>>>>
>>>>
>>>
>>
>
