hadoop-mapreduce-user mailing list archives

From Adam Kawa <kawa.a...@gmail.com>
Subject Re: listing a 530k files directory
Date Wed, 09 Jul 2014 14:16:33 GMT
You can try snakebite https://github.com/spotify/snakebite.

$ snakebite ls -R <path>

I just ran it to list 705K files and it went fine.
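
Since snakebite is a standalone client that talks to the NameNode directly
over its RPC interface, you can also pipe it into head to peek at just the
first few names. A rough sketch (the paths below are placeholders):

$ snakebite ls /path/to/folder | head -n 10      # first 10 entries, with sizes
$ snakebite cat /path/to/folder/somefile | head  # peek inside one of them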



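If you want to stick with the stock FsShell instead, Harsh's advice below
amounts to something like this (the 3g heap is just the value from Bharath's
example, size it to your listing; the path is a placeholder):

export HADOOP_CLIENT_OPTS="-Xmx3g"
hadoop fs -ls /path/to/folder | head -n 10
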
2014-05-30 20:42 GMT+02:00 Harsh J <harsh@cloudera.com>:

> HADOOP_OPTS gets overridden by HADOOP_CLIENT_OPTS for FsShell
> utilities. The right way to extend the client heap is to use
> HADOOP_CLIENT_OPTS instead, for FsShell and other client applications
> such as "hadoop fs"/"hdfs dfs"/"hadoop jar", etc.
>
> On Fri, May 30, 2014 at 6:13 PM, bharath vissapragada
> <bharathvissapragada1990@gmail.com> wrote:
> > Hi Guido,
> >
> > You can set the client-side heap in the HADOOP_OPTS variable before
> > running the ls command.
> >
> > export HADOOP_OPTS="-Xmx3g"; hadoop fs -ls /
> >
> > - Bharath
> >
> >
> > On Fri, May 30, 2014 at 5:22 PM, Guido Serra <zeph@fsfe.org> wrote:
> >>
> >> Hi,
> >> do you have an idea how to look at the contents of a 530k-file HDFS
> >> folder?
> >> (yes, I know it is a bad idea to have such a setup, but that’s the status
> >> and I’d like to debug it)
> >> The only tool that doesn’t go out of memory is "hdfs dfs -count folder/".
> >>
> >> -ls goes out of memory, -count with folder/* goes out of memory …
> >> I’d like to at least see the first 10 file names and their sizes, and
> >> maybe open one
> >>
> >> thanks,
> >> G.
> >
> >
>
>
>
> --
> Harsh J
>
