hadoop-common-user mailing list archives

From Meng Mao <meng...@gmail.com>
Subject Re: do HDFS files starting with _ (underscore) have special properties?
Date Fri, 02 Sep 2011 21:37:10 GMT
Is there a programmatic way to access these hidden files then?

On Fri, Sep 2, 2011 at 5:20 PM, Edward Capriolo <edlinuxguru@gmail.com> wrote:

> On Fri, Sep 2, 2011 at 4:04 PM, Meng Mao <mengmao@gmail.com> wrote:
>
> > We have a compression utility that tries to grab all subdirs to a
> > directory on HDFS. It makes a call like this:
> > FileStatus[] subdirs = fs.globStatus(new Path(inputdir, "*"));
> >
> > and handles files vs dirs accordingly.
> >
> > We tried to run our utility against a dir containing a computed SOLR
> > shard, which has files that look like this:
> > -rw-r--r--   2 hadoopuser visible 8538430603 2011-09-01 18:58
> > /test/output/solr-20110901165238/part-00000/data/index/_ox.fdt
> > -rw-r--r--   2 hadoopuser visible  233396596 2011-09-01 18:57
> > /test/output/solr-20110901165238/part-00000/data/index/_ox.fdx
> > -rw-r--r--   2 hadoopuser visible        130 2011-09-01 18:57
> > /test/output/solr-20110901165238/part-00000/data/index/_ox.fnm
> > -rw-r--r--   2 hadoopuser visible 2147948283 2011-09-01 18:55
> > /test/output/solr-20110901165238/part-00000/data/index/_ox.frq
> > -rw-r--r--   2 hadoopuser visible   87523726 2011-09-01 18:57
> > /test/output/solr-20110901165238/part-00000/data/index/_ox.nrm
> > -rw-r--r--   2 hadoopuser visible  920936168 2011-09-01 18:57
> > /test/output/solr-20110901165238/part-00000/data/index/_ox.prx
> > -rw-r--r--   2 hadoopuser visible   22619542 2011-09-01 18:58
> > /test/output/solr-20110901165238/part-00000/data/index/_ox.tii
> > -rw-r--r--   2 hadoopuser visible 2070214402 2011-09-01 18:51
> > /test/output/solr-20110901165238/part-00000/data/index/_ox.tis
> > -rw-r--r--   2 hadoopuser visible         20 2011-09-01 18:51
> > /test/output/solr-20110901165238/part-00000/data/index/segments.gen
> > -rw-r--r--   2 hadoopuser visible        282 2011-09-01 18:55
> > /test/output/solr-20110901165238/part-00000/data/index/segments_2
> >
> >
> > The globStatus call seems only able to pick up those last 2 files; the
> > several files that start with _ don't register.
> >
> > I've skimmed the FileSystem and GlobExpander source to see if there's
> > anything related to this, but didn't see it. Google didn't turn up
> > anything about underscores. Am I misunderstanding something about the
> > regex patterns needed to pick these up or unaware of some filename
> > convention in HDFS?
> >
>
> Files starting with '_' are considered 'hidden' like unix files starting
> with '.'. I did not know that for a very long time because not everyone
> follows this rule or even knows about it.
>
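
To make the convention described above concrete, here is a minimal sketch in plain Java of a check for "hidden" names. It assumes "hidden" means the last path component starts with '_' or '.', and the class name HiddenNameFilter is a hypothetical helper, not part of Hadoop. It only operates on path strings, so it runs without a cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: HiddenNameFilter is a hypothetical helper, not a Hadoop class.
final class HiddenNameFilter {

    // Assumed convention: a final path component that starts with '_' or '.'
    // is treated as hidden. An underscore elsewhere in the name does not count.
    static boolean isHidden(String path) {
        String name = path.substring(path.lastIndexOf('/') + 1);
        return name.startsWith("_") || name.startsWith(".");
    }

    // Return only the paths whose final component is hidden, preserving input order.
    static List<String> hiddenOnly(List<String> paths) {
        List<String> out = new ArrayList<>();
        for (String p : paths) {
            if (isHidden(p)) {
                out.add(p);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample names taken from the listing quoted in this thread.
        List<String> listing = Arrays.asList(
            "/test/output/solr-20110901165238/part-00000/data/index/_ox.fdt",
            "/test/output/solr-20110901165238/part-00000/data/index/segments.gen",
            "/test/output/solr-20110901165238/part-00000/data/index/segments_2");
        for (String p : hiddenOnly(listing)) {
            System.out.println(p);
        }
    }
}
```

In a real Hadoop program the same check could probably be wrapped in an org.apache.hadoop.fs.PathFilter (its accept(Path) method) and passed to fs.listStatus(path, filter) or fs.globStatus(pattern, filter), so that the caller decides explicitly which names to keep. Whether the one-argument globStatus skips these names by default may depend on the Hadoop version, so that part is an assumption worth testing against your own cluster.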
