crunch-dev mailing list archives

From Josh Wills <jwi...@cloudera.com>
Subject Re: Ignoring crunch.disable.combine.file
Date Tue, 22 Jul 2014 16:22:51 GMT
Hey man,

You should be able to override the disable setting on the TextFileSource
you create by calling its inputConf method, like so:

From.textFile(somePath).inputConf("crunch.disable.combine.file", "true");
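
To see why the per-source call can succeed where a pipeline-wide setting is
ignored, here is a toy model in plain Java. It is not Crunch's code: the
class CombineFlagModel and its methods are hypothetical, and it assumes the
source's own settings are applied on top of the pipeline configuration,
which matches the behavior reported in the quoted message below.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of layered job configuration; hypothetical, not Crunch's implementation. */
final class CombineFlagModel {
    static final String KEY = "crunch.disable.combine.file";

    /** Settings carried by a source: a default of "false", optionally replaced per source. */
    static Map<String, String> sourceSettings(String inputConfOverride) {
        Map<String, String> settings = new HashMap<>();
        // Default written when the source is constructed (as reported in this thread).
        settings.put(KEY, "false");
        // Stand-in for calling inputConf(KEY, value) on the source afterwards.
        if (inputConfOverride != null) {
            settings.put(KEY, inputConfOverride);
        }
        return settings;
    }

    /** Assumption: source settings are layered over the pipeline configuration, so they win. */
    static Map<String, String> effective(Map<String, String> pipelineConf,
                                         Map<String, String> sourceSettings) {
        Map<String, String> merged = new HashMap<>(pipelineConf);
        merged.putAll(sourceSettings);
        return merged;
    }

    /** True when the merged configuration asks for file combining to be disabled. */
    static boolean combineDisabled(Map<String, String> pipelineConf, String inputConfOverride) {
        Map<String, String> merged = effective(pipelineConf, sourceSettings(inputConfOverride));
        return Boolean.parseBoolean(merged.get(KEY));
    }
}
```

In this model, a pipeline configuration that sets the key to "true" is masked
by the source default, while passing "true" as the per-source override leaves
combining disabled.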

Let me know if that does the trick,
J


On Tue, Jul 22, 2014 at 12:39 AM, Das, Mridul <mridul.das@news.com.au>
wrote:

> To add, the constructor of TextFileSource (and of all the other sources,
> for that matter) sets crunch.disable.combine.file to false, thereby
> overriding any value supplied by the user.
>
>
> On 22 July 2014 17:26, Das, Mridul <mridul.das@news.com.au> wrote:
>
> > Hi,
> >   While trying to read text files from S3 using Crunch (version 0.10.0),
> > I get the error "INFO jobcontrol.CrunchControlledJob:
> > java.io.FileNotFoundException: File does not exist:". This is probably
> > because of https://issues.apache.org/jira/browse/MAPREDUCE-2704.
> > However, when I try to disable file combining
> > (crunch.disable.combine.file=true), I can see that it ignores the
> > setting and creates a CrunchCombineFileInputFormat.
> >
> >
> >
> >
> > --
> >  Mridul Das
> >  Data Engineer
> >  Level 4, 2 Holt Street Surry Hills NSW 2010
> > T +61 2 8114 7621 M +61 478 977 665
> > E mridul.das@news.com.au W www.NewsCorpAustralia.com
> >  Proudly supporting 1 degree <http://www.1degree.net.au>, A News Corp
> > Australia initiative.
> >
>



-- 
Director of Data Science
Cloudera <http://www.cloudera.com>
Twitter: @josh_wills <http://twitter.com/josh_wills>
