hbase-dev mailing list archives

From Dave Revell <d...@urbanairship.com>
Subject Re: local bulk loading?
Date Thu, 26 Apr 2012 22:29:26 GMT
Hi Doug,

When I hit this problem, I concluded that HFileOutputFormat cannot be used
in standalone mode since it requires DistributedCache, which doesn't work
with the local job runner.
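
One way to fail fast, rather than hitting the "Can't read partitions file" error partway through the job, is to check up front whether the job will run under the local job runner. Below is a minimal sketch, assuming Hadoop 1.x-style configuration in which the mapred.job.tracker property is "local" (its default) when the LocalJobRunner is used. The class and method names are hypothetical, and a plain Map stands in for the Hadoop Configuration object so the snippet has no dependencies.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of HBase or Hadoop.
public class LocalRunnerCheck {
    // Assumption: Hadoop 1.x property name; "local" selects the LocalJobRunner.
    static final String JOB_TRACKER_KEY = "mapred.job.tracker";
    static final String LOCAL = "local";

    // Returns true when the settings would select the local job runner.
    // An unset property is treated as "local", matching the assumed default.
    static boolean usesLocalJobRunner(Map<String, String> conf) {
        String tracker = conf.getOrDefault(JOB_TRACKER_KEY, LOCAL);
        return LOCAL.equals(tracker);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        if (usesLocalJobRunner(conf)) {
            // Skip (or refuse) the HFileOutputFormat setup in this case.
            System.out.println("Local job runner detected; skipping bulk-load job setup.");
        }
    }
}
```

In a real job the same check would read the property from the job's Configuration before the bulk-load output format is set up.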

So you're not the only one :(

-Dave

On Thu, Apr 26, 2012 at 1:52 PM, Doug Meil <doug.meil@explorysmedical.com> wrote:

>
> Hi Devs-
>
> I'm coding up a local bulk loading example for the RefGuide but I've been
> banging my head on this:
>
>
>  WARN [Thread-8] (LocalJobRunner.java:295) - job_local_0001
>
> java.lang.IllegalArgumentException: Can't read partitions file
>     at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:111)
>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:62)
>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:117)
>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:552)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:631)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:315)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
> Caused by: java.io.FileNotFoundException: File _partition.lst does not exist.
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:372)
>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>     at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:751)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1424)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1419)
>     at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.readPartitions(TotalOrderPartitioner.java:296)
>     at org.apache.hadoop.hbase.mapreduce.hadoopbackport.TotalOrderPartitioner.setConf(TotalOrderPartitioner.java:82)
>
> … does bulk loading work with the local job runner?  Obviously, you're not
> going to run a production cluster off your laptop but it's nice to at least
> be able to test your code.
>
> I know the DistributedCache doesn't work with the LocalJobRunner (and
> TotalOrderPartitioner uses the DistributedCache), and then there's this log
> message:
>
>
>  WARN [main] (LocalJobRunner.java:134) - LocalJobRunner does not support symlinking into current working dir.
>
> … so I'm wondering how this actually works, if it does work locally.
>
> Coincidentally, this exact error is in the troubleshooting chapter:
>
> http://hbase.apache.org/book.html#trouble.mapreduce
>
> … but it came up in a different context. There, the person asking the
> question thought he was running against a remote cluster, but he was
> actually running locally.
>
> Doug Meil
> Chief Software Architect, Explorys
> doug.meil@explorys.com
>
>
