hbase-user mailing list archives

From Shrijeet Paliwal <shrij...@rocketfuel.com>
Subject Re: Bulk loading a CSV file into HBase
Date Thu, 08 Mar 2012 20:06:50 GMT
GenericOptionsParser stops parsing arguments as soon as it encounters the
first non-option argument (see:
http://commons.apache.org/cli/api-1.2/org/apache/commons/cli/Parser.html#parse(org.apache.commons.cli.Options,
java.lang.String[], boolean))

So in this case, as soon as the parser sees the table name argument, it
ignores all remaining properties specified with -D. Note that it ignores not
only the separator; it also ignored the importtsv.skip.bad.lines option in
your failed run.
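
To make the ordering effect concrete, here is a minimal sketch in plain Java
that mimics the stop-at-first-non-option behaviour. It is a simplified model
written for illustration, not the actual GenericOptionsParser or commons-cli
code; the class name StopAtNonOptionDemo, the method parse, the table name
"mytable" and the path "/input" are all made up for the example.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model (an assumption, not Hadoop's implementation): consume
// leading -D properties and stop at the first argument that is not one.
public class StopAtNonOptionDemo {

    // Fills props with the -Dkey=value pairs found BEFORE the first
    // non-option argument and returns everything from that argument on.
    static String[] parse(String[] args, Map<String, String> props) {
        int i = 0;
        while (i < args.length && args[i].startsWith("-D")) {
            String kv = args[i].substring(2);
            if (kv.isEmpty()) {              // detached form: -D key=value
                if (i + 1 >= args.length) {
                    break;                   // dangling -D, leave it unparsed
                }
                kv = args[++i];
            }
            int eq = kv.indexOf('=');
            if (eq > 0) {
                props.put(kv.substring(0, eq), kv.substring(eq + 1));
            }
            i++;
        }
        // Everything left over is handed to the application untouched,
        // including any -D that appears after the first non-option.
        return Arrays.copyOfRange(args, i, args.length);
    }

    public static void main(String[] unused) {
        // -D properties placed before the table name are picked up.
        Map<String, String> good = new LinkedHashMap<>();
        String[] restGood = parse(new String[] {
            "-Dimporttsv.separator=,", "-Dimporttsv.skip.bad.lines=false",
            "mytable", "/input"}, good);
        System.out.println(good + " " + Arrays.toString(restGood));

        // -D properties placed after the table name are never parsed.
        Map<String, String> bad = new LinkedHashMap<>();
        String[] restBad = parse(new String[] {
            "mytable", "-Dimporttsv.separator=,", "/input"}, bad);
        System.out.println(bad + " " + Arrays.toString(restBad));
    }
}
```

In the second call the property map stays empty, which matches what you saw:
the separator never reaches the job configuration unless the -D options come
before the table name.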



On Thu, Mar 8, 2012 at 11:27 AM, Stack <stack@duboce.net> wrote:

> On Thu, Mar 8, 2012 at 11:14 AM, anil gupta <anilgupt@buffalo.edu> wrote:
> > 1. Update the HBase bulk load documentation and specify that separator
> > argument should be next to program name.
>
> This would help.
>
> > 2. Fix the problem in the code itself by handling the separator argument
> > explicitly. (Still, i am wondering why only separator value is not being
> > set in jobconf automatically if it is not provided next to program
> > name??)
> >
>
> This is probably too late IIRC.  I haven't looked at code but
> GenericOptionsParser has probably already been run by the time the
> application starts to process args.  Duplicating what GOP does in the
> application is probably not the way to go either?
>
> St.Ack
>
