hbase-user mailing list archives

From anil gupta <anilg...@buffalo.edu>
Subject Re: Bulk loading a CSV file into HBase
Date Thu, 08 Mar 2012 21:42:12 GMT
Yeah, after digging further into the code: line 374 in
GenericOptionsParser.java, "commandLine = parser.parse(opts, args, true);",
is the culprit. Nice find, Shrijeet. That answers my question. :)

Stack:
Could you please tell me the meaning of "IIRC"? Updating the documentation is
good, but given the behavior of parse(), any other -D option will also be
ignored if it follows the table name.
Duplicating the GOP functionality does not seem to be a good idea. Maybe
if we could somehow invoke "parser.parse(opts, args, false);" instead of
"parser.parse(opts, args, true);", all would be good. I haven't
looked at the API to see whether that is possible. This is just food
for thought.
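
To make the difference between the two flag values concrete, here is a
rough, self-contained sketch. It does not call commons-cli or
GenericOptionsParser; it is only a simplified model of the stop-at-first-
non-option behavior described in this thread. The class name, the parse()
helper, the table name "mytable" and the path "/input" are all made up for
illustration; only the two importtsv.* property names come from this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StopAtNonOptionDemo {

    /** Holds the -D properties that were recognized and the leftover arguments. */
    static final class Parsed {
        final Map<String, String> props = new LinkedHashMap<>();
        final List<String> rest = new ArrayList<>();
    }

    /**
     * Simplified model (an assumption, not the real commons-cli code):
     * "-Dkey=value" tokens are options, everything else is a plain argument.
     * When stopAtNonOption is true, the first plain argument ends option
     * parsing and every later token is kept as a leftover argument.
     */
    static Parsed parse(String[] args, boolean stopAtNonOption) {
        Parsed out = new Parsed();
        boolean eatTheRest = false;
        for (String arg : args) {
            if (!eatTheRest && arg.startsWith("-D") && arg.contains("=")) {
                int eq = arg.indexOf('=');
                out.props.put(arg.substring(2, eq), arg.substring(eq + 1));
            } else {
                out.rest.add(arg);
                if (stopAtNonOption) {
                    eatTheRest = true; // later -D tokens are no longer parsed
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical command line with the table name before the -D options.
        String[] tableFirst = {
            "mytable", "-Dimporttsv.separator=,",
            "-Dimporttsv.skip.bad.lines=false", "/input"
        };
        // Same idea with the -D option placed before the table name.
        String[] optionsFirst = {"-Dimporttsv.separator=,", "mytable", "/input"};

        System.out.println("stop=true,  table first:   " + parse(tableFirst, true).props);
        System.out.println("stop=false, table first:   " + parse(tableFirst, false).props);
        System.out.println("stop=true,  options first: " + parse(optionsFirst, true).props);
    }
}
```

In this model, both -D properties are lost when the table name comes first
and the flag is true, which matches what we saw. Passing false, or putting
the -D options before the table name, keeps them.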

Thanks,
Anil



On Thu, Mar 8, 2012 at 12:06 PM, Shrijeet Paliwal
<shrijeet@rocketfuel.com> wrote:

> GenericOptionsParser stops parsing the arguments as soon as the first
> non-option argument is encountered (see
> http://commons.apache.org/cli/api-1.2/org/apache/commons/cli/Parser.html#parse(org.apache.commons.cli.Options, java.lang.String[], boolean)
> ).
>
> So in this case, as soon as the parser sees the table name arg, it ignores
> all other properties specified with -D. Note that it ignores not only the
> separator but also the importtsv.skip.bad.lines option in your run which
> failed.
>
>
>
> On Thu, Mar 8, 2012 at 11:27 AM, Stack <stack@duboce.net> wrote:
>
> > On Thu, Mar 8, 2012 at 11:14 AM, anil gupta <anilgupt@buffalo.edu> wrote:
> > > 1. Update the HBase bulk load documentation and specify that separator
> > > argument should be next to program name.
> >
> > This would help.
> >
> > > 2. Fix the problem in the code itself by handling the separator argument
> > > explicitly. (Still, i am wondering why only separator value is not being
> > > set in jobconf automatically if it is not provided next to program name??)
> > >
> >
> > This is probably too late IIRC.  I haven't looked at code but
> > GenericOptionsParser has probably already been run by the time the
> > application starts to process args.  Duplicating what GOP does in the
> > application is probably not the way to go either?
> >
> > St.Ack
> >
>



-- 
Thanks & Regards,
Anil Gupta
