hbase-user mailing list archives

From Laxman <lakshman...@huawei.com>
Subject RE: Bulk loading a CSV file into HBase
Date Fri, 09 Mar 2012 08:20:55 GMT
Hi Anil,

> instead of invoking "parser.parse(opts, args, true);" if somehow we can
> invoke "parser.parse(opts, args, false);" then all will be good. I haven't
> looked at the api to know about the possibility of same.

Changing the call to parser.parse(opts, args, false) does solve this problem,
but I think we need to consider the following before making the change.

It changes the behavior of legacy Hadoop code, so switching directly from
true to false may cause behavioral compatibility issues.

Also, setting it to false may not be correct in all cases.

Case #1: java
"java -Dprop1=val1 <Class> arg1 arg2" is different from "java <Class> arg1
arg2 -Dprop1=val1".

In this case, parser.parse(opts, args, true) looks correct.


Case #2: Linux
"ls -l /home" is the same as "ls /home -l".

In this case, parser.parse(opts, args, false) looks correct.
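
To make the difference between the two settings concrete, here is a minimal
sketch. It is a simplified model of the stop-at-non-option rule only, not the
commons-cli or GenericOptionsParser implementation: the class name, the
parseProps helper, the table name "mytable" and the path "/input" are all
hypothetical, and unknown options are simply treated as plain arguments.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model (an assumption, not the real parser): collects -Dkey=value
// pairs and optionally stops looking for options after the first plain argument.
class StopAtNonOptionSketch {

    // Returns the -D properties that were recognized; every other argument
    // is appended to 'leftover' in its original order.
    public static Map<String, String> parseProps(String[] args,
                                                 boolean stopAtNonOption,
                                                 List<String> leftover) {
        Map<String, String> props = new LinkedHashMap<>();
        boolean eatTheRest = false; // once true, nothing else is parsed as an option
        for (String arg : args) {
            int eq = arg.indexOf('=');
            if (!eatTheRest && arg.startsWith("-D") && eq > 2) {
                props.put(arg.substring(2, eq), arg.substring(eq + 1));
            } else {
                leftover.add(arg);
                if (stopAtNonOption) {
                    eatTheRest = true; // first non-option ends option parsing
                }
            }
        }
        return props;
    }

    public static void main(String[] unused) {
        // Hypothetical command line: one -D option before the table name, two after it.
        String[] cmd = {
            "-Dimporttsv.columns=HBASE_ROW_KEY,cf:a",
            "mytable",
            "-Dimporttsv.separator=,",
            "-Dimporttsv.skip.bad.lines=false",
            "/input"
        };
        for (boolean stop : new boolean[] {true, false}) {
            List<String> rest = new ArrayList<>();
            Map<String, String> props = parseProps(cmd, stop, rest);
            System.out.println("stopAtNonOption=" + stop
                    + " props=" + props.keySet() + " leftover=" + rest);
        }
    }
}
```

With stopAtNonOption set to true, only the property given before the table
name is picked up, which is the behavior seen in this thread. With false, all
three properties are picked up, but so would any -D that the user intended as
a plain argument to the program, which is the concern in Case #1.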

>> This is probably too late IIRC
I hope Stack meant the same point here.

> Could you please tell me the meaning of "IIRC"?
IIRC - If I Recall/Remember Correctly 

--
Regards,
Laxman

> -----Original Message-----
> From: anilgupta84@gmail.com [mailto:anilgupta84@gmail.com] On Behalf Of
> anil gupta
> Sent: Friday, March 09, 2012 3:12 AM
> To: user@hbase.apache.org
> Subject: Re: Bulk loading a CSV file into HBase
> 
> Yeah after digging further into the code: Line#374 in
> GenericOptionsParser.java "commandLine = parser.parse(opts, args, true);"
> is the culprit. Nice find, Shrijeet. That answers my question. :)
> 
> Stack:
> Could you please tell me the meaning of "IIRC"? Updating the document is
> good but as per the behavior of parse() other -D option will also be
> ignored if  tablename is followed by any -D option .
> Duplicating the GOP functionality does not seems to be a good idea . Maybe
> instead of invoking "parser.parse(opts, args, true);" if somehow we can
> invoke "parser.parse(opts, args, false);" then all will be good. I haven't
> looked at the api to know about the possibility of same. This is just food
> for thought.
> 
> Thanks,
> Anil
> 
> 
> 
> On Thu, Mar 8, 2012 at 12:06 PM, Shrijeet Paliwal
> <shrijeet@rocketfuel.com>wrote:
> 
> > GenericOptionsParser stops parsing the arguments as soon as first non
> > option is specified (refer :
> >
> > http://commons.apache.org/cli/api-1.2/org/apache/commons/cli/Parser.html#parse(org.apache.commons.cli.Options, java.lang.String[], boolean))
> >
> > So in this cases as soon parses sees the table name arg , it ignore all
> > other properties specified with -D opt. Note it not only ignores separator
> > it is also ignoring importtsv.skip.bad.lines option in your run which
> > failed.
> >
> >
> >
> > On Thu, Mar 8, 2012 at 11:27 AM, Stack <stack@duboce.net> wrote:
> >
> > > On Thu, Mar 8, 2012 at 11:14 AM, anil gupta <anilgupt@buffalo.edu> wrote:
> > > > 1. Update the HBase bulk load documentation and specify that separator
> > > > argument should be next to program name.
> > >
> > > This would help.
> > >
> > > > 2. Fix the problem in the code itself by handling the separator argument
> > > > explicitly. (Still, i am wondering why only separator value is not being
> > > > set in jobconf automatically if it is not provided next to program name??)
> > > >
> > >
> > > This is probably too late IIRC.  I haven't looked at code but
> > > GenericOptionsParser has probably already been run by the time the
> > > application starts to process args.  Duplicating what GOP in the
> > > application is probably not the way to go either?
> > >
> > > St.Ack
> > >
> >
> 
> 
> 
> --
> Thanks & Regards,
> Anil Gupta

