sqoop-dev mailing list archives

From shakun grover <s28sw...@gmail.com>
Subject Re: Sqoop2 inserts new line in output file
Date Thu, 09 Oct 2014 06:24:51 GMT
Hi Abe,

I have attached the dump as an SQL file, which you can execute.

Thanks,
Shakun

On Thu, Oct 9, 2014 at 12:50 AM, Abraham Elmahrek <abe@cloudera.com> wrote:

> Hey there,
>
> There's usually a control character or something that causes this behavior.
>
> I don't see the source data hexdump attachment. Could you please reattach?
>
> -Abe
>
> On Wed, Oct 8, 2014 at 5:29 AM, shakun grover <s28sweet@gmail.com> wrote:
>
> > Even when I view this data in Hive, it takes
> > 1,'James','A','Bond','557502533','(681) 675-8580','james@gmail.com','
> > www.google.com','110 Campus Dr. Berkeley CA 94111
> > as the first column of the first row,
> > then
> > ','San Jose','CA','94500','USA' as the first column of the second row.
> > The other columns are set to NULL.
> >
> > On Wed, Oct 8, 2014 at 4:38 PM, shakun grover <s28sweet@gmail.com> wrote:
> >
> > > Hi Abe,
> > >
> > > I have attached the sample data with this mail.
> > >
> > > This is the job that I created to import this data to HDFS:
> > >
> > > *Job:*
> > > Name: testEmp
> > >
> > > Database configuration
> > >
> > > Schema name:
> > > Table name:
> > > Table SQL statement: select * from test.emp WHERE ${CONDITIONS}
> > > Table column names:
> > > Partition column name: id
> > > Nulls in partition column:
> > > Boundary query:
> > >
> > > Output configuration
> > >
> > > Storage type:
> > >   0 : HDFS
> > > Choose:
> > > Output format:
> > >   0 : TEXT_FILE
> > >   1 : SEQUENCE_FILE
> > > Choose: 0
> > > Output directory: /tmp/emp/1
> > >
> > > Throttling resources
> > >
> > > Extractors:
> > > Loaders:
> > > Job was successfully updated with status FINE
> > >
> > > When I view the data with the following command:
> > > hadoop fs -cat /tmp/emp/p*
> > > it shows the data as follows (it inserts a line break after
> > > "110 Campus Dr. Berkeley CA 94111"):
> > >
> > > 1,'James','A','Bond','557502533','(681) 675-8580','james@gmail.com','
> > > www.google.com','110 Campus Dr. Berkeley CA 94111
> > > ','San Jose','CA','94500','USA'
> > > 2,'James','A','Bond','557502533','(681) 675-8580','james@gmail.com','
> > > www.google.com','110 Campus Dr. Berkeley CA 94111
> > > ','San Jose','CA','94500','USA'
> > > 3,'James','A','Bond','557502533','(681) 675-8580','james@gmail.com','
> > > www.google.com','110 Campus Dr. Berkeley CA 94111
> > > ','San Jose','CA','94500','USA'
> > > 4,'James','A','Bond','557502533','(681) 675-8580','james@gmail.com','
> > > www.google.com','110 Campus Dr. Berkeley CA 94111
> > > ','San Jose','CA','94500','USA'
> > > 5,'James','A','Bond','557502533','(681) 675-8580','james@gmail.com','
> > > www.google.com','110 Campus Dr. Berkeley CA 94111
> > > ','San Jose','CA','94500','USA'
> > >
> > >
> > >
> > >
> > > On Wed, Oct 8, 2014 at 1:03 AM, Abraham Elmahrek <abe@cloudera.com> wrote:
> > >
> > >> Could we take a peek at your data from its source as hex?
> > >>
> > >> -Abe
> > >>
> > >> On Tue, Oct 7, 2014 at 3:46 AM, shakun grover <s28sweet@gmail.com> wrote:
> > >>
> > >> > Yes, it's correct that Sqoop2 should insert a new line at the end of
> > >> > each record. But if a record has many columns (say, more than 15),
> > >> > then after a few columns it inserts a new line.
> > >> >
> > >> > Example:
> > >> > 1,'346088103340400','3410 9240 5550
> > >> > 778','3710-1690-2390-472','537436268','537 43 6268
> > >> >
> > >> > ','537-43-6268
> > >> >
> > >> > ','6816758580
> > >> >
> > >> > ','681 675 8580
> > >> >
> > >> > ','681-675-8580
> > >> >
> > >> > ','(681) 675-8580
> > >> >
> > >> > ','(681)675-8580
> > >> >
> > >> > ','1617547959','12.215.42.19
> > >> >
> > >> > ','','1132286141
> > >> >
> > >> > ','https://blu162.mail.live.com
> > >> >
> > >> > ','110 Campus Dr. Berkeley CA 94111
> > >> >
> > >> > ','James
> > >> >
> > >> > '
> > >> > This is one record that was imported to HDFS in the format shown above.
> > >> > After the 6th column it inserted a new line, and then another after
> > >> > each subsequent column. However, this behavior is not the same in all
> > >> > cases: it inserts new lines seemingly at random after the nth column.
> > >> >
> > >> >
> > >> > On Thu, Oct 2, 2014 at 1:12 AM, Abraham Elmahrek <abe@cloudera.com> wrote:
> > >> >
> > >> > > Hey there,
> > >> > >
> > >> > > Sqoop2 should insert new lines at the end of a record. In fact, Sqoop2
> > >> > > should just write CSV. Could you copy/paste an example with the schema?
> > >> > >
> > >> > > -Abe
> > >> > >
> > >> > > On Tue, Sep 30, 2014 at 11:32 PM, shakun grover <s28sweet@gmail.com> wrote:
> > >> > >
> > >> > > > Hi All,
> > >> > > >
> > >> > > > When I import many columns (say, more than 20) from an RDBMS to HDFS,
> > >> > > > Sqoop2 inserts a new line in the output file. The newline appears at
> > >> > > > the end of certain fields; it doesn't seem to appear for every field.
> > >> > > >
> > >> > > > Can you please tell me why this new line is inserted? And is there any
> > >> > > > way to avoid this?
> > >> > > >
> > >> > > > Thanks in advance!!
> > >> > > >
> > >> > > >
> > >> > > > --
> > >> > > > Thanks & Regards,
> > >> > > > Shakun Grover
> > >> > > >
> > >> > >
> > >> >
> > >> >
> > >> >
> > >> > --
> > >> > Thanks & Regards,
> > >> > Shakun Grover
> > >> >
> > >>
> > >
> > >
> > >
> > > --
> > > Thanks & Regards,
> > > Shakun Grover
> > >
> >
> >
> >
> > --
> > Thanks & Regards,
> > Shakun Grover
> >
>



-- 
Thanks & Regards,
Shakun Grover
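
The suggestion quoted above, that a control character in the source data usually causes this kind of line break, can be checked on a sample of the data before importing. What follows is a minimal Python sketch and not part of Sqoop2: the helper names find_control_chars and strip_line_breaks are hypothetical, and the sample row assumes the address field ends with a carriage return and line feed, which the thread does not confirm.

```python
import unicodedata


def find_control_chars(value):
    """Return (index, hex code) for every control character in value."""
    return [
        (i, format(ord(ch), "02x"))
        for i, ch in enumerate(value)
        if unicodedata.category(ch) == "Cc"  # Unicode "control" category
    ]


def strip_line_breaks(value):
    """Remove carriage returns and line feeds from a field value."""
    return value.replace("\r", "").replace("\n", "")


# Assumed sample row: the trailing "\r\n" in the address is an illustration,
# not something confirmed by the data in this thread.
row = ["1", "James", "110 Campus Dr. Berkeley CA 94111\r\n", "San Jose"]

# Report which columns carry control characters, with their hex codes.
for col, field in enumerate(row):
    hits = find_control_chars(field)
    if hits:
        print(f"column {col}: control characters at {hits}")

# Cleaned copy of the row with embedded line breaks removed.
cleaned = [strip_line_breaks(field) for field in row]
```

If a check like this does find embedded line breaks, the same cleanup could in principle be done in the job's SQL statement using the database's own string functions, depending on which RDBMS holds test.emp.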
