sqoop-dev mailing list archives

From Abraham Elmahrek <...@cloudera.com>
Subject Re: Sqoop2 inserts new line in output file
Date Thu, 09 Oct 2014 07:08:29 GMT
I really don't see a SQL file. It's possible that it's getting removed by
the mailing list. Any chance you could get a hexdump of your data? For
instance, with MySQL you could use "mysqldump ... | hexdump"
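
If a full hexdump is awkward to share, a short script can report only the suspicious bytes. The following is a minimal sketch in Python, not part of Sqoop2 or MySQL: the function name `find_control_bytes` and the sample value are assumptions made up for illustration, and the trailing CR LF in the sample is a guess at what the address field might contain, not something confirmed in this thread.

```python
def find_control_bytes(data: bytes):
    """Return (offset, byte value) pairs for every control byte except tab."""
    hits = []
    for offset, value in enumerate(data):
        # 0x00-0x1F are ASCII control characters (tab, 0x09, is skipped
        # because it is common in dumps); 0x7F is DEL.
        if (value < 0x20 and value != 0x09) or value == 0x7F:
            hits.append((offset, value))
    return hits


# Hypothetical field value: the address from this thread with an assumed
# trailing carriage return and line feed.
sample = b"110 Campus Dr. Berkeley CA 94111\r\n"

for offset, value in find_control_bytes(sample):
    print(f"offset {offset}: 0x{value:02x}")
```

Running the same function over the bytes of a dump file (read in binary mode) would show whether a field carries an embedded 0x0d or 0x0a.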

On Wed, Oct 8, 2014 at 11:24 PM, shakun grover <s28sweet@gmail.com> wrote:

> Hi Abe,
>
> I have attached the dump as a SQL file. You can execute this SQL file.
>
> Thanks ,
> Shakun
>
> On Thu, Oct 9, 2014 at 12:50 AM, Abraham Elmahrek <abe@cloudera.com>
> wrote:
>
>> Hey there,
>>
>> There's usually a control character or something that causes this
>> behavior.
>>
>> I don't see the source data hexdump attachment. Could you please reattach?
>>
>> -Abe
>>
>> On Wed, Oct 8, 2014 at 5:29 AM, shakun grover <s28sweet@gmail.com> wrote:
>>
>> > Even when I view this data in Hive, it takes
>> > 1,'James','A','Bond','557502533','(681) 675-8580','james@gmail.com','
>> > www.google.com','110 Campus Dr. Berkeley CA 94111
>> > as the first column of the first row,
>> > then
>> > ','San Jose','CA','94500','USA' as the first column of the second row. The
>> > other columns are set to NULL.
>> >
>> > On Wed, Oct 8, 2014 at 4:38 PM, shakun grover <s28sweet@gmail.com> wrote:
>> >
>> > > Hi Abe,
>> > >
>> > > I have attached the sample data with this mail.
>> > >
>> > > This is the job that I created to import this data to HDFS:
>> > >
>> > > *Job:*
>> > > Name: testEmp
>> > >
>> > > Database configuration
>> > >
>> > > Schema name:
>> > > Table name:
>> > > Table SQL statement: select * from test.emp WHERE ${CONDITIONS}
>> > > Table column names:
>> > > Partition column name: id
>> > > Nulls in partition column:
>> > > Boundary query:
>> > >
>> > > Output configuration
>> > >
>> > > Storage type:
>> > >   0 : HDFS
>> > > Choose:
>> > > Output format:
>> > >   0 : TEXT_FILE
>> > >   1 : SEQUENCE_FILE
>> > > Choose: 0
>> > > Output directory: /tmp/emp/1
>> > >
>> > > Throttling resources
>> > >
>> > > Extractors:
>> > > Loaders:
>> > > Job was successfully updated with status FINE
>> > >
>> > > When I view the data with the following command:
>> > > *hadoop fs -cat /tmp/emp/p**
>> > > it shows the data as follows (it inserts a line break after "110 Campus
>> > > Dr. Berkeley CA 94111"):
>> > >
>> > > 1,'James','A','Bond','557502533','(681) 675-8580','james@gmail.com','
>> > > www.google.com','110 Campus Dr. Berkeley CA 94111
>> > > ','San Jose','CA','94500','USA'
>> > > 2,'James','A','Bond','557502533','(681) 675-8580','james@gmail.com','
>> > > www.google.com','110 Campus Dr. Berkeley CA 94111
>> > > ','San Jose','CA','94500','USA'
>> > > 3,'James','A','Bond','557502533','(681) 675-8580','james@gmail.com','
>> > > www.google.com','110 Campus Dr. Berkeley CA 94111
>> > > ','San Jose','CA','94500','USA'
>> > > 4,'James','A','Bond','557502533','(681) 675-8580','james@gmail.com','
>> > > www.google.com','110 Campus Dr. Berkeley CA 94111
>> > > ','San Jose','CA','94500','USA'
>> > > 5,'James','A','Bond','557502533','(681) 675-8580','james@gmail.com','
>> > > www.google.com','110 Campus Dr. Berkeley CA 94111
>> > > ','San Jose','CA','94500','USA'
>> > >
>> > >
>> > >
>> > >
>> > > On Wed, Oct 8, 2014 at 1:03 AM, Abraham Elmahrek <abe@cloudera.com> wrote:
>> > >
>> > >> Could we take a peek at your data from its source as hex?
>> > >>
>> > >> -Abe
>> > >>
>> > >> On Tue, Oct 7, 2014 at 3:46 AM, shakun grover <s28sweet@gmail.com> wrote:
>> > >>
>> > >> > Yes, it's correct that Sqoop2 should insert a new line at the end of
>> > >> > a record.
>> > >> > But if a record has many columns (say, more than 15), then after a
>> > >> > few columns it inserts a new line.
>> > >> >
>> > >> > Example:
>> > >> > 1,'346088103340400','3410 9240 5550
>> > >> > 778','3710-1690-2390-472','537436268','537 43 6268
>> > >> >
>> > >> > ','537-43-6268
>> > >> >
>> > >> > ','6816758580
>> > >> >
>> > >> > ','681 675 8580
>> > >> >
>> > >> > ','681-675-8580
>> > >> >
>> > >> > ','(681) 675-8580
>> > >> >
>> > >> > ','(681)675-8580
>> > >> >
>> > >> > ','1617547959','12.215.42.19
>> > >> >
>> > >> > ','','1132286141
>> > >> >
>> > >> > ','https://blu162.mail.live.com
>> > >> >
>> > >> > ','110 Campus Dr. Berkeley CA 94111
>> > >> >
>> > >> > ','James
>> > >> >
>> > >> > '
>> > >> > This is one record which got imported to HDFS in the above-mentioned
>> > >> > format. After the 6th column it inserted a new line, and then it
>> > >> > inserted a new line after each column. This behavior is not the same
>> > >> > in all cases, though.
>> > >> > It inserts new lines seemingly at random after the nth column.
>> > >> >
>> > >> >
>> > >> > On Thu, Oct 2, 2014 at 1:12 AM, Abraham Elmahrek <abe@cloudera.com> wrote:
>> > >> >
>> > >> > > Hey there,
>> > >> > >
>> > >> > > Sqoop2 should insert new lines at the end of a record. In fact,
>> > >> > > Sqoop2 should just write CSV. Could you copy/paste an example with
>> > >> > > Schema?
>> > >> > > -Abe
>> > >> > >
>> > >> > > On Tue, Sep 30, 2014 at 11:32 PM, shakun grover <s28sweet@gmail.com> wrote:
>> > >> > >
>> > >> > > > Hi All,
>> > >> > > >
>> > >> > > > When I import many columns (say, more than 20) from an RDBMS to
>> > >> > > > HDFS, Sqoop2 inserts a new line in the output file. The newline
>> > >> > > > appears at the end of certain fields. It doesn't seem to appear
>> > >> > > > for every single field.
>> > >> > > >
>> > >> > > > Can you please tell me why this new line is inserted? And is
>> > >> > > > there any way to avoid this?
>> > >> > > >
>> > >> > > > Thanks in advance!!
>> > >> > > >
>> > >> > > >
>> > >> > > > --
>> > >> > > > Thanks & Regards,
>> > >> > > > Shakun Grover
>> > >> > > >
>> > >> > >
>> > >> >
>> > >> >
>> > >> >
>> > >> > --
>> > >> > Thanks & Regards,
>> > >> > Shakun Grover
>> > >> >
>> > >>
>> > >
>> > >
>> > >
>> > > --
>> > > Thanks & Regards,
>> > > Shakun Grover
>> > >
>> >
>> >
>> >
>> > --
>> > Thanks & Regards,
>> > Shakun Grover
>> >
>>
>
>
>
> --
> Thanks & Regards,
> Shakun Grover
>
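
For readers comparing the two views of the output quoted in this thread (one physical line per row, as Hive's text reading showed it, versus one record per quoted CSV row), here is a rough Python sketch. It assumes the address field really ends in a line feed inside the single quotes, which this thread never confirms, and it uses a shortened, made-up version of the sample rows. It uses Python's standard `csv` module, not any Sqoop2 or Hive API, and Sqoop2's actual escaping rules may differ.

```python
import csv
import io

# Shortened, hypothetical version of the output quoted in this thread:
# each record has a line feed inside the single-quoted address field.
raw = (
    "1,'James','110 Campus Dr. Berkeley CA 94111\n','San Jose'\n"
    "2,'James','110 Campus Dr. Berkeley CA 94111\n','San Jose'\n"
)

# Splitting on line breaks treats every physical line as a row.
naive_rows = raw.splitlines()

# A quote-aware CSV reader keeps the embedded line feed inside the field.
reader = csv.reader(io.StringIO(raw, newline=""), quotechar="'")
records = list(reader)

print("physical lines:", len(naive_rows))
print("quoted CSV records:", len(records))
```

If the source column does hold a trailing line feed, the extra line break is data carried through from the database rather than something added per column count, and trimming it in the table SQL statement would be one place to look.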
