hive-user mailing list archives

From Chalcy Raja <Chalcy.R...@careerbuilder.com>
Subject sqoop question - could not post - the message came back undelivered
Date Thu, 29 Mar 2012 13:46:36 GMT
I am trying to do a Sqoop export (data from HDFS/Hadoop to a database). The table I am
trying to export has 2 million rows and 20 fields. The export succeeds when I limit it to
between 10 and 95 rows, but anything more than 95 rows fails with the error below.

From googling, I gather this is a DBMS limitation. Is there any way to configure around
this error? I am surprised that it works for a small number of rows.
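
If I understand the math right, each exported row contributes one JDBC parameter per field, so
with 20 fields a multi-row INSERT would reach the 2100-parameter cap at roughly 100 rows, which
is about where my export starts failing. Would something like the following keep each statement
under the limit? This is only a sketch: the -D properties are Sqoop's export batching knobs as I
understand them and may differ by version, and the connection string, user, table, and paths are
placeholders.

  sqoop export \
    -D sqoop.export.records.per.statement=50 \
    -D sqoop.export.statements.per.transaction=1 \
    --connect "jdbc:sqlserver://dbserver:1433;databaseName=mydb" \
    --username myuser -P \
    --table my_export_table \
    --export-dir /user/chalcy/export_data

(50 rows x 20 fields = 1000 parameters per statement, which stays well under 2100.)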
 
Any help is appreciated.

Thanks,
Chalcy
 
12/03/29 09:00:59 INFO mapred.JobClient: Task Id : attempt_201203230811_0539_m_000000_0, Status
: FAILED
java.io.IOException: com.microsoft.sqlserver.jdbc.SQLServerException: The incoming tabular
data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Too many parameters
were provided in this RPC request. The maximum is 2100.
        at com.cloudera.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:189)
        at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:540)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
        at org.apache.hadoop.mapred.Child.main(Child.java:264)
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The incoming tabular data stream
(TDS) remote procedure call (RPC) protocol stream is incorrect.
12/03/29 09:01:05 INFO mapred.JobClient: Task Id : attempt_201203230811_0539_m_000000_1, Status
: FAILED
java.io.IOException: com.microsoft.sqlserver.jdbc.SQLServerException: The incoming tabular
data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Too many parameters
were provided in this RPC request. The maximum is 2100.
        at com.cloudera.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:189)

-----Original Message-----
From: Thiruvel Thirumoolan [mailto:thiruvel@yahoo-inc.com] 
Sent: Thursday, March 29, 2012 7:55 AM
To: user@hive.apache.org; hive-user@hadoop.apache.org
Subject: Re: Executing query and storing output on HDFS

This should help.

https://cwiki.apache.org/Hive/languagemanual-dml.html#LanguageManualDML-Writingdataintofilesystemfromqueries
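
In short, something along these lines should write the results of a query into a directory on
HDFS (the output path and table name below are just placeholders):

  INSERT OVERWRITE DIRECTORY '/user/paul/query_output'
  SELECT *
  FROM src_table
  WHERE dt = '2012-03-29';

The files land under that directory as plain text using Hive's default field delimiter, so a
downstream Cascading flow can pick them up from there.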


On 3/29/12 4:48 PM, "Paul Ingles" <paul@oobaloo.co.uk> wrote:

>Hi,
>
>I'd like to be able to execute a Hive query and have the output stored 
>in a path on HDFS (rather than immediately returned to the client). 
>Ultimately I'd like to do this to integrate some of our Hive statements 
>and Cascading flows.
>
>Does anyone know if this is possible? I could have sworn it was, but I 
>can't find any mention of an OUTPUT TO clause on the Hive wiki.
>
>Many thanks,
>Paul


