hive-user mailing list archives

From Jakob Homan <jghoman....@gmail.com>
Subject Re: Table schema size limit to 4000 chars ?
Date Tue, 22 Jan 2013 22:20:14 GMT
There shouldn't be any problems with comments in Avro schemas.  You
just need to make sure they're escaped properly.  We did run into a
problem with schema.literal values longer than 4000 characters (the
size of the backing MySQL varchar field), so internally we just bump
this value for our Hive installs:

ALTER TABLE SERDE_PARAMS MODIFY PARAM_VALUE varchar(20000);
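
As a rough illustration of the size limit discussed above, the following Python sketch checks whether a schema literal would fit in the column once insignificant JSON whitespace is removed. It is only a sketch under assumptions: the 4000-character width is the value mentioned in this thread, and the names `compact_schema`, `fits_in_column` and `DEFAULT_PARAM_VALUE_WIDTH` are hypothetical helpers, not part of Hive or Avro. It also assumes the schema is plain JSON without // comments, since Python's `json` module rejects them.

```python
import json

# Assumption: 4000 is the varchar width mentioned in this thread.
DEFAULT_PARAM_VALUE_WIDTH = 4000


def compact_schema(schema_text: str) -> str:
    """Re-serialize a JSON schema without insignificant whitespace."""
    return json.dumps(json.loads(schema_text), separators=(",", ":"))


def fits_in_column(schema_text: str, width: int = DEFAULT_PARAM_VALUE_WIDTH) -> bool:
    """Return True if the compacted schema is no longer than the column width."""
    return len(compact_schema(schema_text)) <= width


if __name__ == "__main__":
    # Hypothetical record schema with many fields, pretty-printed on purpose.
    example = json.dumps(
        {
            "type": "record",
            "name": "Example",
            "fields": [{"name": f"f{i}", "type": "string"} for i in range(200)],
        },
        indent=2,
    )
    print("pretty-printed length:", len(example))
    print("compacted length:", len(compact_schema(example)))
    print("fits in default column:", fits_in_column(example))
```

If the compacted schema still does not fit, widening the column as in the ALTER TABLE statement above remains the option described in this message.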


On 17 December 2012 05:58, Alexandre Fouche
<alexandre.fouche@cleverscale.com> wrote:
> Ah, it seems the JSON parser issue was due to my Avro schema containing
> // comments. I have seen some remarks on the web saying that this parser
> can be configured to accept comments.
>
> Is there a Hive property that can be passed to the JSON parser to allow
> comments in Avro schemas?
>
> --
> Alexandre Fouche
>
> On Monday 17 December 2012 at 14:24, Alexandre Fouche wrote:
>
> Hi,
>
> I have an Avro table with a schema that is around 8000 chars, and I cannot
> query it:
>
> First, I had an issue when creating the table: Hive threw an exception
> because the field in MySQL (varchar(4000)) was too small. So I altered the
> column to varchar(10000), which fixed that part.
>
> But when querying the table, Hive throws an exception saying that the
> JsonParser cannot find the end of the Avro schema array. It is basically the
> same issue as above: the Avro schema string is too long to be parsed by the
> third-party JSON parser org.codehaus.jackson.JsonParser in Hive/Avro. Here I
> do not really know whether this parser cannot parse arbitrary-length JSON
> strings or whether it has a hardcoded allocated string size.
>
> Note that I am using Cloudera Hive 0.9, which has the Avro SerDe bundled.
>
> Here is the thrown exception. org.codehaus.jackson.JsonParser is mentioned
> at the end.
>
> (…)
> 12/12/17 10:49:55 WARN avro.AvroSerdeUtils: Encountered exception determining schema. Returning signal schema to indicate problem
> org.apache.avro.SchemaParseException: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for ARRAY (from [Source: java.io.StringReader@a750bb9; line: 1, column: 37])
>  at [Source: java.io.StringReader@a750bb9; line: 1, column: 13980]
> at org.apache.avro.Schema$Parser.parse(Schema.java:983)
> at org.apache.avro.Schema$Parser.parse(Schema.java:971)
> at org.apache.avro.Schema.parse(Schema.java:1020)
> at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:61)
> at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:87)
> at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:59)
> at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:203)
> at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:260)
> at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253)
> at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:490)
> at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:162)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:930)
> at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:831)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:959)
> at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7532)
> at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:246)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:432)
> at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:906)
> at org.apache.hive.service.cli.operation.SQLOperation.run(SQLOperation.java:94)
> at org.apache.hive.service.cli.session.Session.executeStatement(Session.java:141)
> at org.apache.hive.service.cli.CLIService.executeStatement(CLIService.java:120)
> at org.apache.hive.service.cli.thrift.ThriftCLIService.ExecuteStatement(ThriftCLIService.java:169)
> at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1107)
> at org.apache.hive.service.cli.thrift.TCLIService$Processor$ExecuteStatement.getResult(TCLIService.java:1096)
> at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
> at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
> at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: org.codehaus.jackson.JsonParseException: Unexpected end-of-input: expected close marker for ARRAY (from [Source: java.io.StringReader@a750bb9; line: 1, column: 37])
>  at [Source: java.io.StringReader@a750bb9; line: 1, column: 13980]
> at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
> at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:385)
> at org.codehaus.jackson.impl.JsonParserMinimalBase._reportInvalidEOF(JsonParserMinimalBase.java:318)
> at org.codehaus.jackson.impl.JsonParserBase._handleEOF(JsonParserBase.java:354)
> at org.codehaus.jackson.impl.ReaderBasedParser._skipWSOrEnd(ReaderBasedParser.java:955)
> at org.codehaus.jackson.impl.ReaderBasedParser.nextToken(ReaderBasedParser.java:247)
> at org.codehaus.jackson.map.deser.BaseNodeDeserializer.deserializeArray(JsonNodeDeserializer.java:200)
> at org.codehaus.jackson.map.deser.BaseNodeDeserializer.deserializeAny(JsonNodeDeserializer.java:216)
> at org.codehaus.jackson.map.deser.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:187)
> at org.codehaus.jackson.map.deser.BaseNodeDeserializer.deserializeAny(JsonNodeDeserializer.java:213)
> at org.codehaus.jackson.map.deser.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:56)
> at org.codehaus.jackson.map.deser.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:13)
> at org.codehaus.jackson.map.ObjectMapper._readValue(ObjectMapper.java:2383)
> at org.codehaus.jackson.map.ObjectMapper.readTree(ObjectMapper.java:1234)
> at org.codehaus.jackson.map.ObjectMapper.readTree(ObjectMapper.java:1209)
> at org.apache.avro.Schema$Parser.parse(Schema.java:981)
> ... 30 more
>
>
> --
> Alexandre Fouche
>
>
