From: "Hussain Towaileb (Jira)"
To: notifications@asterixdb.incubator.apache.org
Reply-To: dev@asterixdb.apache.org
Date: Fri, 17 Apr 2020 11:01:00 +0000 (UTC)
Subject: [jira] [Updated] (ASTERIXDB-2717) Querying external dataset fails when further data read is needed before decoding

[ https://issues.apache.org/jira/browse/ASTERIXDB-2717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hussain Towaileb updated ASTERIXDB-2717:
----------------------------------------
    Description:
In some scenarios, some characters are represented by multiple bytes, and the bytes are partially read (rest of the bytes is in the next read). In this case, the decoder will fail to decode, and will wait for the next read to have all the bytes together then try to decode it.

In the above scenario, the ByteBuffer limit is not being reset before reading again, which could lead to the position being greater than the limit and throwing an exception.

For example:
org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.IllegalArgumentException: newPosition > limit: (4096 > 1048)
        at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:51) ~[hyracks-api.jar:6.6.0-11077]
        at org.apache.asterix.external.dataflow.RecordDataFlowController.start(RecordDataFlowController.java:59) ~[asterix-external-data.jar:6.6.0-11077]
        at org.apache.asterix.external.dataset.adapter.GenericAdapter.start(GenericAdapter.java:36) ~[asterix-external-data.jar:6.6.0-11077]
        at org.apache.asterix.external.operators.ExternalScanOperatorDescriptor$1.initialize(ExternalScanOperatorDescriptor.java:62) ~[asterix-external-data.jar:6.6.0-11077]
        at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$8(SuperActivityOperatorNodePushable.java:228) ~[hyracks-api.jar:6.6.0-11077]
        at java.util.concurrent.FutureTask.run(Unknown Source) [?:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
        at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: java.lang.IllegalArgumentException: newPosition > limit: (4096 > 1048)
        at java.nio.Buffer.createPositionException(Unknown Source) ~[?:?]
        at java.nio.Buffer.position(Unknown Source) ~[?:?]
        at java.nio.ByteBuffer.position(Unknown Source) ~[?:?]
        at java.nio.ByteBuffer.position(Unknown Source) ~[?:?]
        at org.apache.asterix.external.input.stream.AsterixInputStreamReader.read(AsterixInputStreamReader.java:103) ~[asterix-external-data.jar:6.6.0-11077]
        at org.apache.asterix.external.input.stream.AsterixInputStreamReader.read(AsterixInputStreamReader.java:67) ~[asterix-external-data.jar:6.6.0-11077]
        at org.apache.asterix.external.input.record.reader.stream.LineRecordReader.hasNext(LineRecordReader.java:101) ~[asterix-external-data.jar:6.6.0-11077]
        at org.apache.asterix.external.dataflow.RecordDataFlowController.start(RecordDataFlowController.java:49) ~[asterix-external-data.jar:6.6.0-11077]
        ... 7 more

  was:
In some scenarios, some characters are represented by multiple bytes, and the bytes are partially read (rest of the bytes is in the next read). In this case, the decoder will fail to decode, and will wait for the next read to have all the bytes together then try to decode it.

In the above scenario, the ByteBuffer limit is not being reset before reading again, which could lead to the position being greater than the limit and throwing an exception.

For example:
org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.IllegalArgumentException: newPosition > limit: (4096 > 1048)


> Querying external dataset fails when further data read is needed before decoding
> --------------------------------------------------------------------------------
>
>                 Key: ASTERIXDB-2717
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2717
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: EXT - External data
>    Affects Versions: 0.9.5
>            Reporter: Hussain Towaileb
>            Assignee: Hussain Towaileb
>            Priority: Major
>             Fix For: 0.9.5
>
>
> In some scenarios, some characters are represented by multiple bytes, and the bytes are partially read (rest of the bytes is in the next read). In this case, the decoder will fail to decode, and will wait for the next read to have all the bytes together then try to decode it.
>
> In the above scenario, the ByteBuffer limit is not being reset before reading again, which could lead to the position being greater than the limit and throwing an exception.
>
> For example:
> org.apache.hyracks.api.exceptions.HyracksDataException: java.lang.IllegalArgumentException: newPosition > limit: (4096 > 1048)
>         at org.apache.hyracks.api.exceptions.HyracksDataException.create(HyracksDataException.java:51) ~[hyracks-api.jar:6.6.0-11077]
>         at org.apache.asterix.external.dataflow.RecordDataFlowController.start(RecordDataFlowController.java:59) ~[asterix-external-data.jar:6.6.0-11077]
>         at org.apache.asterix.external.dataset.adapter.GenericAdapter.start(GenericAdapter.java:36) ~[asterix-external-data.jar:6.6.0-11077]
>         at org.apache.asterix.external.operators.ExternalScanOperatorDescriptor$1.initialize(ExternalScanOperatorDescriptor.java:62) ~[asterix-external-data.jar:6.6.0-11077]
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$runInParallel$8(SuperActivityOperatorNodePushable.java:228) ~[hyracks-api.jar:6.6.0-11077]
>         at java.util.concurrent.FutureTask.run(Unknown Source) [?:?]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
>         at java.lang.Thread.run(Unknown Source) [?:?]
> Caused by: java.lang.IllegalArgumentException: newPosition > limit: (4096 > 1048)
>         at java.nio.Buffer.createPositionException(Unknown Source) ~[?:?]
>         at java.nio.Buffer.position(Unknown Source) ~[?:?]
>         at java.nio.ByteBuffer.position(Unknown Source) ~[?:?]
>         at java.nio.ByteBuffer.position(Unknown Source) ~[?:?]
>         at org.apache.asterix.external.input.stream.AsterixInputStreamReader.read(AsterixInputStreamReader.java:103) ~[asterix-external-data.jar:6.6.0-11077]
>         at org.apache.asterix.external.input.stream.AsterixInputStreamReader.read(AsterixInputStreamReader.java:67) ~[asterix-external-data.jar:6.6.0-11077]
>         at org.apache.asterix.external.input.record.reader.stream.LineRecordReader.hasNext(LineRecordReader.java:101) ~[asterix-external-data.jar:6.6.0-11077]
>         at org.apache.asterix.external.dataflow.RecordDataFlowController.start(RecordDataFlowController.java:49) ~[asterix-external-data.jar:6.6.0-11077]
>         ... 7 more

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
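
The two conditions in the description (a multi-byte character split across reads, and a ByteBuffer whose limit is stale when the position is moved) can be reproduced outside AsterixDB with plain java.nio. The following is a minimal sketch under stated assumptions, not the AsterixDB code or its actual fix: the class name PartialDecodeDemo and both method names are hypothetical, and the sizes 4096 and 1048 are only borrowed from the exception message above.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; it does not mirror AsterixInputStreamReader.
public class PartialDecodeDemo {

    // Decodes U+00E9 (UTF-8 bytes 0xC3 0xA9) when the two bytes arrive in separate reads.
    static String decodeSplitCharacter() {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        ByteBuffer bytes = ByteBuffer.allocate(16);
        CharBuffer chars = CharBuffer.allocate(16);

        // First read: only the lead byte is available.
        bytes.put((byte) 0xC3);
        bytes.flip(); // position = 0, limit = 1
        // With endOfInput = false the decoder leaves the incomplete sequence unconsumed.
        decoder.decode(bytes, chars, false);

        // compact() keeps the unread byte, places the position after it,
        // and resets the limit to the capacity so the next read has room.
        bytes.compact();

        // Second read: the continuation byte arrives.
        bytes.put((byte) 0xA9);
        bytes.flip();
        decoder.decode(bytes, chars, false);

        chars.flip();
        return chars.toString();
    }

    // Returns true if moving the position past a stale limit throws.
    static boolean positionBeyondStaleLimitFails(boolean resetLimit) {
        ByteBuffer buf = ByteBuffer.allocate(4096);
        buf.limit(1048); // limit left over from an earlier, shorter read
        if (resetLimit) {
            buf.limit(buf.capacity()); // reset the limit before positioning again
        }
        try {
            buf.position(4096);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("decoded length: " + decodeSplitCharacter().length());
        System.out.println("stale limit fails: " + positionBeyondStaleLimitFails(false));
        System.out.println("reset limit fails: " + positionBeyondStaleLimitFails(true));
    }
}
```

Using compact() is one common way to carry leftover bytes into the next read, because it restores the limit as a side effect. Code that copies the leftover bytes manually has to restore the limit itself before calling position().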