asterixdb-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Yingyi Bu (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (ASTERIXDB-1340) Index does not have a valid resource ID
Date Sat, 12 Mar 2016 19:05:03 GMT

     [ https://issues.apache.org/jira/browse/ASTERIXDB-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Yingyi Bu resolved ASTERIXDB-1340.
----------------------------------
    Resolution: Fixed

> Index does not have a valid resource ID
> ---------------------------------------
>
>                 Key: ASTERIXDB-1340
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1340
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: AsterixDB, Storage
>            Reporter: Yingyi Bu
>            Assignee: Murtadha Hubail
>            Priority: Critical
>         Attachments: asterix-configuration.xml, lineitem.tbl, local3.xml, orders.tbl
>
>
> I created a 3 NC cluster on a single machine, using the attached cluster configuration
(local3.xml) and instance configuration (asterix-configuration.xml).  The CSV files for the
datasets are attached. Then I ran the following query.
> DDL:
> {noformat}
> drop dataverse tpch if exists;
> create dataverse tpch;
> use dataverse tpch;
> create type LineItemType as closed {
>   l_orderkey: int64,
>   l_partkey: int64,
>   l_suppkey: int64,
>   l_linenumber: int64,
>   l_quantity: int64,
>   l_extendedprice: double,
>   l_discount: double,
>   l_tax: double,
>   l_returnflag: string,
>   l_linestatus: string,
>   l_shipdate: string,
>   l_commitdate: string,
>   l_receiptdate: string,
>   l_shipinstruct: string,
>   l_shipmode: string,
>   l_comment: string
> }
> create type OrderType as closed {
>   o_orderkey: int64,
>   o_custkey: int64,
>   o_orderstatus: string,
>   o_totalprice: double,
>   o_orderdate: string,
>   o_orderpriority: string,
>   o_clerk: string,
>   o_shippriority: int64,
>   o_comment: string
> }
> create dataset LineItem(LineItemType)
>   primary key l_orderkey, l_linenumber;
> create dataset Orders(OrderType)
>   primary key o_orderkey;
> {noformat}
> DML:
> {noformat}
> use dataverse tpch;
> load dataset LineItem 
> using "org.apache.asterix.external.dataset.adapter.NCFileSystemAdapter"
> (("path"="asterix_nc1:///data/lineitem.tbl"),("format"="delimited-text"),("delimiter"="|"));
> load dataset Orders 
> using "org.apache.asterix.external.dataset.adapter.NCFileSystemAdapter"
> (("path"="asterix_nc1:///data/orders.tbl"),("format"="delimited-text"),("delimiter"="|"));
> {noformat}
> Query:
> {noformat}
> use dataverse tpch;
> declare function tmp()
> {
>   for $l in dataset('LineItem')
>   where $l.l_commitdate < $l.l_receiptdate
>   distinct by $l.l_orderkey
>   return { "o_orderkey": $l.l_orderkey }
> }
> for $o in dataset('Orders')
> for $t in tmp()
> where $o.o_orderkey = $t.o_orderkey and 
>   $o.o_orderdate >= '1993-07-01' and $o.o_orderdate < '1993-10-01' 
> group by $o_orderpriority := $o.o_orderpriority with $o
> order by $o_orderpriority
> return {
>   "order_priority": $o_orderpriority,
>   "count": count($o)
> }
> {noformat}
> The query fails with the following exception:
> {noformat}
> org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException:
org.apache.hyracks.api.exceptions.HyracksDataException: Index does not have a valid resource
ID. Has it been created yet?
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:218)
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
>         at org.apache.hyracks.control.nc.Task.run(Task.java:261)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException:
Index does not have a valid resource ID. Has it been created yet?
>         at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:212)
>         ... 5 more
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Index does not have
a valid resource ID. Has it been created yet?
>         at org.apache.hyracks.storage.am.common.dataflow.IndexDataflowHelper.open(IndexDataflowHelper.java:108)
>         at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.open(IndexSearchOperatorNodePushable.java:111)
>         at org.apache.hyracks.algebricks.runtime.operators.std.EmptyTupleSourceRuntimeFactory$1.open(EmptyTupleSourceRuntimeFactory.java:51)
>         at org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$1.initialize(AlgebricksMetaOperatorDescriptor.java:109)
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$$Lambda$4/1452854179.runAction(Unknown
Source)
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:205)
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:202)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         ... 3 more
> {noformat}
> It seems the issue is related to the "distinct by". I have tried the following query
also and it works:
> {noformat}
> use dataverse tpch;
> declare function tmp()
> {
>   for $l in dataset('LineItem')
>   where $l.l_commitdate < $l.l_receiptdate
>   group by $l_orderkey := $l.l_orderkey with $l
>   return { "o_orderkey": $l_orderkey }
> }
> for $o in dataset('Orders')
> for $t in tmp()
> where $o.o_orderkey = $t.o_orderkey and 
>   $o.o_orderdate >= '1993-07-01' and $o.o_orderdate < '1993-10-01' 
> group by $o_orderpriority := $o.o_orderpriority with $o
> order by $o_orderpriority
> return {
>   "order_priority": $o_orderpriority,
>   "count": count($o)
> }
> {noformat}
> But I have no clue why "distinct by" is related to the resource ID.
> Also, the original query works when I only have two NCs.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message