asterixdb-notifications mailing list archives

From "Yingyi Bu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ASTERIXDB-1340) Index does not have a valid resource ID
Date Fri, 11 Mar 2016 02:15:41 GMT

    [ https://issues.apache.org/jira/browse/ASTERIXDB-1340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190363#comment-15190363 ]

Yingyi Bu commented on ASTERIXDB-1340:
--------------------------------------

I ran more trials. The issue seems to be related to the instance name, regardless of the
number of nodes.
1. The first AsterixDB instance I created on the machine is called "test".
2. After that, if the instance name is not "test", the query fails, but other queries still
work.
3. Running "managix shutdown" between the creation of instances doesn't help.

It seems the resource ID depends on something that ZooKeeper keeps track of.
Or there might be something that managix doesn't clean up completely.

> Index does not have a valid resource ID
> ---------------------------------------
>
>                 Key: ASTERIXDB-1340
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-1340
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: AsterixDB, Storage
>            Reporter: Yingyi Bu
>            Assignee: Murtadha Hubail
>            Priority: Critical
>         Attachments: asterix-configuration.xml, lineitem.tbl, local3.xml, orders.tbl
>
>
> I created a 3 NC cluster on a single machine, using the attached cluster configuration
> (local3.xml) and instance configuration (asterix-configuration.xml). The CSV files for the
> datasets are attached. Then I ran the following query.
> DDL:
> {noformat}
> drop dataverse tpch if exists;
> create dataverse tpch;
> use dataverse tpch;
> create type LineItemType as closed {
>   l_orderkey: int64,
>   l_partkey: int64,
>   l_suppkey: int64,
>   l_linenumber: int64,
>   l_quantity: int64,
>   l_extendedprice: double,
>   l_discount: double,
>   l_tax: double,
>   l_returnflag: string,
>   l_linestatus: string,
>   l_shipdate: string,
>   l_commitdate: string,
>   l_receiptdate: string,
>   l_shipinstruct: string,
>   l_shipmode: string,
>   l_comment: string
> }
> create type OrderType as closed {
>   o_orderkey: int64,
>   o_custkey: int64,
>   o_orderstatus: string,
>   o_totalprice: double,
>   o_orderdate: string,
>   o_orderpriority: string,
>   o_clerk: string,
>   o_shippriority: int64,
>   o_comment: string
> }
> create dataset LineItem(LineItemType)
>   primary key l_orderkey, l_linenumber;
> create dataset Orders(OrderType)
>   primary key o_orderkey;
> {noformat}
> DML:
> {noformat}
> use dataverse tpch;
> load dataset LineItem 
> using "org.apache.asterix.external.dataset.adapter.NCFileSystemAdapter"
> (("path"="asterix_nc1:///data/lineitem.tbl"),("format"="delimited-text"),("delimiter"="|"));
> load dataset Orders 
> using "org.apache.asterix.external.dataset.adapter.NCFileSystemAdapter"
> (("path"="asterix_nc1:///data/orders.tbl"),("format"="delimited-text"),("delimiter"="|"));
> {noformat}
> Query:
> {noformat}
> use dataverse tpch;
> declare function tmp()
> {
>   for $l in dataset('LineItem')
>   where $l.l_commitdate < $l.l_receiptdate
>   distinct by $l.l_orderkey
>   return { "o_orderkey": $l.l_orderkey }
> }
> for $o in dataset('Orders')
> for $t in tmp()
> where $o.o_orderkey = $t.o_orderkey and 
>   $o.o_orderdate >= '1993-07-01' and $o.o_orderdate < '1993-10-01' 
> group by $o_orderpriority := $o.o_orderpriority with $o
> order by $o_orderpriority
> return {
>   "order_priority": $o_orderpriority,
>   "count": count($o)
> }
> {noformat}
> The query fails with the following exception:
> {noformat}
> org.apache.hyracks.api.exceptions.HyracksDataException: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: Index does not have a valid resource ID. Has it been created yet?
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:218)
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.initialize(SuperActivityOperatorNodePushable.java:83)
>         at org.apache.hyracks.control.nc.Task.run(Task.java:261)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.ExecutionException: org.apache.hyracks.api.exceptions.HyracksDataException: Index does not have a valid resource ID. Has it been created yet?
>         at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.runInParallel(SuperActivityOperatorNodePushable.java:212)
>         ... 5 more
> Caused by: org.apache.hyracks.api.exceptions.HyracksDataException: Index does not have a valid resource ID. Has it been created yet?
>         at org.apache.hyracks.storage.am.common.dataflow.IndexDataflowHelper.open(IndexDataflowHelper.java:108)
>         at org.apache.hyracks.storage.am.common.dataflow.IndexSearchOperatorNodePushable.open(IndexSearchOperatorNodePushable.java:111)
>         at org.apache.hyracks.algebricks.runtime.operators.std.EmptyTupleSourceRuntimeFactory$1.open(EmptyTupleSourceRuntimeFactory.java:51)
>         at org.apache.hyracks.algebricks.runtime.operators.meta.AlgebricksMetaOperatorDescriptor$1.initialize(AlgebricksMetaOperatorDescriptor.java:109)
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable.lambda$initialize$0(SuperActivityOperatorNodePushable.java:83)
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$$Lambda$4/1452854179.runAction(Unknown Source)
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:205)
>         at org.apache.hyracks.api.rewriter.runtime.SuperActivityOperatorNodePushable$1.call(SuperActivityOperatorNodePushable.java:202)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         ... 3 more
> {noformat}
> It seems the issue is related to the "distinct by". I have also tried the following query,
> and it works:
> {noformat}
> use dataverse tpch;
> declare function tmp()
> {
>   for $l in dataset('LineItem')
>   where $l.l_commitdate < $l.l_receiptdate
>   group by $l_orderkey := $l.l_orderkey with $l
>   return { "o_orderkey": $l_orderkey }
> }
> for $o in dataset('Orders')
> for $t in tmp()
> where $o.o_orderkey = $t.o_orderkey and 
>   $o.o_orderdate >= '1993-07-01' and $o.o_orderdate < '1993-10-01' 
> group by $o_orderpriority := $o.o_orderpriority with $o
> order by $o_orderpriority
> return {
>   "order_priority": $o_orderpriority,
>   "count": count($o)
> }
> {noformat}
> But I have no clue why "distinct by" is related to the resource ID.
> Also, the original query works when I only have two NCs.
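
For readers comparing the two versions of tmp() in the description: both are meant to produce one record per distinct l_orderkey among the late line items, so the "group by" rewrite is a semantically equivalent workaround rather than a different query. The Python sketch below only illustrates that equivalence on a few made-up rows. The data and the helper names (late, keys_via_distinct, keys_via_group_by) are assumptions for illustration; they are not taken from lineitem.tbl and say nothing about how AsterixDB compiles or executes either query.

```python
# Toy rows standing in for LineItem records (hypothetical data, not from lineitem.tbl).
line_items = [
    {"l_orderkey": 1, "l_commitdate": "1993-07-01", "l_receiptdate": "1993-07-05"},
    {"l_orderkey": 1, "l_commitdate": "1993-07-02", "l_receiptdate": "1993-07-09"},
    {"l_orderkey": 2, "l_commitdate": "1993-08-01", "l_receiptdate": "1993-07-20"},
    {"l_orderkey": 3, "l_commitdate": "1993-09-01", "l_receiptdate": "1993-09-02"},
]


def late(rows):
    """Mimic: where $l.l_commitdate < $l.l_receiptdate (ISO date strings compare lexically)."""
    return [r for r in rows if r["l_commitdate"] < r["l_receiptdate"]]


def keys_via_distinct(rows):
    """Mimic: distinct by $l.l_orderkey, returning one record per key."""
    seen = set()
    out = []
    for r in late(rows):
        key = r["l_orderkey"]
        if key not in seen:
            seen.add(key)
            out.append({"o_orderkey": key})
    return out


def keys_via_group_by(rows):
    """Mimic: group by $l_orderkey := $l.l_orderkey with $l, returning only the key."""
    groups = {}
    for r in late(rows):
        groups.setdefault(r["l_orderkey"], []).append(r)
    return [{"o_orderkey": key} for key in groups]


if __name__ == "__main__":
    print(keys_via_distinct(line_items))
    print(keys_via_group_by(line_items))
```

Since both forms yield the same set of keys, the failure reported above would not be explained by query semantics. That is consistent with the later observation that it tracks the instance name.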



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
