drill-issues mailing list archives

From Lucas Alvarez Argüero (JIRA) <j...@apache.org>
Subject [jira] [Created] (DRILL-5411) Getting 0 rows when there are more than 100000 in the mongoDB collection
Date Tue, 04 Apr 2017 09:42:42 GMT
Lucas Alvarez Argüero created DRILL-5411:
--------------------------------------------

             Summary: Getting 0 rows when there are more than 100000 in the mongoDB collection
                 Key: DRILL-5411
                 URL: https://issues.apache.org/jira/browse/DRILL-5411
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - MongoDB
    Affects Versions: 1.10.0
         Environment: VM1("ubuntu/trusty64"): mongo1
•	mongoS (mongo server)
•	MongoD shard1 (primary, secondary, secondary)
•	Mongo config server
•	Drillbit
VM2("ubuntu/trusty64"): mongo2
•	MongoD shard2 (primary, secondary, secondary)
•	Mongo config server
•	Drillbit
VM3("ubuntu/trusty64"): mongo3
•	MongoD shard3 (primary, secondary, secondary)
•	Mongo config server
•	Drillbit
VM4("ubuntu/trusty64"): zk1
•	Zookeeper in quorum
VM5("ubuntu/trusty64"): zk2
•	Zookeeper in quorum
VM6("ubuntu/trusty64"): zk3
•	Zookeeper in quorum

            Reporter: Lucas Alvarez Argüero


Getting 0 rows when there are more than 100000 in the mongoDB collection


Drill works perfectly with MongoDB as storage when the (partitioned) collection holds fewer
than approximately 100000 documents. When the collection holds more documents, Drill returns
zero rows. It can still count all documents, but a count with a WHERE clause returns 0.
Fewer than 100000:
select v.measInfo_id,v.endTime from mongo.mandarinaTime3.MeasValue v limit 3;
+--------------+-------------+
| measInfo_id  |   endTime   |
+--------------+-------------+
| [B@1a7d4b45  | 2016-09-19  |
| [B@17d8ac99  | 2016-09-19  |
| [B@122b7d0a  | 2016-09-19  |
+--------------+-------------+
3 rows selected (0.313 seconds)

More than 100000:
0: jdbc:drill:> select v.measInfo_id,v.endTime from mongo.mandarinaTime3.MeasValue v limit 3;
+--------------+----------+
| measInfo_id  | endTime  |
+--------------+----------+
+--------------+----------+
No rows selected (0.341 seconds)

0: jdbc:drill:> select count() from mongo.mandarinaTime3.MeasValue v;
+---------+
| EXPR$0  |
+---------+
| 502068  |
+---------+
1 row selected (0.426 seconds)

0: jdbc:drill:> select count() from mongo.mandarinaTime3.MeasValue v    Where endtime='2016-09-19';
+---------+
| EXPR$0  |
+---------+
| 0       |
+---------+
1 row selected (0.98 seconds)
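
The three queries above could also be replayed from a script, which makes it easier to compare a small and a large collection side by side. The following is only a minimal sketch, not part of the original report. It assumes that a Drillbit's REST endpoint is reachable at `http://mongo1:8047/query.json` (8047 is Drill's default web port, and the host name is taken from the environment listed above), and it writes `count(*)` where the transcript shows `count()`. The names `DRILL_URL`, `QUERIES`, `build_request` and `run` are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request

# Assumption: a Drillbit web/REST port is reachable here (8047 is Drill's default).
DRILL_URL = "http://mongo1:8047/query.json"

# Table name as used in the report.
TABLE = "mongo.mandarinaTime3.MeasValue"

# The three statements from the report, without the trailing semicolon.
# Assumption: count(*) is used in place of the count() shown in the transcript.
QUERIES = [
    f"select v.measInfo_id, v.endTime from {TABLE} v limit 3",
    f"select count(*) from {TABLE} v",
    f"select count(*) from {TABLE} v where endtime = '2016-09-19'",
]


def build_request(sql, url=DRILL_URL):
    """Build (but do not send) a POST request for Drill's REST query endpoint."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run(sql, url=DRILL_URL):
    """Send one query and return the rows from the JSON response."""
    with urllib.request.urlopen(build_request(sql, url)) as resp:
        return json.load(resp).get("rows", [])
```

Calling `run(q)` for each entry of `QUERIES` against both a collection below and one above the threshold should show the same pattern as the sqlline output: rows and a non-zero filtered count in the first case, no rows and a filtered count of 0 in the second.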



If the collection isn’t partitioned, Drill also works perfectly.

Drill mongo storage plugin:
{
  "type": "mongo",
  "connection": "mongodb://mongo1:27017/",
  "enabled": true
}
mongo sharded collection:

 {  "_id" : "mandarinaTime3",  "primary" : "b",  "partitioned" : true }
                mandarinaTime3.MeasCollecFile
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                b       1
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : { "$maxKey" : 1 } } on : b Timestamp(1, 0)
                mandarinaTime3.MeasInfo
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                a       1
                                b       1
                                c       1
                        { "_id" : { "$minKey" : 1 } } -->> { "_id" : ObjectId("58e364dddc7a033f5c08c7c6") } on : a Timestamp(2, 0)
                        { "_id" : ObjectId("58e364dddc7a033f5c08c7c6") } -->> { "_id" : ObjectId("58e364e0dc7a033f5c08c8b0") } on : c Timestamp(3, 0)
                        { "_id" : ObjectId("58e364e0dc7a033f5c08c8b0") } -->> { "_id" : { "$maxKey" : 1 } } on : b Timestamp(3, 1)
                mandarinaTime3.MeasValue
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                a       7
                                b       7
                                c       7
                        too many chunks to print, use verbose if you want to force print





--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
