drill-issues mailing list archives

From "Rahul Challapalli (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-1810) Scalability issues with kvgen
Date Thu, 04 Dec 2014 23:59:12 GMT
Rahul Challapalli created DRILL-1810:

             Summary: Scalability issues with kvgen
                 Key: DRILL-1810
                 URL: https://issues.apache.org/jira/browse/DRILL-1810
             Project: Apache Drill
          Issue Type: Bug
          Components: Functions - Drill, Storage - JSON
            Reporter: Rahul Challapalli
            Assignee: Mehant Baid


Memory Settings 

Scalar Dataset :

The query below works fine for the above data set. However, if I copy the same record
100000 times and execute the same query, kvgen fails with memory-related issues:
select kvgen(col1) from `json_kvgenflatten/kvgen-scalar-large.json`;

Complex Dataset :
  "data" : {
    "col1" : {
      "one" : [1,2,3,4],
      "two" : [{"a":"b"},{"c":"d"}]

Even in this case, the query below works fine for the above data set. However, when we copy
the same record 100000 times, we hit memory issues:
select kvgen(data) from `json_kvgenflatten/kvgen-complex-large.json`;
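
The reproduction step described above (copying one record 100000 times) could be scripted roughly as follows. This is a minimal sketch, not the reporter's actual tooling: the helper name `replicate_record` is hypothetical, the record is only the fragment shown in this report with its closing braces assumed and any elided fields left out, and writing one JSON object per line is an assumption about how the large file was laid out.

```python
import json


def replicate_record(record, copies, path):
    """Write `copies` copies of `record` to `path`, one JSON object per line.

    Hypothetical helper for building a large test file; returns the number
    of records written.
    """
    line = json.dumps(record)
    with open(path, "w", encoding="utf-8") as out:
        for _ in range(copies):
            out.write(line + "\n")
    return copies


# Assumed reconstruction of the complex record fragment from the report.
# Only the keys visible in the report are included; the real record may
# contain more fields.
complex_record = {
    "data": {
        "col1": {
            "one": [1, 2, 3, 4],
            "two": [{"a": "b"}, {"c": "d"}],
        }
    }
}
```

Calling `replicate_record(complex_record, 100000, "kvgen-complex-large.json")` would then produce a file with the name used in the query; the directory it belongs in depends on the local workspace setup.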

I have attached the log files for both scenarios. Let me know if you need anything else.
