spark-reviews mailing list archives

From OopsOutOfMemory <...@git.apache.org>
Subject [GitHub] spark pull request: [SPARK-5009][SQL][Bug FIx] allCaseVersions lea...
Date Tue, 06 Jan 2015 09:49:35 GMT
GitHub user OopsOutOfMemory opened a pull request:

    https://github.com/apache/spark/pull/3909

    [SPARK-5009][SQL][Bug FIx] allCaseVersions leads to stackoverflow.

    Currently, we use the `allCaseVersions` function to match every possible case version of a
`Keyword` that the user passes in a SQL query, so a query like `SelecT * From SRc` is also valid syntax.
    
    A stack overflow (SO) exception occurs when a `Keyword` is too long, because `allCaseVersions`
generates 2^`Keyword.length` case versions. For example, `Keyword("SERDEPROPERTIES")` generates
2^15 = 32768 possible case versions. This makes the __implicit function__ `asParser` throw the
SO exception.
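
To make the growth concrete, here is a minimal sketch in Scala. It is not the Spark source: the object name `CaseVersions` and this fold-based body are assumptions for illustration, and only the function name `allCaseVersions` comes from the description above. It enumerates every upper/lower-case spelling of a string, so the result doubles in size with each additional letter.

```scala
object CaseVersions {
  // Hypothetical stand-in for `allCaseVersions` (not the Spark implementation):
  // build every upper/lower-case spelling of `s`, one character at a time.
  def allCaseVersions(s: String): Vector[String] =
    s.foldLeft(Vector("")) { (prefixes, c) =>
      // Each existing prefix forks into two: one with the lower-case
      // character appended and one with the upper-case character appended.
      prefixes.flatMap(p => Vector(p + c.toLower, p + c.toUpper))
    }

  def main(args: Array[String]): Unit = {
    println(allCaseVersions("as").mkString(", "))
    println(allCaseVersions("SERDEPROPERTIES").size)
  }
}
```

For a 15-letter keyword such as `SERDEPROPERTIES` this yields 2^15 = 32768 strings, and a parser built by combining one alternative per string grows accordingly.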
    
    I think it is unnecessary to generate every case version: it causes an SO when the keyword
is too long, and it also spends extra computation generating all case versions of a given keyword.
    
    So I'd like to replace the `allCaseVersions`-based matching of `Keyword` with a simpler approach,
 which also prevents the SO exception and speeds up parsing.
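
One possible shape for such a simpler approach, sketched in Scala under stated assumptions: the object `KeywordMatch`, the `reserved` set, and the `isKeyword` helper are all hypothetical names, not the code in this pull request. The idea shown is to normalize the incoming token to lower case once and compare it against keywords stored in lower case, rather than enumerating every spelling of each keyword.

```scala
import java.util.Locale

object KeywordMatch {
  // Hypothetical reserved-word table, stored once in lower case.
  val reserved: Set[String] = Set("select", "from", "serdeproperties")

  // Normalize the token instead of generating 2^n spellings of each keyword.
  // Locale.ROOT avoids locale-specific case mappings.
  def isKeyword(token: String): Boolean =
    reserved.contains(token.toLowerCase(Locale.ROOT))

  def main(args: Array[String]): Unit = {
    Seq("SelecT", "From", "SRc", "SerdeProperties").foreach { t =>
      println(s"$t -> ${isKeyword(t)}")
    }
  }
}
```

With this kind of check the cost per token is one lower-casing and one set lookup, independent of how many case versions the keyword has.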
    
    The issue is described here: https://issues.apache.org/jira/browse/SPARK-5009


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/OopsOutOfMemory/spark allCaseVersions_SO

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3909.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3909
    
----
commit 2357d029cd6c5eb3a956955c3e48e86faa81f6ba
Author: OopsOutOfMemory <victorshengli@126.com>
Date:   2015-01-06T08:22:35Z

    initial lowercase version

commit 42be742a1bf8c3b350f8c42980265a705b2045a0
Author: OopsOutOfMemory <victorshengli@126.com>
Date:   2015-01-06T09:11:34Z

    refine code

commit b6f916d60a8c4e73f600bb75ea69f7f3a7b63c4e
Author: OopsOutOfMemory <victorshengli@126.com>
Date:   2015-01-06T09:15:19Z

    refine code

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

