drill-issues mailing list archives

From "Edmon Begoli (JIRA)" <j...@apache.org>
Subject [jira] [Created] (DRILL-3747) UDF for "fuzzy" string and similarity matching
Date Tue, 08 Sep 2015 14:55:46 GMT
Edmon Begoli created DRILL-3747:

             Summary: UDF for "fuzzy" string and similarity matching
                 Key: DRILL-3747
                 URL: https://issues.apache.org/jira/browse/DRILL-3747
             Project: Apache Drill
          Issue Type: New Feature
          Components: Functions - Drill
    Affects Versions: Future
            Reporter: Edmon Begoli
            Assignee: Mehant Baid
            Priority: Minor

I propose implementing string distance and similarity matching functions similar to those
found in most other databases: soundex, metaphone, levenshtein (and more advanced
variants such as Damerau-Levenshtein, Jaro-Winkler, etc.).

For inspiration, see fuzzystrmatch (http://www.postgresql.org/docs/9.5/static/fuzzystrmatch.html)
and pg_similarity (http://pgsimilarity.projects.pgfoundry.org/).
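
To make the request concrete, here is a minimal sketch of two of the functions
named above, Levenshtein distance and American Soundex. It is written in plain
Python only to illustrate the expected semantics; it is not Drill's UDF API, and
the function names and signatures are assumptions chosen for illustration.

```python
# Illustrative sketch only: hypothetical helper names, not Drill functions.

def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Soundex digit for each coded consonant; vowels, h, w and y have no code.
_SOUNDEX_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _SOUNDEX_CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Four-character American Soundex code, or "" if word has no letters."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    last = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate consonants with the same code
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != last:
            result += code
        last = code  # a vowel resets this, so a repeated code is kept
    return (result + "000")[:4]  # pad or truncate to letter + three digits
```

With functions like these, names such as "Robert" and "Rupert" map to the same
Soundex code, and a small Levenshtein distance flags near-duplicate strings.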

This message was sent by Atlassian JIRA
