atlas-dev mailing list archives

From "Apoorv Naik (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (ATLAS-2117) Basic search issues due to Titan Solr schema
Date Wed, 06 Sep 2017 06:10:00 GMT

     [ https://issues.apache.org/jira/browse/ATLAS-2117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apoorv Naik updated ATLAS-2117:
-------------------------------
    Summary: Basic search issues due to Titan Solr schema  (was: Titan Indexer tokenization issues)

> Basic search issues due to Titan Solr schema
> --------------------------------------------
>
>                 Key: ATLAS-2117
>                 URL: https://issues.apache.org/jira/browse/ATLAS-2117
>             Project: Atlas
>          Issue Type: Bug
>    Affects Versions: 0.8-incubating, 0.9-incubating, 0.8.1-incubating
>            Reporter: Apoorv Naik
>            Assignee: Apoorv Naik
>             Fix For: 0.8-incubating, 0.9-incubating, 0.8.1-incubating
>
>
> When Solr is used as the indexing backend, strings are tokenized with the
StandardTokenizerFactory, which treats punctuation and special characters as delimiters.
As a result, a single attribute value is split into several indexed terms, all associated
with the same vertex (document).
> There is also a LowerCaseFilterFactory, which makes lookups case-insensitive.
> This schema design doesn't work well for the current basic search enhancement (ATLAS-1880)
and causes many false positives/negatives when querying the index.
> The workaround/hack is to filter in memory when such schema violations are found, or to
push the entire attribute query down to the graph, which might be inefficient and
memory-intensive. (Current JIRA will track this)
> The correct solution would be to re-index the existing data with a schema change and
drop the code workarounds mentioned above, for better search performance. (Should be taken
up in a separate JIRA)
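
For illustration only, here is a minimal Python sketch of how tokenizing plus lowercasing can produce false positives, and how an in-memory exact-match pass can remove them. It is not Atlas or Solr code: the regex in analyze() is only a rough stand-in for the StandardTokenizerFactory/LowerCaseFilterFactory chain (the real tokenizer follows Unicode word-break rules), and the function names, vertex ids, and sample values are all hypothetical.

```python
import re


def analyze(text):
    """Rough stand-in for tokenizing and lowercasing (assumption, not Solr's real rules).

    Splits on anything that is not an ASCII letter, digit, or underscore.
    """
    return [t.lower() for t in re.split(r"[^0-9A-Za-z_]+", text) if t]


def index_lookup(docs, query):
    """Return ids of docs whose indexed terms contain every term of the query."""
    terms = set(analyze(query))
    return [doc_id for doc_id, value in docs.items()
            if terms <= set(analyze(value))]


def exact_filter(docs, candidates, query):
    """In-memory pass: keep only candidates whose stored value equals the query."""
    return [doc_id for doc_id in candidates if docs[doc_id] == query]


# Hypothetical vertex ids mapped to a string attribute value.
docs = {
    "v1": "sales-fact@cluster1",
    "v2": "Sales Fact cluster1",
    "v3": "fact@cluster1-sales",
    "v4": "sales-dim@cluster1",
}

query = "sales-fact@cluster1"
candidates = index_lookup(docs, query)        # v1, plus v2 and v3 as false positives
matches = exact_filter(docs, candidates, query)  # only v1 survives
```

In this sketch the index alone cannot distinguish v1 from v2 or v3, because all three reduce to the same set of lowercase terms. The extra pass restores exactness at the cost of loading and comparing every candidate value in memory.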



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
