incubator-blur-dev mailing list archives

From "Aaron McCurry (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (BLUR-112) Allow for types to be set on blur tables
Date Thu, 06 Jun 2013 03:22:20 GMT

    [ https://issues.apache.org/jira/browse/BLUR-112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13676631#comment-13676631 ]

Aaron McCurry commented on BLUR-112:
------------------------------------

Yes, I believe this is the correct approach.  I have just pushed a big commit that cleans
up all the old analyzer code and replaces it with the new Double/Int/Float/Long/Text fields.

https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=commit;h=7bbf19d80aa3af80e5869b81827ffc8e8c700d87

So along with that work, the next piece is to make sure that an inbound Column type (this
is the attribute that will need to be added) does not conflict with any defined types.  For
example, if the table descriptor defines the field "family1.col1" as "int", then its value
will be parsed into an IntField and indexed that way.  However, if the client tries to insert
the Family/Column "family1.col1" as a "double" type in the Column, then an exception needs
to be thrown.
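
The check described above could look roughly like the sketch below.  This is only an
illustration under assumptions: ColumnTypeGuard, define and check are hypothetical names, the
types are plain strings, and none of this is taken from the Blur code base.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: the class and method names are illustrative, not Blur's API.
final class ColumnTypeGuard {
    // Defined types keyed by "family.column", e.g. "family1.col1" -> "int".
    private final Map<String, String> definedTypes = new HashMap<>();

    // Record the type that a table descriptor declares for a Family/Column.
    void define(String family, String column, String type) {
        definedTypes.put(family + "." + column, type);
    }

    // Throw if the inbound Column type conflicts with an already defined type.
    // Columns with no defined type pass through unchecked in this sketch.
    void check(String family, String column, String inboundType) {
        String key = family + "." + column;
        String defined = definedTypes.get(key);
        if (defined != null && !defined.equals(inboundType)) {
            throw new IllegalArgumentException("Column [" + key + "] is defined as ["
                + defined + "] but was sent as [" + inboundType + "]");
        }
    }
}
```

With "family1.col1" defined as "int", check("family1", "col1", "int") returns normally, while
check("family1", "col1", "double") throws.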

For terminology, let's call Columns whose types are defined in the TableDescriptor statically
typed Columns, and Columns that are added on the fly dynamic Columns.

The dynamic Columns will need an additional guard (and it won't hurt to check on all the Columns).
 Once a Column type has been defined for a Column on any shard in any shard server, it cannot
be redefined.  So the hardest issue with this situation is the race condition across servers.
 Example:  Shard Server 1 is getting mutates and receives a new dynamic Column, let's call
it "family1.col1234", and its type is "text".  At the same moment Shard Server 2 is getting
mutates and receives the same dynamic Column "family1.col1234", but its type is "double".
 One and only one of the types should win, and the other should throw an exception.

There is code in 0.2.0 that provides a solution for this race condition by using ZooKeeper.

https://git-wip-us.apache.org/repos/asf?p=incubator-blur.git;a=blob;f=src/blur-util/src/main/java/org/apache/blur/zookeeper/ZkCachedMap.java;h=22eb9e64480e66b41da358d59060dc4331e1390c;hb=aef8938eb5987b5f19a3bd3260d5ebafcf6cf751

So we should pull that class in from 0.2-dev.
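
To make the "one and only one type wins" rule concrete, here is a minimal sketch that assumes
an atomic put-if-absent on a shared map.  A java.util.concurrent.ConcurrentHashMap stands in
for the ZooKeeper-backed map, so it only shows the semantics inside a single JVM.
DynamicColumnRegistry, register and typeOf are hypothetical names and are not the ZkCachedMap
API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch: an in-process stand-in for a map shared across shard servers.
final class DynamicColumnRegistry {
    // Registered types keyed by "family.column".
    private final ConcurrentMap<String, String> types = new ConcurrentHashMap<>();

    // The first caller to register a Column fixes its type.  Later callers succeed
    // only if they send the same type; a different type is rejected.
    void register(String familyAndColumn, String type) {
        // putIfAbsent is atomic: it returns null if this call stored the value,
        // otherwise it returns the value that was already present.
        String existing = types.putIfAbsent(familyAndColumn, type);
        if (existing != null && !existing.equals(type)) {
            throw new IllegalStateException("Column [" + familyAndColumn
                + "] is already typed as [" + existing + "], cannot redefine as ["
                + type + "]");
        }
    }

    // Returns the registered type, or null if the Column has not been seen yet.
    String typeOf(String familyAndColumn) {
        return types.get(familyAndColumn);
    }
}
```

If two callers race to register "family1.col1234" as "text" and "double", exactly one
putIfAbsent call stores its value and the other caller gets the exception.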

I hope this response gives you an idea of what needs to be done conceptually.  If you want
to work on it, I can help direct you to the various portions of code that will need to be
modified.
                
> Allow for types to be set on blur tables
> ----------------------------------------
>
>                 Key: BLUR-112
>                 URL: https://issues.apache.org/jira/browse/BLUR-112
>             Project: Apache Blur
>          Issue Type: Improvement
>    Affects Versions: 0.1.5
>            Reporter: Aaron McCurry
>             Fix For: 0.1.5
>
>
> Create the ability for Blur to handle the default Lucene field types.  This should not
> be tied to the table descriptor because types should be allowed to be added at runtime.  Also
> 2 new fields should be added to the TableDescriptor:
> 1. A strict types attribute.  If set to true and a new column is added to the table with
> no type mapping for it, throw an exception.  Set to false by default.
> 2. A default type, used if strict is set to false.  The default type should be text.
> Also, dynamic columns could be allowed if their name included the type.  Such as:
> The column name could be "col1" with a type of "int", in the Column struct in thrift
> the name would be "col1/int" and if the type did not exist before the call it would be added.
> Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
