incubator-blur-dev mailing list archives

From Aaron McCurry <amccu...@gmail.com>
Subject Re: Column Types
Date Mon, 21 Oct 2013 18:15:00 GMT
On Mon, Oct 21, 2013 at 1:37 PM, Colton McInroy <colton@dosarrest.com> wrote:

> Hmm... What about column families?
>
> I tried typing this in the blur shell "definecolumn Program_syslog-ng
> event Date date" but it doesn't appear to change anything. When I do schema
> <table> I still see this...
>
> family : event
>         column   : Date
>                 fieldType : text
>

Once a type is defined it cannot be changed. If there was no error message,
then we should fix that.  If you set a table to strictTypes = true, then a
column cannot be defined automatically via mutate, only by adding a
column definition.

http://incubator.apache.org/blur/docs/0.2.0/Blur.html#Struct_TableDescriptor

The date type also needs an extra argument that tells it how to parse the
string.

definecolumn Program_syslog-ng event Date date -p dateFormat yyyyMMdd

http://incubator.apache.org/blur/docs/0.2.0/data-model.html#date_type
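
To make the dateFormat argument concrete, here is a minimal sketch of what a
pattern such as yyyyMMdd does to an incoming string. It assumes the pattern
follows java.text.SimpleDateFormat syntax; the class name, the method name and
the UTC time zone are choices made for this illustration and are not part of
Blur.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

public class DateFormatSketch {

    // Parses a column value with the given pattern and returns epoch milliseconds.
    // UTC is an assumption made here so that the result is deterministic.
    public static long toEpochMillis(String pattern, String value) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        format.setLenient(false); // reject values that do not form a real date
        try {
            return format.parse(value).getTime();
        } catch (ParseException e) {
            throw new IllegalArgumentException(
                "Value [" + value + "] does not match pattern [" + pattern + "]", e);
        }
    }

    public static void main(String[] args) {
        // Midnight UTC on 21 Oct 2013.
        System.out.println(toEpochMillis("yyyyMMdd", "20131021")); // prints 1382313600000
    }
}
```

The value written to the column has to match the pattern exactly, so a
syslog-style timestamp would need a different pattern than yyyyMMdd.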


>
> Am I doing something wrong? Does the schema have to be set during
> initialization of the table, or can it be done at any time?
>

It can be done at any time, but once the field has been used its type is set
for the lifetime of the table.


>
> Also, the code you posted for a single table doesn't reference column
> families at all. Are field types column name specific only, so if you have
> the same column name in two different families both will be handled by that
> field type? No problem for me at all if this is the case, but for some
> people it may be a problem. Say for instance they have doc.matches:true and
> fields.matches:3, it may cause some problems.
>

I might be misunderstanding your question, but when you define a column's
type with the definecolumn command you have to supply both a family and a
column name.  In your example above that would be "event" for the family and
"Date" for the column name.  Type definitions act as a bridge between
families/columns and fields in Lucene.  The getFieldsForColumn and
getFieldsForSubColumn methods on FieldTypeDefinition are passed a Column
together with the family name, and they return an Iterable of Fields.

http://incubator.apache.org/blur/docs/0.2.0/site/blur-query/apidocs/org/apache/blur/analysis/FieldTypeDefinition.html
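
As a rough illustration of why the same column name in two families does not
collide, here is a minimal sketch of the bridging idea. It assumes the Lucene
field name is the family and column joined with a dot (as in your
doc.matches / fields.matches example). SimpleField and fieldsForColumn are
hypothetical stand-ins, not the real Lucene or Blur classes; the actual
signatures are in the Javadoc linked above.

```java
import java.util.ArrayList;
import java.util.List;

public class FieldBridgeSketch {

    // Hypothetical stand-in for a Lucene field: just a name and a value.
    public static final class SimpleField {
        public final String name;
        public final String value;

        public SimpleField(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    // Assumed naming scheme: family and column joined with a dot.
    public static String fieldName(String family, String column) {
        return family + "." + column;
    }

    // Loosely mirrors the shape described above: a family-qualified column
    // goes in, and a collection of fields comes out.
    public static List<SimpleField> fieldsForColumn(String family, String column, String value) {
        List<SimpleField> fields = new ArrayList<SimpleField>();
        fields.add(new SimpleField(fieldName(family, column), value));
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(fieldsForColumn("doc", "matches", "true").get(0).name);  // prints doc.matches
        System.out.println(fieldsForColumn("fields", "matches", "3").get(0).name);  // prints fields.matches
    }
}
```

Because the family is part of the field name, each family/column pair can
carry its own type.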


>
> I took a look at the ExampleType.java as well as the other current types
> and it really helped. I may write up an IP type definition for my own use as
> well as submit it to you for inclusion in Apache Blur if that is
> desired. I know I probably won't be the only one to want that column type.


That would be great!  Thanks.
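
For the "indexed numerically" part of an IP type, the conversion could look
roughly like the following. This is only a sketch under stated assumptions:
it handles IPv4 only, the class and method names are hypothetical, and it
does not show how the value would be wired into a FieldTypeDefinition.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class Ipv4NumericSketch {

    // Converts a dotted-quad IPv4 literal to an unsigned 32-bit value held in a long,
    // so that numeric ordering matches address ordering.
    public static long ipv4ToLong(String dotted) {
        InetAddress address;
        try {
            // For a literal address no DNS lookup is performed; a host name would trigger one.
            address = InetAddress.getByName(dotted);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid address [" + dotted + "]", e);
        }
        byte[] bytes = address.getAddress();
        if (bytes.length != 4) {
            throw new IllegalArgumentException("Only IPv4 is handled in this sketch");
        }
        long result = 0;
        for (byte b : bytes) {
            result = (result << 8) | (b & 0xFF); // mask to treat the byte as unsigned
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(ipv4ToLong("10.0.0.1"));    // prints 167772161
        System.out.println(ipv4ToLong("192.168.1.1")); // prints 3232235777
    }
}
```

With a numeric representation like this, range queries over address blocks
become ordinary numeric range queries.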

Aaron


>
>
>
> Thanks,
> Colton McInroy
>
>  * Director of Security Engineering
>
>
> Phone
> (Toll Free)
> _US_    (888)-818-1344 Press 2
> _UK_    0-800-635-0551 Press 2
>
> My Extension    101
> 24/7 Support    support@dosarrest.com <mailto:support@dosarrest.com>
> Email   colton@dosarrest.com <mailto:colton@dosarrest.com>
> Website         http://www.dosarrest.com
>
> On 10/21/2013 6:10 AM, Aaron McCurry wrote:
>
>> The feature is not in 0.2.0; it is in 0.2.1 and 0.3.0.
>>
>> Here's the issue.
>>
>> https://issues.apache.org/jira/browse/BLUR-258
>>
>> I haven't pushed a 0.2.1 website for documentation yet.  But the basics
>> are: create your type by extending FieldTypeDefinition or one of the other
>> FTD classes.
>>
>> Then to use the custom type, you can either add your custom type to the
>> entire cluster or per table.
>>
>> For Cluster Wide
>>
>> For cluster wide configuration you will need to add the new field types
>> into the blur-site.properties file on each server.
>>
>> blur.fieldtype.customfield1=org.apache.blur.analysis.type.ExampleType1
>> blur.fieldtype.customfield2=org.apache.blur.analysis.type.ExampleType2
>> ...
>>
>> Please note that the prefix of "blur.fieldtype." is all that is used from
>> the property name because the type gets its name from the internal method
>> "getName". However, the property names will need to be unique within the
>> file.
>>
>> For Single Table
>>
>> For a single table configuration you will need to add the new field types
>> into the tableProperties map in the TableDescriptor as you define the
>> table.
>>
>> tableDescriptor.putToTableProperties("blur.fieldtype.customfield1",
>>      "org.apache.blur.analysis.type.ExampleType1");
>> tableDescriptor.putToTableProperties("blur.fieldtype.customfield2",
>>      "org.apache.blur.analysis.type.ExampleType2");
>> ...
>>
>> Please note that the prefix of "blur.fieldtype." is all that is used from
>> the property name because the type gets its name from the internal method
>> "getName". However, the property names will need to be unique within the
>> map.
>>
>> Aaron
>>
>>
>>
>> On Sun, Oct 20, 2013 at 10:59 PM, Colton McInroy <colton@dosarrest.com> wrote:
>>
>>> I noticed in the source the following column types are documented...
>>>
>>>    /**
>>>     * The field type for the column.  The built in types are:
>>>     * <ul>
>>>     * <li>text - Full text indexing.</li>
>>>     * <li>string - Indexed string literal</li>
>>>     * <li>int - Converted to an integer and indexed numerically.</li>
>>>     * <li>long - Converted to an long and indexed numerically.</li>
>>>     * <li>float - Converted to an float and indexed numerically.</li>
>>>     * <li>double - Converted to an double and indexed numerically.</li>
>>>     * <li>stored - Not indexed, only stored.</li>
>>>     * </ul>
>>>     */
>>>
>>> When I was looking at
>>> blur-query/src/main/java/org/apache/blur/analysis/BaseFieldManager.java
>>> I came across this though...
>>>
>>> # grep addColumnDefinition blur-query/src/main/java/org/apache/blur/analysis/BaseFieldManager.java
>>>          addColumnDefinition(family, name, null,
>>> getDefaultMissingFieldLessIndexing(), getDefaultMissingFieldType(),
>>>    public boolean addColumnDefinition(String family, String columnName,
>>> String subColumnName, boolean fieldLessIndexed,
>>>    public void addColumnDefinitionGisPointVector(String family, String
>>> columnName) throws IOException {
>>>      addColumnDefinition(family, columnName, null, false,
>>> SpatialPointVectorStrategyFieldTypeDefinition.NAME, null);
>>>    public void addColumnDefinitionGisRecursivePrefixTree(String family,
>>> String columnName) throws IOException {
>>>      addColumnDefinition(family, columnName, null, false,
>>> SpatialRecursivePrefixTreeStrategyFieldTypeDefinition.NAME,
>>>    public void addColumnDefinitionDate(String family, String columnName,
>>> String format) throws IOException {
>>>      addColumnDefinition(family, columnName, null, false,
>>> DateFieldTypeDefinition.NAME, props);
>>>    public void addColumnDefinitionInt(String family, String columnName)
>>> throws IOException {
>>>      addColumnDefinition(family, columnName, null, false,
>>> IntFieldTypeDefinition.NAME, null);
>>>    public void addColumnDefinitionLong(String family, String columnName)
>>> throws IOException {
>>>      addColumnDefinition(family, columnName, null, false,
>>> LongFieldTypeDefinition.NAME, null);
>>>    public void addColumnDefinitionFloat(String family, String
>>> columnName) throws IOException {
>>>      addColumnDefinition(family, columnName, null, false,
>>> FloatFieldTypeDefinition.NAME, null);
>>>    public void addColumnDefinitionDouble(String family, String
>>> columnName) throws IOException {
>>>      addColumnDefinition(family, columnName, null, false,
>>> DoubleFieldTypeDefinition.NAME, null);
>>>    public void addColumnDefinitionString(String family, String
>>> columnName) throws IOException {
>>>      addColumnDefinition(family, columnName, null, false,
>>> StringFieldTypeDefinition.NAME, null);
>>>    public void addColumnDefinitionText(String family, String columnName)
>>> throws IOException {
>>>      addColumnDefinition(family, columnName, null, false,
>>> TextFieldTypeDefinition.NAME, null);
>>>    public void addColumnDefinitionTextFieldLess(String family, String
>>> columnName) throws IOException {
>>>      addColumnDefinition(family, columnName, null, true,
>>> TextFieldTypeDefinition.NAME, null);
>>>
>>> I am wondering how to specify these. I would like to programmatically set
>>> column types in certain situations, and I would like to be able to use
>>> the
>>> Date column type. Which I have been meaning to ask about....
>>>
>>> What is the best way to store a timestamp? What format, column type,
>>> etc... I'm guessing the Date column type, but I do not know how to set it
>>> right now. I noticed that the client (Iface object) has a
>>> addColumnDefinition, but it has different parameters than the above
>>> addColumnDefinition, and it's missing all of the ones for the different
>>> column types.
>>>
>>> I have one additional field type I would like to see, which is one for IP
>>> addresses...
>>>
>>>     * <li>date - Converted to a date and indexing.</li>
>>>     * <li>text - Full text indexing.</li>
>>>     * <li>string - Indexed string literal</li>
>>>     * <li>int - Converted to an integer and indexed numerically.</li>
>>>     * <li>long - Converted to an long and indexed numerically.</li>
>>>     * <li>float - Converted to an float and indexed numerically.</li>
>>>     * <li>double - Converted to an double and indexed numerically.</li>
>>>     * <li>ip - Converted to a InetAddress and indexed numerically.</li>
>>>
>>> --
>>> Thanks,
>>> Colton McInroy
>>>
>>>
>>>
>>>
>
