drill-dev mailing list archives

From rahul challapalli <challapallira...@gmail.com>
Subject Re: Lucene Format Plugin
Date Sun, 09 Aug 2015 23:50:56 GMT
Below is the link to my branch which contains the changes related to the
format plugin.

https://github.com/rchallapalli/drill/tree/lucene/contrib/format-lucene

Any thoughts on how to handle contributions like this which still have some
work to be done?

- Rahul


On Mon, Aug 3, 2015 at 12:21 PM, rahul challapalli <
challapallirahul@gmail.com> wrote:

> Thanks Jason.
>
> I want to look at the solr plugin and see where we can collaborate or if
> we already duplicated part of the effort.
>
> I still need to push a few commits. I will share the code once I get these
> changes pushed.
>
> - Rahul
>
>
>
> On Mon, Aug 3, 2015 at 11:31 AM, Jason Altekruse <altekrusejason@gmail.com
> > wrote:
>
>> Hey Rahul,
>>
>> This is really cool! Thanks for all of the time you put into writing this.
>> I think we have a lot of opportunities to reach new communities with
>> efforts like this.
>>
>> I noticed that last week another contributor opened a JIRA for a solr
>> plugin. There might be a good opportunity for the two of you to join
>> efforts, as I believe he has likely started working on a lucene reader as
>> part of his solr work.
>>
>> Would you like to post a link to your work on GitHub or another public
>> host of your code?
>>
>> https://issues.apache.org/jira/browse/DRILL-3585
>>
>> On Mon, Aug 3, 2015 at 2:29 AM, Stefán Baxter <stefan@activitystream.com>
>> wrote:
>>
>> > Hi,
>> >
>> > I'm pretty new around here, but I just wanted to tell you how much your
>> > work can benefit us. This is great!
>> >
>> > Look forward to trying it out.
>> >
>> > Regards,
>> >  -Stefán
>> >
>> > On Mon, Aug 3, 2015 at 8:38 AM, rahul challapalli <
>> > challapallirahul@gmail.com> wrote:
>> >
>> > > Hello Drillers,
>> > >
>> > > I have been working on a lucene format plugin. In its current state, the
>> > > sample query below successfully searches a lucene index and returns the
>> > > results.
>> > >
>> > > select path from dfs_test.`/search-index` where
>> > > contents = 'maxItemsPerBlock' and contents = 'BlockTreeTermsIndex'
>> > >
>> > >
>> > >
>> > > *High Level Overview of Current Implementation:*
>> > >
>> > > *Parallelization:* A lucene segment is the lowest level of
>> > > parallelization.
>> > > *Filter Pushdown:* Currently the format plugin is designed to push the
>> > > complete filter into the scan.
>> > > *Filter Evaluation:* Each condition in the filter is treated as a lucene
>> > > TermQuery
>> > > <http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/TermQuery.html>
>> > > and multiple conditions are joined using a BooleanQuery
>> > > <http://lucene.apache.org/core/5_2_0/core/org/apache/lucene/search/BooleanQuery.html>.
>> > > If we *do not* use a TermQuery, then we have to know the exact type of
>> > > Analyzer
>> > > <https://lucene.apache.org/core/5_2_1/core/org/apache/lucene/analysis/Analyzer.html>
>> > > to use with each field in the query.
>> > >     Ex: 'contents' field might have been analyzed using a StandardAnalyzer
>> > > <https://lucene.apache.org/core/5_2_1/analyzers-common/org/apache/lucene/analysis/standard/StandardAnalyzer.html>
>> > > and the 'path' field might not have been analyzed at all.
>> > > If desired, support for raw lucene queries with a reserved word should be
>> > > easy to add.
>> > >     Ex: select * from dfs.`search-index` where searchQuery =
>> > > "+contents:maxItemsPerBlock +path:/home/file.txt";
>> > > *Converting SqlFilter to Lucene Query:* Currently only the "=" and "!="
>> > > operators are handled while converting a sql filter into a lucene query.
>> > > For indexed fields this might be sufficient to handle a good number of
>> > > cases. For non-indexed fields, operators such as ">", "<", and "like"
>> > > still need to be handled.
>> > > *FileSystems:* Currently the format plugin only works on a local
>> > > filesystem.
>> > >
>> > >
>> > > Though this is far from complete, I want to work with the community to
>> > > get some feedback and avoid any duplication of work. Kindly let me know
>> > > your thoughts.
>> > >
>> > > - Rahul
>> > >
>> >
>>
>
>
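
As a rough illustration of the filter conversion described in the original
message: each "=" condition becomes a required term and each "!=" condition a
prohibited term, which Lucene's classic query syntax writes with "+" and "-"
prefixes (the same syntax as the raw searchQuery example). The sketch below
builds such a query string in plain Java with no Lucene dependency. The names
LuceneFilterSketch, Condition, and toLuceneQuery are hypothetical and are not
taken from the plugin; the plugin is described as building TermQuery and
BooleanQuery objects directly, and this sketch does no escaping of special
characters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only: not the plugin's code, and no Lucene classes are used.
final class LuceneFilterSketch {

    /** One SQL comparison such as contents = 'foo'. Only "=" and "!=" are modeled. */
    static final class Condition {
        final String field;
        final String operator;
        final String value;

        Condition(String field, String operator, String value) {
            this.field = field;
            this.operator = operator;
            this.value = value;
        }
    }

    /** Joins the conditions into one classic-syntax query string, all ANDed together. */
    static String toLuceneQuery(List<Condition> conditions) {
        List<String> clauses = new ArrayList<>();
        for (Condition c : conditions) {
            String prefix;
            if ("=".equals(c.operator)) {
                prefix = "+"; // required term, the string form of a MUST clause
            } else if ("!=".equals(c.operator)) {
                prefix = "-"; // prohibited term, the string form of a MUST_NOT clause
            } else {
                // Mirrors the stated limitation: other operators are not handled yet.
                throw new UnsupportedOperationException("Unsupported operator: " + c.operator);
            }
            // Assumption: values contain no characters that need escaping.
            clauses.add(prefix + c.field + ":" + c.value);
        }
        return String.join(" ", clauses);
    }

    public static void main(String[] args) {
        List<Condition> filter = Arrays.asList(
            new Condition("contents", "=", "maxItemsPerBlock"),
            new Condition("contents", "=", "BlockTreeTermsIndex"));
        // Prints: +contents:maxItemsPerBlock +contents:BlockTreeTermsIndex
        System.out.println(toLuceneQuery(filter));
    }
}
```

The exact-term mapping is what lets the conversion avoid choosing an Analyzer
per field; anything beyond "=" and "!=" is rejected here rather than guessed.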
