incubator-blur-dev mailing list archives

From Aaron McCurry <>
Subject Re: Apache Drill : lucene format plugin
Date Tue, 11 Aug 2015 12:42:28 GMT
On Mon, Aug 10, 2015 at 4:55 PM, rahul challapalli <> wrote:

> Hi,
> I am writing a lucene format plugin for Apache Drill. Once this is
> in-place, we can extend it to run sql queries on top of blur tables from
> Apache Drill. Below are 2 problems I am trying to resolve. I posted them to
> the lucene dev list but did not receive any response. Trying it out here :)
> 1. Creating individual segment readers : Currently I am using the below
> code to create segment readers. I am trying to find out the minimum
> information that I need to serialize so that I can create a specific
> SegmentReader. I am trying to avoid serializing a lot of low-level state
> information or creating all segment readers everywhere.
> segmentInfos =
> SegmentInfos.readLatestCommit(;
> segmentsFilename = segmentInfos.getSegmentsFileName();
> segmentReaders = new ArrayList<SegmentReader>();
> for (SegmentCommitInfo sci : segmentInfos.asList()) {
>   segmentReaders.add(new SegmentReader(sci, IOContext.READ));
> }

Here is the blur input format for reading index segments directly via map

This is the method that you may be most interested in looking at:

Check out the BlurInputSplit for the low-level bits that are needed to
reopen the data on the input side.
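
As a rough illustration of what "the low-level bits" might amount to, here
is a minimal sketch in plain Java. The class name SegmentSplitSketch and its
three fields (index location, segments file name, segment name) are
assumptions made for this example. They are not copied from BlurInputSplit,
so check the real class for the authoritative set.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical split descriptor: the field set is an assumption for
// illustration, not the actual layout of BlurInputSplit.
final class SegmentSplitSketch {
    final String indexDir;         // where the index lives
    final String segmentsFileName; // commit point, as returned by getSegmentsFileName()
    final String segmentName;      // the single segment this split should open

    SegmentSplitSketch(String indexDir, String segmentsFileName, String segmentName) {
        this.indexDir = indexDir;
        this.segmentsFileName = segmentsFileName;
        this.segmentName = segmentName;
    }

    // Write the three strings in a fixed order.
    byte[] toBytes() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(indexDir);
            out.writeUTF(segmentsFileName);
            out.writeUTF(segmentName);
        }
        return buf.toByteArray();
    }

    // Read the strings back in the same order they were written.
    static SegmentSplitSketch fromBytes(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            String dir = in.readUTF();
            String segmentsFile = in.readUTF();
            String segment = in.readUTF();
            return new SegmentSplitSketch(dir, segmentsFile, segment);
        }
    }
}
```

On the reading side, the idea would be to open only the commit named by
segmentsFileName and build a SegmentReader for the one SegmentCommitInfo
whose segment name matches, rather than opening every segment on every node.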

> 2. Serializing lucene query object : I translated the sql filter condition
> into a lucene query object (partly done). Now I am trying to serialize the
> lucene query object into a string and then de-serialize it back *without
> analyzing* the serialized string representation. Any pointers on this part?

Here's a mostly working package for converting lucene query objects to
writables for serialization.
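
The general idea of serializing the query structure rather than its string
form can be sketched as follows. This is not the Blur package's code: Term
and Bool are hypothetical stand-ins for Lucene's TermQuery and BooleanQuery
so that the example runs with the JDK alone, and the one-byte type tags are
an arbitrary choice.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class QueryCodecSketch {
    // Hypothetical query model standing in for Lucene's Query classes.
    static abstract class Node {}

    static final class Term extends Node {
        final String field;
        final String text;
        Term(String field, String text) { this.field = field; this.text = text; }
    }

    static final class Bool extends Node {
        final List<String> occurs = new ArrayList<>(); // e.g. "MUST", "SHOULD", "MUST_NOT"
        final List<Node> clauses = new ArrayList<>();
        Bool add(String occur, Node clause) {
            occurs.add(occur);
            clauses.add(clause);
            return this;
        }
    }

    // Write a type tag, then the node's fields; recurse into sub-clauses.
    static void write(Node n, DataOutput out) throws IOException {
        if (n instanceof Term) {
            Term t = (Term) n;
            out.writeByte(0);
            out.writeUTF(t.field);
            out.writeUTF(t.text); // stored verbatim, never re-analyzed
        } else if (n instanceof Bool) {
            Bool b = (Bool) n;
            out.writeByte(1);
            out.writeInt(b.clauses.size());
            for (int i = 0; i < b.clauses.size(); i++) {
                out.writeUTF(b.occurs.get(i));
                write(b.clauses.get(i), out);
            }
        } else {
            throw new IOException("unsupported node type: " + n.getClass());
        }
    }

    // Read the tag and rebuild the matching node type.
    static Node read(DataInput in) throws IOException {
        byte tag = in.readByte();
        if (tag == 0) {
            String field = in.readUTF();
            String text = in.readUTF();
            return new Term(field, text);
        } else if (tag == 1) {
            Bool b = new Bool();
            int count = in.readInt();
            for (int i = 0; i < count; i++) {
                String occur = in.readUTF();
                b.add(occur, read(in));
            }
            return b;
        }
        throw new IOException("unknown type tag: " + tag);
    }

    static byte[] encode(Node n) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        write(n, new DataOutputStream(buf));
        return buf.toByteArray();
    }

    static Node decode(byte[] bytes) throws IOException {
        return read(new DataInputStream(new ByteArrayInputStream(bytes)));
    }
}
```

Because the term text is written and read back verbatim, no analyzer ever
sees it, which is the property asked for above. Supporting more query types
would mean adding one tag and one branch per type.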

Hope this helps!


> Any help is greatly appreciated
> - Rahul
