lucene-java-user mailing list archives

From Alexander Filipchik <afilipc...@gmail.com>
Subject Issue with range queries on Lucene 6.6 using IntPoint
Date Sat, 01 Jul 2017 18:11:03 GMT
Not sure if I'm doing something wrong or there is a bug somewhere, but:

I was trying to create a test index with one document for every second in a year and then
query it (it doesn't have to be time, I'm just using it to explain the problem).

Example document consists of 7 fields:
document.add(new IntPoint("year", year));
document.add(new IntPoint("month", month));
document.add(new IntPoint("hour", hour));
document.add(new IntPoint("day", day));
document.add(new IntPoint("minute", minute));
document.add(new IntPoint("second", second));
document.add(new StoredField("date", "y=" + year + "/m=" + month + "/d=" + day + "/h=" + hour
+ "/m=" + minute + "/s=" + second));
Then I tried to run a range query like:
BooleanQuery.Builder booleanQueryBuilder = new BooleanQuery.Builder()
        .add(IntPoint.newRangeQuery("year", 2016, 2020), BooleanClause.Occur.FILTER)
        .add(IntPoint.newRangeQuery("month", 1, 10), BooleanClause.Occur.FILTER)
        .add(IntPoint.newExactQuery("day", 1), BooleanClause.Occur.FILTER)
        .add(IntPoint.newExactQuery("hour", 1), BooleanClause.Occur.FILTER)
        .add(IntPoint.newExactQuery("minute", 1), BooleanClause.Occur.FILTER)
        .add(IntPoint.newExactQuery("second", 1), BooleanClause.Occur.FILTER);
This should return the first second (day 1, hour 1, minute 1, second 1) of each month from
1 to 10. While the number of results is correct, I'm getting the wrong stored fields:
y=2017/m=2/d=1/h=1/m=1/s=1
y=2017/m=3/d=1/h=1/m=1/s=1
y=2017/m=1/d=2/h=1/m=26/s=42
y=2017/m=2/d=2/h=1/m=26/s=42
y=2017/m=3/d=2/h=1/m=26/s=42
y=2017/m=1/d=3/h=1/m=52/s=23
y=2017/m=2/d=3/h=1/m=52/s=23
y=2017/m=3/d=3/h=1/m=52/s=23
y=2017/m=1/d=5/h=1/m=18/s=4

As you can see, the months are repeating and the results are incorrect. Only the first 2
results match the query.
If I remove seconds from the equation, everything works. Am I doing something wrong, or am
I hitting some limitation?

Here is the test code:
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.IntPoint;
import org.apache.lucene.document.StoredField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.SimpleCollector;
import org.apache.lucene.store.RAMDirectory;

import java.io.IOException;



public class Test {
    public Test() throws IOException {
        final RAMDirectory directory = new RAMDirectory();
        IndexWriter iwriter = null;
        final IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
        iwriter = new IndexWriter(directory, config);

        //Indexing every second for full 2017
        for (int year = 2017; year <= 2017; year++) {
            for (int month = 1; month <= 12; month++) {
                for (int day = 1; day <= 31; day++) {
                    for (int hour = 1; hour <= 1; hour++) {
                        for (int minute = 1; minute <= 60; minute++) {
                            for (int second = 1; second <= 60; second++) {
                                Document document = new Document();
                                document.add(new IntPoint("year", year));
                                document.add(new IntPoint("month", month));
                                document.add(new IntPoint("hour", hour));
                                document.add(new IntPoint("day", day));
                                document.add(new IntPoint("minute", minute));
                                document.add(new IntPoint("second", second));
                                document.add(new StoredField("date", "y=" + year + "/m=" + month
                                        + "/d=" + day + "/h=" + hour + "/m=" + minute + "/s=" + second));
                                iwriter.addDocument(document);
                            }
                        }
                    }
                }
            }
        }

        iwriter.close();

        BooleanQuery.Builder booleanQueryBuilder = new BooleanQuery.Builder()
                .add(IntPoint.newRangeQuery("year", 2016, 2020), BooleanClause.Occur.FILTER)
                .add(IntPoint.newRangeQuery("month", 1, 10), BooleanClause.Occur.FILTER)
                .add(IntPoint.newExactQuery("day", 1), BooleanClause.Occur.FILTER)
                .add(IntPoint.newExactQuery("hour", 1), BooleanClause.Occur.FILTER)
                .add(IntPoint.newExactQuery("minute", 1), BooleanClause.Occur.FILTER)
                .add(IntPoint.newExactQuery("second", 1), BooleanClause.Occur.FILTER);

        final IndexSearcher searcher = new IndexSearcher(DirectoryReader.open(directory));
        searcher.search(booleanQueryBuilder.build(), new SimpleCollector() {
            @Override
            public void collect(int doc) throws IOException {
                Document document = searcher.getIndexReader().document(doc);
                System.out.println(document.get("date"));
            }

            public boolean needsScores() {
                return false;
            }
        });

    }
}
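
One way to read the output above is to map each printed date back to its insertion position in the indexing loops. The sketch below does that in plain Java, without Lucene. It assumes that documents are numbered in insertion order, that the index is split into segments of 329,659 documents each (a figure inferred from the offsets in the printed results, not stated anywhere in the post), and that collect(int doc) is handed an ID relative to the current segment. In Lucene that last point is documented for LeafCollector.collect, and the usual remedy is to record context.docBase in SimpleCollector.doSetNextReader(LeafReaderContext) and add it to doc before loading stored fields. The class and method names here (DocBaseSketch, dateOf, collect) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class DocBaseSketch {
    // The test indexes one hour per day: 60 minutes x 60 seconds.
    static final int DOCS_PER_DAY = 60 * 60;
    static final int DOCS_PER_MONTH = 31 * DOCS_PER_DAY;
    static final int TOTAL_DOCS = 12 * DOCS_PER_MONTH;
    // Assumption: segment size inferred from the offsets in the posted output.
    static final int SEGMENT_SIZE = 329_659;

    // Decode an index-wide document ID back into the stored "date" string.
    static String dateOf(int globalDoc) {
        int month = globalDoc / DOCS_PER_MONTH + 1;
        int rem = globalDoc % DOCS_PER_MONTH;
        int day = rem / DOCS_PER_DAY + 1;
        rem %= DOCS_PER_DAY;
        int minute = rem / 60 + 1;
        int second = rem % 60 + 1;
        return "y=2017/m=" + month + "/d=" + day + "/h=1/m=" + minute + "/s=" + second;
    }

    // Simulate a per-segment collector. With rebase == false the segment-local
    // ID is used as if it were index-wide, which mirrors the test code above.
    static List<String> collect(boolean rebase) {
        List<String> out = new ArrayList<>();
        for (int docBase = 0; docBase < TOTAL_DOCS; docBase += SEGMENT_SIZE) {
            int maxDoc = Math.min(SEGMENT_SIZE, TOTAL_DOCS - docBase);
            for (int month = 1; month <= 10; month++) {
                // The query matches day 1, minute 1, second 1 of each month.
                int globalMatch = (month - 1) * DOCS_PER_MONTH;
                int localDoc = globalMatch - docBase;
                if (localDoc < 0 || localDoc >= maxDoc) {
                    continue; // this match lives in another segment
                }
                out.add(dateOf(rebase ? docBase + localDoc : localDoc));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("Without docBase:");
        collect(false).forEach(System.out::println);
        System.out.println("With docBase:");
        collect(true).forEach(System.out::println);
    }
}
```

Under these assumptions, the run without docBase yields ten lines, and the nine wrong-looking dates listed above are its lines 2 through 10, while the run with docBase yields day 1, minute 1, second 1 for months 1 to 10. That is consistent with the result count being right and only the stored fields being off, though it is a reconstruction from the printed output rather than a confirmed diagnosis.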

Thank you,
Alex
