lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "hu andy" <andyh...@gmail.com>
Subject Re: SQL-Like Join in Lucene
Date Fri, 11 Aug 2006 04:37:45 GMT
>4. Search for records with filter.
if the filter returns a lot of ids, it willn' t be fast.
Recently I have a test. I customized a filter which get a list of ids from a
mysql database table of size 5000. Then I invoke the search(query, filter,
hitcollector), I took me more than 40s to retrieve the first 100 hits. Then
I try search(query, hitcollector), the fragment code is following:

        public MyHitCollector1(ArrayList list , String[] allIDs, ArrayList
idList)
        {
            this.list = list;
            this.idList = idList;
            this.allIDs = allIDs;
        }
        public override void Collect(int doc, float score)
        {
            if (score > 0.0f)
            {
                if (count < 1000)
                {
                    if (idList.BinarySearch(allIDs[doc]) >= 0)
                    {
                        list.Add(IndexSearcher.GetDocument(doc));
                        count++;
                    }
                }
                else
                    return;

            }
        }
you can get String[] allIDs = FieldCache.DEFAULT.GetStrings(IndexReader(),
"ID") in this way.

This method took about 8s to retrieve the first 100/1000 hits. Here the
database table has about 300,000 records


2006/8/11, Aleksei Valikov <valikov@gmx.net>:
>
> Hi.
>
> I'm investigating a possibility to make a "join" in Lucene/Compass.
>
> Here's the thread:
> http://forums.opensymphony.com/thread.jspa?threadID=39685&tstart=0
>
> I have records m:m entities. Entities hold indexed information. Records
> consist
> of entities. One entity may belong to many records.
> I would like to search for records having certain entity information.
>
> Entity documents contain indexable entity fields plus entity id.
> Record documents contain indexable record fielfs, record id and entity
> ids.
>
> I'd like to "search for records having entity ids in (search for entity
> ids
> where entity fields satisfy condition)".
>
> Currently I am using self-written InSetFilter to accomplish the task.
> 1. Search among entities by the given condition.
> 2. Put ids of the found entities into a set.
> 3. Create filter with this set.
> 4. Search for records with filter.
>
> The join is basically implemented by a filter:
>
>        public BitSet bits(IndexReader reader) throws IOException {
>
>                BitSet bits = new BitSet(reader.maxDoc());
>
>                int[] docs = new int[1];
>                int[] freqs = new int[1];
>                for (Iterator iterator = set.iterator(); iterator.hasNext();)
> {
>                        final String id = (String) iterator.next();
>                        final TermDocs termDocs = reader.termDocs(new
> Term(fieldName, id));
>                        final int count = termDocs.read(docs, freqs);
>                        if (count == 1) {
>                                bits.set(docs[0]);
>                        }
>                }
>                return bits;
>        }
>
> Is this approach ok or I missed something and there's an easier way to
> join?
>
> Thank you for you time.
>
> Bye.
> /lexi
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message