lucene-dev mailing list archives

From Michael McCandless <luc...@mikemccandless.com>
Subject Re: [jira] [Assigned] (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.
Date Tue, 27 Sep 2011 17:42:38 GMT
Hi Anand,

If you open the issue, here:

    https://issues.apache.org/jira/browse/LUCENE-2205

Then click on the "All" tab, under Activity; it will show all attached
patches and comments in order.  Go to the end and scroll back to the last
patch; that's the most recent one.

But note that this issue is very much in progress/flux now...

Mike McCandless

http://blog.mikemccandless.com

On Mon, Sep 26, 2011 at 10:39 PM,  <Anand.Nigam@rbs.com> wrote:
> Could someone please give me a pointer to where I can download the latest patch?
>
> Thanks & Regards,
> Anand
>
>
> -----Original Message-----
> From: Michael McCandless [mailto:lucene@mikemccandless.com]
> Sent: 26 September 2011 22:47
> To: dev@lucene.apache.org
> Subject: Re: [jira] [Assigned] (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.
>
> Is it possible you are using an old patch on the issue?
>
> The newer patch worked very recently for me on 3.x.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Mon, Sep 26, 2011 at 3:53 AM,  <Anand.Nigam@rbs.com> wrote:
>> Hi,
>>
>> I am using Solr 3.4.0 and I want to use this patch. There is a compilation error in the 'TermInfosReader' class in the patch, as it is not able to find the following classes:
>>
>> import org.apache.lucene.util.cache.Cache;
>> import org.apache.lucene.util.cache.SimpleLRUCache;
>>
>> A Google search shows that these classes were present in 'lucene-core-2.4.1', whereas solr-3.4.0 includes 'lucene-core-3.4.0', which does not have the above classes.
>>
>> Thanks & Regards,
>> Anand
>>
>>
>> Anand Nigam
>> RBS Global Banking & Markets
>> Office: +91 124 492 5506
>>
>> -----Original Message-----
>> From: Michael McCandless (JIRA) [mailto:jira@apache.org]
>> Sent: 16 September 2011 01:03
>> To: dev@lucene.apache.org
>> Subject: [jira] [Assigned] (LUCENE-2205) Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.
>>
>>
>>     [ https://issues.apache.org/jira/browse/LUCENE-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>
>> Michael McCandless reassigned LUCENE-2205:
>> ------------------------------------------
>>
>>    Assignee: Michael McCandless
>>
>>> Rework of the TermInfosReader class to remove the Terms[], TermInfos[], and the index pointer long[] and create a more memory efficient data structure.
>>> ----------------------------------------------------------------------
>>>
>>>                 Key: LUCENE-2205
>>>                 URL: https://issues.apache.org/jira/browse/LUCENE-2205
>>>             Project: Lucene - Java
>>>          Issue Type: Improvement
>>>          Components: core/index
>>>         Environment: Java5
>>>            Reporter: Aaron McCurry
>>>            Assignee: Michael McCandless
>>>             Fix For: 3.5
>>>
>>>         Attachments: RandomAccessTest.java, TermInfosReader.java,
>>> TermInfosReaderIndex.java, TermInfosReaderIndexDefault.java,
>>> TermInfosReaderIndexSmall.java, patch-final.txt, rawoutput.txt
>>>
>>>
>>> Basically packing those three arrays into a byte array with an int array as an index offset.
>>> The performance benefits are staggering on my test index (of size 6.2 GB, with ~1,000,000 documents and ~175,000,000 terms): the memory needed to load the terminfos into memory was reduced to 17% of its original size, from 291.5 MB to 49.7 MB.  Random access speed improved by 1-2%, load time of the segments is ~40% faster as well, and full GCs on my JVM were made 7 times faster.
>>> I have already performed the work and am offering this code as a patch.  Currently all tests in the trunk pass with this new code enabled.  I did write a system property switch to allow for the original implementation to be used as well.
>>> -Dorg.apache.lucene.index.TermInfosReader=default or small
>>> I have also written a blog post about this patch; here is the link:
>>> http://www.nearinfinity.com/blogs/aaron_mccurry/my_first_lucene_patch.html
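
The packing idea in the issue description above (parallel arrays folded into one byte array, with an int array of offsets as the index) might look roughly like the sketch below. This is an illustration only and is not the code from the LUCENE-2205 patch: the class name PackedTermIndex, its methods, and the reduction of a TermInfo to a single docFreq value are all assumptions made to keep the example short. It also uses modern Java rather than the Java 5 environment named in the issue.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; NOT the implementation attached to LUCENE-2205.
final class PackedTermIndex {
    private final byte[] data;     // every entry serialized back to back
    private final ByteBuffer buf;  // big-endian view over data for absolute reads
    private final int[] offsets;   // offsets[i] = start of entry i inside data

    // The three arrays are parallel; terms must be sorted by String.compareTo.
    // Assumption: a TermInfo is reduced to one int (docFreq) for brevity.
    PackedTermIndex(String[] terms, int[] docFreqs, long[] indexPointers) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        offsets = new int[terms.length];
        try {
            for (int i = 0; i < terms.length; i++) {
                offsets[i] = out.size();           // bytes written so far
                byte[] utf8 = terms[i].getBytes(StandardCharsets.UTF_8);
                out.writeInt(utf8.length);         // term length prefix
                out.write(utf8);                   // term bytes
                out.writeInt(docFreqs[i]);         // simplified TermInfo
                out.writeLong(indexPointers[i]);   // index pointer
            }
            out.flush();
        } catch (IOException e) {
            // Not expected when writing to an in-memory stream.
            throw new IllegalStateException(e);
        }
        data = bytes.toByteArray();
        buf = ByteBuffer.wrap(data);
    }

    int size() { return offsets.length; }

    String term(int i) {
        int len = buf.getInt(offsets[i]);
        return new String(data, offsets[i] + 4, len, StandardCharsets.UTF_8);
    }

    int docFreq(int i) {
        int len = buf.getInt(offsets[i]);
        return buf.getInt(offsets[i] + 4 + len);
    }

    long indexPointer(int i) {
        int len = buf.getInt(offsets[i]);
        return buf.getLong(offsets[i] + 4 + len + 4);
    }

    // Binary search over the packed entries; returns the entry index or -1.
    int indexOf(String target) {
        int lo = 0, hi = offsets.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = term(mid).compareTo(target);
            if (cmp < 0) lo = mid + 1;
            else if (cmp > 0) hi = mid - 1;
            else return mid;
        }
        return -1;
    }

    public static void main(String[] args) {
        PackedTermIndex idx = new PackedTermIndex(
            new String[] {"apple", "banana", "cherry"},
            new int[] {3, 7, 1},
            new long[] {0L, 128L, 256L});
        int i = idx.indexOf("banana");
        System.out.println(i + " " + idx.docFreq(i) + " " + idx.indexPointer(i));
    }
}
```

The point of the layout is that the heap holds two arrays (one byte[] and one int[]) instead of one object per term, which is the kind of change that would plausibly reduce both memory use and GC work as the description reports. The sketch decodes a String on every comparison for simplicity; a tuned version would likely compare bytes in place.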
>>
>> --
>> This message is automatically generated by JIRA.
>> For more information on JIRA, see:
>> http://www.atlassian.com/software/jira
>>
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>>
>>
>
>
>

