jackrabbit-dev mailing list archives

From "Alex Parvulescu (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (JCR-3146) Text extraction may congest thread pool in the repository
Date Mon, 14 Nov 2011 14:08:52 GMT

     [ https://issues.apache.org/jira/browse/JCR-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated JCR-3146:
---------------------------------

    Attachment: JCR-3146.patch

The solution is to define a second queue for the tasks considered low priority, so that
they don't fill the execution queue.
The executor then polls this queue for additional work items, depending on its current load.
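
For readers without the patch at hand, the two-queue idea can be sketched roughly as below. This is a minimal illustration and not the code in JCR-3146.patch: the class name LowPriorityAwarePool, the method executeLowPriority, the in-flight counter, and the way load is computed (in-flight tasks divided by pool size) are all assumptions made for the example. It also assumes that a threshold of 0 means "no throttling".

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of a pool with a secondary low-priority queue. */
class LowPriorityAwarePool extends ThreadPoolExecutor {

    // Low-priority tasks wait here instead of in the main execution queue.
    private final Queue<Runnable> lowPriority = new ConcurrentLinkedQueue<>();
    // Tasks handed to the main queue that have not finished yet.
    private final AtomicInteger inFlight = new AtomicInteger();
    private final int threads;
    private final int maxLoadPercent;

    LowPriorityAwarePool(int threads, int maxLoadPercent) {
        super(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        this.threads = threads;
        this.maxLoadPercent = maxLoadPercent;
    }

    @Override
    public void execute(Runnable task) {
        inFlight.incrementAndGet();
        try {
            super.execute(task);
        } catch (RejectedExecutionException e) {
            inFlight.decrementAndGet(); // task never entered the pool
            throw e;
        }
    }

    /** Parks the task in the secondary queue unless throttling is off. */
    void executeLowPriority(Runnable task) {
        if (maxLoadPercent == 0) { // assumption: 0 means no throttling
            execute(task);
            return;
        }
        lowPriority.add(task);
        promote();
    }

    int pendingLowPriority() {
        return lowPriority.size();
    }

    // Assumed load measure: in-flight tasks as a percentage of pool size.
    private boolean belowLoadLimit() {
        return 100 * inFlight.get() / threads < maxLoadPercent;
    }

    // Moves parked tasks into the main queue while the load allows it.
    private void promote() {
        while (belowLoadLimit()) {
            Runnable next = lowPriority.poll();
            if (next == null) {
                return;
            }
            execute(next);
        }
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        inFlight.decrementAndGet();
        promote(); // a slot was freed, so poll the secondary queue
    }
}
```

In this sketch, regular tasks go through execute() as usual, while text extraction jobs would go through executeLowPriority() and only reach the worker threads once the measured load drops below the threshold. The load check is not atomic with the promotion, so the limit can be exceeded slightly under contention, and shutdown with parked tasks is not handled.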

The secondary queue is only used as needed, and the load threshold is configurable via the
system property
"org.apache.jackrabbit.core.JackrabbitThreadPool.maxLoadForLowPriorityTasks".
The value is a percentage: 0 means disabled, and the default is 75.
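
As an illustration of how such a setting could be read, here is a small sketch. The property name and the default of 75 come from the description above; the class name MaxLoadSetting and the choice to fall back to the default for unparseable or out-of-range values are assumptions for the example and may differ from what the patch does.

```java
/** Hypothetical helper; not taken from JCR-3146.patch. */
final class MaxLoadSetting {

    static final String PROPERTY =
            "org.apache.jackrabbit.core.JackrabbitThreadPool.maxLoadForLowPriorityTasks";
    static final int DEFAULT_PERCENT = 75;

    private MaxLoadSetting() {
    }

    /** Returns the configured percentage, where 0 means disabled. */
    static int read() {
        // Integer.getInteger yields the default when the property is
        // missing or is not a valid integer.
        int value = Integer.getInteger(PROPERTY, DEFAULT_PERCENT);
        // Assumption: values outside 0..100 are treated as invalid.
        if (value < 0 || value > 100) {
            return DEFAULT_PERCENT;
        }
        return value;
    }
}
```

The property would be set on the JVM command line in the usual way, for example -Dorg.apache.jackrabbit.core.JackrabbitThreadPool.maxLoadForLowPriorityTasks=50.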

There are some timing issues with the indexing tests on account of this new async text extraction.
I've tried to fix all of them, but there may be more.

I haven't yet touched the Tika extraction that happens in a different process. I think
that will need some minor refactoring as well.

Attaching proposed patch.


                
> Text extraction may congest thread pool in the repository
> ---------------------------------------------------------
>
>                 Key: JCR-3146
>                 URL: https://issues.apache.org/jira/browse/JCR-3146
>             Project: Jackrabbit Content Repository
>          Issue Type: Improvement
>          Components: jackrabbit-core
>            Reporter: Alex Parvulescu
>            Priority: Minor
>         Attachments: JCR-3146.patch
>
>
> Text extraction congests the thread pool in the repository when e.g. many PDFs are loaded
> into the workspace. Tasks submitted by the index merger are delayed because of that and will
> result in many index segment folders.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
