jackrabbit-users mailing list archives

From Thomas Mueller <muel...@adobe.com>
Subject Re: Problem deleting from DB datastore
Date Thu, 28 Aug 2014 14:55:12 GMT
Hi,

Sorry, but I'm afraid I've run out of ideas about what the problem could
be. I would probably try exporting all the data, re-creating the
repository, and then checking whether that frees up space.

Regards,
Thomas



On 28/08/14 13:45, "Vikram Vaswani" <vikram.vaswani@loudcloudsystems.com>
wrote:

>Hi Thomas
>
>We are uploading files to the repository using a DAV client, in this
>case a Windows client called CarrotDAV. When a file is uploaded through
>the client, a new entry appears in the datastore table. When the
>repository is explored with a tool like Jackrabbit Explorer, the new file
>is shown as a node. However, there is no versioning property visible on
>the node.
>
>I checked and the version workspace is quite small and does not seem to
>be increasing as we add new files. I would think this means that
>versioning is not in use - would you agree?
>
>Could you think of any other reason why the unused files would not be
>deleted at garbage collection time?
>
>Thanks,
>Vikram
>________________________________________
>From: Thomas Mueller <mueller@adobe.com>
>Sent: Thursday, August 28, 2014 1:19 PM
>To: users@jackrabbit.apache.org
>Subject: Re: Problem deleting from DB datastore
>
>Hi,
>
>>How would I disable versioning?
>
>I don't know your application so I can't tell if you are using it or not,
>or how to disable it.
>
>
>>Or could you suggest an easy test I could run to see if this is the
>>problem?
>
>Yes, there is a "version" workspace that contains the old versions
>("schemaObjectPrefix" value="version_" below). You can check the size of
>that. If this workspace is very large, then you are using versioning.
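
One rough way to check that size is to ask MySQL's information_schema for
the tables whose names start with the prefix. The sketch below assumes the
database name "jackrabbit" from the JDBC URL in the configuration and an
already-open JDBC connection. The class and method names are made up for
illustration, and the byte counts MySQL reports for InnoDB tables are
estimates.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

/** Illustrative helper (hypothetical name), not part of Jackrabbit. */
class VersionTableSize {

    // Sums data and index size per table for one database and name prefix.
    static final String SIZE_QUERY =
          "SELECT table_name, data_length + index_length AS bytes "
        + "FROM information_schema.tables "
        + "WHERE table_schema = ? AND table_name LIKE ? "
        + "ORDER BY bytes DESC";

    /** Escapes LIKE wildcards in the prefix and appends a trailing %. */
    static String likePattern(String prefix) {
        String escaped = prefix
            .replace("\\", "\\\\")
            .replace("%", "\\%")
            .replace("_", "\\_");
        return escaped + "%";
    }

    /** Prints one line per matching table: name, then approximate bytes. */
    static void printSizes(Connection con, String database, String prefix)
            throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(SIZE_QUERY)) {
            ps.setString(1, database);
            ps.setString(2, likePattern(prefix));
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + "\t" + rs.getLong(2));
                }
            }
        }
    }
}
```

Calling printSizes(con, "jackrabbit", "version_") and then repeating it
with "pm_default_" and "ds_" would give a side-by-side view of the
versioning, workspace, and datastore tables.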
>
>Regards,
>Thomas
>
>
>
>>
>>I looked in the docs but couldn't find a way to do this. Many thanks for
>>any help you can provide.
>>
>>My repository.xml is below for your reference:
>>
>><Repository>
>>  <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
>>      <param name="driver" value="com.mysql.jdbc.Driver" />
>>      <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit" />
>>      <param name="schema" value="mysql" />
>>      <param name="user" value="root" />
>>      <param name="password" value="" />
>>      <param name="schemaObjectPrefix" value="fsrep_"/>
>>  </FileSystem>
>>  <Security appName="Jackrabbit">
>>    <AccessManager class="org.apache.jackrabbit.core.security.SimpleAccessManager"></AccessManager>
>>    <LoginModule
>>class="org.apache.jackrabbit.core.security.SimpleLoginModule">
>>      <param name="anonymousId" value="anonymous" />
>>    </LoginModule>
>>  </Security>
>>  <DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">
>>      <param name="driver" value="com.mysql.jdbc.Driver" />
>>      <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit" />
>>      <param name="schema" value="mysql" /><!-- warning, this is not the
>>schema name, it's the db type -->
>>      <param name="user" value="root" />
>>      <param name="password" value="" />
>>      <param name="schemaObjectPrefix" value="ds_" />
>>  </DataStore>
>>  <Workspaces rootPath="${rep.home}/workspaces"
>>defaultWorkspace="default" />
>>  <Workspace name="default">
>>    <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
>>      <param name="driver" value="com.mysql.jdbc.Driver" />
>>      <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit" />
>>      <param name="schema" value="mysql" /><!-- warning, this is not the
>>schema name, it's the db type -->
>>      <param name="user" value="root" />
>>      <param name="password" value="" />
>>      <param name="schemaObjectPrefix" value="fsws_${wsp.name}_"/>
>>    </FileSystem>
>>    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.MySqlPersistenceManager">
>>      <param name="driver" value="com.mysql.jdbc.Driver" />
>>      <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit" />
>>      <param name="schema" value="mysql" /><!-- warning, this is not the
>>schema name, it's the db type -->
>>      <param name="user" value="root" />
>>      <param name="password" value="" />
>>      <param name="schemaObjectPrefix" value="pm_${wsp.name}_" />
>>      <param name="externalBLOBs" value="false" />
>>    </PersistenceManager>
>>    <SearchIndex
>>class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
>>      <param name="path" value="${wsp.home}/index" />
>>      <param name="useCompoundFile" value="true" />
>>      <param name="minMergeDocs" value="100" />
>>      <param name="volatileIdleTime" value="3" />
>>      <param name="maxMergeDocs" value="100000" />
>>      <param name="mergeFactor" value="10" />
>>      <param name="maxFieldLength" value="10000" />
>>      <param name="bufferSize" value="10" />
>>      <param name="cacheSize" value="1000" />
>>      <param name="forceConsistencyCheck" value="false" />
>>      <param name="autoRepair" value="true" />
>>      <param name="analyzer"
>>value="org.apache.lucene.analysis.standard.StandardAnalyzer" />
>>      <param name="queryClass"
>>value="org.apache.jackrabbit.core.query.QueryImpl" />
>>      <param name="respectDocumentOrder" value="true" />
>>      <param name="resultFetchSize" value="2147483647" />
>>      <param name="extractorPoolSize" value="3" />
>>      <param name="extractorTimeout" value="100" />
>>      <param name="extractorBackLogSize" value="100" />
>>      <param name="textFilterClasses"
>>        value="org.apache.jackrabbit.extractor.MsWordTextExtractor,
>>               org.apache.jackrabbit.extractor.MsExcelTextExtractor,
>>               org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,
>>               org.apache.jackrabbit.extractor.PdfTextExtractor,
>>               org.apache.jackrabbit.extractor.OpenOfficeTextExtractor,
>>               org.apache.jackrabbit.extractor.RTFTextExtractor,
>>               org.apache.jackrabbit.extractor.HTMLTextExtractor,
>>               org.apache.jackrabbit.extractor.PlainTextExtractor,
>>               org.apache.jackrabbit.extractor.XMLTextExtractor" />
>>    </SearchIndex>
>>  </Workspace>
>>  <Versioning rootPath="${rep.home}/version">
>>    <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
>>      <param name="driver" value="com.mysql.jdbc.Driver" />
>>      <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit" />
>>      <param name="schema" value="mysql" /><!-- warning, this is not the
>>schema name, it's the db type -->
>>      <param name="user" value="root" />
>>      <param name="password" value="" />
>>      <param name="schemaObjectPrefix" value="fsver_"/>
>>    </FileSystem>
>>    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.MySqlPersistenceManager">
>>      <param name="driver" value="com.mysql.jdbc.Driver" />
>>      <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit" />
>>      <param name="schema" value="mysql" /><!-- warning, this is not the
>>schema name, it's the db type -->
>>      <param name="user" value="root" />
>>      <param name="password" value="" />
>>      <param name="schemaObjectPrefix" value="version_" />
>>      <param name="externalBLOBs" value="false" />
>>    </PersistenceManager>
>>  </Versioning>
>></Repository>
>>
>>
>>
>>________________________________________
>>From: Thomas Mueller <mueller@adobe.com>
>>Sent: Wednesday, August 27, 2014 2:12 PM
>>To: users@jackrabbit.apache.org
>>Subject: Re: Problem deleting from DB datastore
>>
>>Hi,
>>
>>The code looks good to me. Maybe you are using versioning, so the old
>>binaries are still referenced?
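
To illustrate why a reference from version storage would keep a binary
alive, here is a toy mark-and-sweep model in plain Java. It is a simplified
sketch, not Jackrabbit's implementation: the class name, the logical clock,
and the idea of passing reference sets as plain collections are all
assumptions made for the example. The mark phase touches every record that
is still referenced from anywhere, and the sweep phase deletes only records
that were not touched since the mark started.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of a mark-and-sweep data store (hypothetical, for illustration). */
class ToyDataStore {
    // record id -> last-modified time on a logical clock
    private final Map<String, Long> lastModified = new HashMap<>();
    private long clock = 0;

    void add(String id) {
        lastModified.put(id, ++clock);
    }

    /** Mark: touch every record referenced by any set; returns the mark start time. */
    long mark(List<Set<String>> referenceSets) {
        long start = ++clock;
        for (Set<String> refs : referenceSets) {
            for (String id : refs) {
                if (lastModified.containsKey(id)) {
                    lastModified.put(id, ++clock);
                }
            }
        }
        return start;
    }

    /** Sweep: delete records not touched since the mark started; returns the count. */
    int sweep(long markStart) {
        int before = lastModified.size();
        lastModified.values().removeIf(t -> t < markStart);
        return before - lastModified.size();
    }

    boolean contains(String id) {
        return lastModified.containsKey(id);
    }

    public static void main(String[] args) {
        ToyDataStore ds = new ToyDataStore();
        ds.add("a");
        ds.add("b");
        ds.add("c");
        Set<String> workspace = new HashSet<>(Arrays.asList("a"));
        // "b" was deleted from the workspace but an old version still points at it
        Set<String> versions = new HashSet<>(Arrays.asList("b"));
        long start = ds.mark(Arrays.asList(workspace, versions));
        System.out.println("deleted=" + ds.sweep(start));
        System.out.println("b kept=" + ds.contains("b"));
    }
}
```

In this model only "c" is removed, because "b" is still reachable through
the version set even though the workspace no longer refers to it.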
>>
>>Regards,
>>Thomas
>>
>>
>>On 27/08/14 09:43, "Vikram Vaswani" <vikram.vaswani@loudcloudsystems.com>
>>wrote:
>>
>>>Hi all
>>>
>>>
>>>I have a Jackrabbit 2.8.0 setup and am using MySQL as the backend for
>>>the data store, versioning etc. Our repository is quite active, with
>>>100+ documents being added daily. When a new document is added, I can
>>>see that a new record is also added to the datastore table in MySQL.
>>>
>>>
>>>I am trying to find a way to have Jackrabbit automatically delete unused
>>>records (those relating to deleted content) from the datastore table. I
>>>have read about Jackrabbit's garbage collection and created a servlet
>>>to try to use it.
>>>
>>>
>>>Despite my best efforts however, when I run the servlet code there is no
>>>removal of records from the datastore table, even though I know for a
>>>fact that there are a high number of records referencing
>>>previously-deleted docs.
>>>
>>>
>>>My servlet code, which runs on Tomcat, is below. Could you please help
>>>me identify what I might be doing wrong?
>>>
>>>
>>>@WebServlet("/GCServlet")
>>>public class GCServlet extends HttpServlet {
>>>
>>>    private static org.apache.log4j.Logger logger =
>>>org.apache.log4j.Logger.getLogger(GCServlet.class);
>>>    private static final long serialVersionUID = 1L;
>>>    private List<RepositoryManager> repositoryManagers;
>>>    /**
>>>     * @see HttpServlet#HttpServlet()
>>>     */
>>>    public GCServlet() {
>>>        super();
>>>        // TODO Auto-generated constructor stub
>>>    }
>>>
>>>    /**
>>>     * Runs the data store garbage collector for the repository
>>>     * configured in /opt/jackrabbit/repository.xml.
>>>     *
>>>     * @param out the servlet output stream (currently unused)
>>>     * @return the number of data store records deleted by the sweep
>>>     * @throws RepositoryException if the repository cannot be accessed
>>>     */
>>>    @SuppressWarnings(value="DM_GC")
>>>    private int runDataStoreGarbageCollector(OutputStream out)
>>>            throws RepositoryException {
>>>        int result = 0;
>>>        JackrabbitRepositoryFactory rf = new RepositoryFactoryImpl();
>>>        Properties prop = new Properties();
>>>        ServletContext servletContext = getServletContext();
>>>        String path = servletContext.getRealPath(".");
>>>        System.out.println("Real Path =>"+path);
>>>        prop.setProperty("org.apache.jackrabbit.repository.home",
>>>                "/opt/jackrabbit");
>>>        prop.setProperty("org.apache.jackrabbit.repository.conf",
>>>                "/opt/jackrabbit/repository.xml");
>>>        JackrabbitRepository rep =
>>>                (JackrabbitRepository) rf.getRepository(prop);
>>>        // need to login to start the repository
>>>        Session session = rep.login(new SimpleCredentials("admin",
>>>                "admin".toCharArray()));
>>>        RepositoryManager rm = rf.getRepositoryManager(rep);
>>>        DataStoreGarbageCollector gc =
>>>                rm.createDataStoreGarbageCollector();
>>>        try {
>>>            gc.mark();
>>>            // sweep() returns the number of deleted records
>>>            result = gc.sweep();
>>>        } finally {
>>>            gc.close();
>>>        }
>>>
>>>        session.logout();
>>>        rm.stop();
>>>        return result;
>>>    }
>>>
>>>    /**
>>>     * @see HttpServlet#doGet(HttpServletRequest request,
>>>     *      HttpServletResponse response)
>>>     */
>>>    protected void doGet(HttpServletRequest request,
>>>            HttpServletResponse response)
>>>            throws ServletException, IOException {
>>>        String gcEnabled = request.getParameter("gcExecute");
>>>        System.out.println("gcEnabled==" + gcEnabled);
>>>        response.setContentType("text/html");
>>>        OutputStream outputStream = response.getOutputStream();
>>>        outputStream.write(
>>>                "<html><body>Starting Garbage Collection <p/>".getBytes());
>>>        // "1".equals(...) is null-safe if the parameter is missing
>>>        if ("1".equals(gcEnabled)) {
>>>            try {
>>>                int result = runDataStoreGarbageCollector(outputStream);
>>>                // write the count as text, not as a single raw byte
>>>                outputStream.write(String.valueOf(result).getBytes());
>>>            } catch (RepositoryException e) {
>>>                e.printStackTrace();
>>>                logger.error(e);
>>>            }
>>>        }
>>>        // the stream stays open until the closing tags are written
>>>        outputStream.write("</body></html>".getBytes());
>>>        outputStream.flush();
>>>    }
>>>}
>>>
>>>
>>>I have also attached the relevant section of the Tomcat log file.
>>>
>>>
>>>Vikram
>>>
>>>
>>>
>>>________________________________
>>>This email may contain proprietary, privileged and confidential
>>>information and is sent for the intended recipient(s) only. If, by an
>>>addressing or transmission error, this mail has been misdirected to you,
>>>you are requested to notify us immediately by return email message and
>>>delete this email and its attachments. You are also hereby notified that
>>>any use, any form of reproduction, dissemination, copying, disclosure,
>>>modification, distribution and/or publication of this email message,
>>>contents or its attachment(s) other than by its intended recipient(s) is
>>>strictly prohibited. Any opinions expressed in this email are those of
>>>the individual and may not necessarily represent those of LoudCloud
>>>Systems. Before opening attachment(s), please scan for viruses. It is
>>>further notified that email transmission cannot be guaranteed to be
>>>secure or error-free as information could be intercepted, corrupted,
>>>lost, destroyed, arrive late or incomplete, or may contain viruses. The
>>>sender therefore does not accept liability for any error or omission in
>>>the contents of this message, which arise as a result of email
>>>transmission. LoudCloud Systems Inc. and its subsidiaries do not accept
>>>liability for damage caused by this email or any attachments and may
>>>monitor email traffic.
>>>________________________________

