jackrabbit-users mailing list archives

From Vikram Vaswani <vikram.vasw...@loudcloudsystems.com>
Subject RE: Problem deleting from DB datastore
Date Thu, 28 Aug 2014 11:45:10 GMT
Hi Thomas

We are uploading files to the repository with a DAV client, in this case a Windows client
called CarrotDAV. When a file is uploaded through the client, a new entry appears
in the datastore table. When the repository is explored with a tool like Jackrabbit Explorer,
the new file shows up as a node, but no versioning property is visible on it.

I checked and the version workspace is quite small and does not seem to be increasing as we
add new files. I would think this means that versioning is not in use - would you agree?

Can you think of any other reason why the unused files are not deleted when garbage
collection runs?
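
To make the question concrete, here is a toy model of the mark-and-sweep idea behind data store garbage collection. It is only a sketch and not Jackrabbit's implementation: the class `MarkSweepSketch` and all of its methods are hypothetical, and it assumes that the mark phase refreshes the timestamp of every record that is still referenced from anywhere (including version history) and that the sweep phase deletes records whose timestamp is older than the start of the mark phase.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: hypothetical names, not the Jackrabbit API.
public class MarkSweepSketch {
    // record id -> last-modified timestamp
    private final Map<String, Long> lastModified = new HashMap<>();

    public void add(String id, long time) {
        lastModified.put(id, time);
    }

    // Mark: refresh the timestamp of every record that is still referenced.
    public void mark(Collection<String> referenced, long now) {
        for (String id : referenced) {
            if (lastModified.containsKey(id)) {
                lastModified.put(id, now);
            }
        }
    }

    // Sweep: delete records not touched since the mark phase started.
    // Returns the number of records removed.
    public int sweep(long markStart) {
        int before = lastModified.size();
        lastModified.values().removeIf(t -> t < markStart);
        return before - lastModified.size();
    }

    public Set<String> ids() {
        return new TreeSet<>(lastModified.keySet());
    }

    public static void main(String[] args) {
        MarkSweepSketch store = new MarkSweepSketch();
        store.add("a", 1); // referenced by a node in the workspace
        store.add("b", 2); // referenced only by an old version
        store.add("c", 3); // no longer referenced anywhere
        long markStart = 10;
        store.mark(Arrays.asList("a", "b"), markStart);
        int removed = store.sweep(markStart);
        System.out.println(removed + " removed, remaining " + store.ids());
    }
}
```

In this model only "c" is removed. Any reference that the mark phase can still reach keeps a record alive, and so does any timestamp newer than the start of the mark phase.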

Thanks,
Vikram
________________________________________
From: Thomas Mueller <mueller@adobe.com>
Sent: Thursday, August 28, 2014 1:19 PM
To: users@jackrabbit.apache.org
Subject: Re: Problem deleting from DB datastore

Hi,

>How would I disable versioning?

I don't know your application so I can't tell if you are using it or not,
or how to disable it.


>Or could you suggest an easy test I could run to see if this is the
>problem?

Yes, there is a "version" workspace that contains the old versions
("schemaObjectPrefix" value="version_" below). You can check the size of
that. If this workspace is very large, then you are using versioning.
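
One way to run this check is to count rows over JDBC. The following is a minimal sketch under stated assumptions: it assumes the bundle persistence manager stores its data in a table named `<schemaObjectPrefix>BUNDLE` and that the data store table is `ds_DATASTORE`, so verify the actual table names in your schema first. The URL, user and password are taken from the repository.xml quoted below; the class `VersionStoreSize` and its methods are hypothetical, and the MySQL JDBC driver must be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper; table names are assumptions, check them in your schema.
public class VersionStoreSize {

    // Builds a row-count query for prefix + table, rejecting unexpected characters.
    public static String countQuery(String prefix, String table) {
        String name = prefix + table;
        if (!name.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("Unexpected table name: " + name);
        }
        return "SELECT COUNT(*) FROM " + name;
    }

    // Runs the count query on an open connection.
    public static long count(Connection con, String prefix, String table)
            throws SQLException {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(countQuery(prefix, table))) {
            rs.next();
            return rs.getLong(1);
        }
    }

    public static void main(String[] args) throws SQLException {
        String url = "jdbc:mysql://localhost:3306/jackrabbit";
        try (Connection con = DriverManager.getConnection(url, "root", "")) {
            System.out.println("version bundles: "
                    + count(con, "version_", "BUNDLE"));
            System.out.println("default workspace bundles: "
                    + count(con, "pm_default_", "BUNDLE"));
            System.out.println("data store records: "
                    + count(con, "ds_", "DATASTORE"));
        }
    }
}
```

Running it before and after a batch of uploads shows whether the version tables grow along with the data store table.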

Regards,
Thomas



>
>I looked in the docs but couldn't find a way to do this. Many thanks for
>any help you can provide.
>
>My repository.xml is below for your reference:
>
><Repository>
>  <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
>      <param name="driver" value="com.mysql.jdbc.Driver" />
>      <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit" />
>      <param name="schema" value="mysql" />
>      <param name="user" value="root" />
>      <param name="password" value="" />
>      <param name="schemaObjectPrefix" value="fsrep_"/>
>  </FileSystem>
>  <Security appName="Jackrabbit">
>    <AccessManager class="org.apache.jackrabbit.core.security.SimpleAccessManager"></AccessManager>
>    <LoginModule class="org.apache.jackrabbit.core.security.SimpleLoginModule">
>      <param name="anonymousId" value="anonymous" />
>    </LoginModule>
>  </Security>
>  <DataStore class="org.apache.jackrabbit.core.data.db.DbDataStore">
>      <param name="driver" value="com.mysql.jdbc.Driver" />
>      <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit" />
>      <param name="schema" value="mysql" /><!-- warning, this is not the
>schema name, it's the db type -->
>      <param name="user" value="root" />
>      <param name="password" value="" />
>      <param name="schemaObjectPrefix" value="ds_" />
>  </DataStore>
>  <Workspaces rootPath="${rep.home}/workspaces"
>defaultWorkspace="default" />
>  <Workspace name="default">
>    <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
>      <param name="driver" value="com.mysql.jdbc.Driver" />
>      <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit" />
>      <param name="schema" value="mysql" /><!-- warning, this is not the
>schema name, it's the db type -->
>      <param name="user" value="root" />
>      <param name="password" value="" />
>      <param name="schemaObjectPrefix" value="fsws_${wsp.name}_"/>
>    </FileSystem>
>    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.MySqlPersistenceManager">
>      <param name="driver" value="com.mysql.jdbc.Driver" />
>      <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit" />
>      <param name="schema" value="mysql" /><!-- warning, this is not the
>schema name, it's the db type -->
>      <param name="user" value="root" />
>      <param name="password" value="" />
>      <param name="schemaObjectPrefix" value="pm_${wsp.name}_" />
>      <param name="externalBLOBs" value="false" />
>    </PersistenceManager>
>    <SearchIndex
>class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
>      <param name="path" value="${wsp.home}/index" />
>      <param name="useCompoundFile" value="true" />
>      <param name="minMergeDocs" value="100" />
>      <param name="volatileIdleTime" value="3" />
>      <param name="maxMergeDocs" value="100000" />
>      <param name="mergeFactor" value="10" />
>      <param name="maxFieldLength" value="10000" />
>      <param name="bufferSize" value="10" />
>      <param name="cacheSize" value="1000" />
>      <param name="forceConsistencyCheck" value="false" />
>      <param name="autoRepair" value="true" />
>      <param name="analyzer"
>value="org.apache.lucene.analysis.standard.StandardAnalyzer" />
>      <param name="queryClass"
>value="org.apache.jackrabbit.core.query.QueryImpl" />
>      <param name="respectDocumentOrder" value="true" />
>      <param name="resultFetchSize" value="2147483647" />
>      <param name="extractorPoolSize" value="3" />
>      <param name="extractorTimeout" value="100" />
>      <param name="extractorBackLogSize" value="100" />
>      <param name="textFilterClasses"
>        value="org.apache.jackrabbit.extractor.MsWordTextExtractor,
>               org.apache.jackrabbit.extractor.MsExcelTextExtractor,
>               org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,
>               org.apache.jackrabbit.extractor.PdfTextExtractor,
>               org.apache.jackrabbit.extractor.OpenOfficeTextExtractor,
>               org.apache.jackrabbit.extractor.RTFTextExtractor,
>               org.apache.jackrabbit.extractor.HTMLTextExtractor,
>               org.apache.jackrabbit.extractor.PlainTextExtractor,
>               org.apache.jackrabbit.extractor.XMLTextExtractor" />
>    </SearchIndex>
>  </Workspace>
>  <Versioning rootPath="${rep.home}/version">
>    <FileSystem class="org.apache.jackrabbit.core.fs.db.DbFileSystem">
>      <param name="driver" value="com.mysql.jdbc.Driver" />
>      <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit" />
>      <param name="schema" value="mysql" /><!-- warning, this is not the
>schema name, it's the db type -->
>      <param name="user" value="root" />
>      <param name="password" value="" />
>      <param name="schemaObjectPrefix" value="fsver_"/>
>    </FileSystem>
>    <PersistenceManager class="org.apache.jackrabbit.core.persistence.bundle.MySqlPersistenceManager">
>      <param name="driver" value="com.mysql.jdbc.Driver" />
>      <param name="url" value="jdbc:mysql://localhost:3306/jackrabbit" />
>      <param name="schema" value="mysql" /><!-- warning, this is not the
>schema name, it's the db type -->
>      <param name="user" value="root" />
>      <param name="password" value="" />
>      <param name="schemaObjectPrefix" value="version_" />
>      <param name="externalBLOBs" value="false" />
>    </PersistenceManager>
>  </Versioning>
></Repository>
>
>
>
>________________________________________
>From: Thomas Mueller <mueller@adobe.com>
>Sent: Wednesday, August 27, 2014 2:12 PM
>To: users@jackrabbit.apache.org
>Subject: Re: Problem deleting from DB datastore
>
>Hi,
>
>The code looks good to me. Maybe you are using versioning, so the old
>binaries are still referenced?
>
>Regards,
>Thomas
>
>
>On 27/08/14 09:43, "Vikram Vaswani" <vikram.vaswani@loudcloudsystems.com>
>wrote:
>
>>Hi all
>>
>>
>>I have a Jackrabbit 2.8.0 setup and am using MySQL as the backend for the
>>data store, versioning etc. Our repository is quite active with 100+
>>documents being added daily. When a new document is added, I can see that
>>a new record is also added to the datastore table in MySQL.
>>
>>
>>I am trying to find a way to have Jackrabbit automatically delete unused
>>(relating to deleted content) records from the datastore table. I have
>>read about Jackrabbit's garbage collection and I created a servlet to try
>>and use this.
>>
>>
>>Despite my best efforts however, when I run the servlet code there is no
>>removal of records from the datastore table, even though I know for a
>>fact that there are a high number of records referencing
>>previously-deleted docs.
>>
>>
>>My servlet code, which runs on Tomcat, is below. Could you please help me
>>identify what I might be doing wrong?
>>
>>
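>>import java.io.IOException;
>>import java.io.OutputStream;
>>import java.util.Properties;
>>import javax.jcr.RepositoryException;
>>import javax.jcr.Session;
>>import javax.jcr.SimpleCredentials;
>>import javax.servlet.ServletContext;
>>import javax.servlet.ServletException;
>>import javax.servlet.annotation.WebServlet;
>>import javax.servlet.http.HttpServlet;
>>import javax.servlet.http.HttpServletRequest;
>>import javax.servlet.http.HttpServletResponse;
>>import org.apache.jackrabbit.api.JackrabbitRepository;
>>import org.apache.jackrabbit.api.JackrabbitRepositoryFactory;
>>import org.apache.jackrabbit.api.management.DataStoreGarbageCollector;
>>import org.apache.jackrabbit.api.management.RepositoryManager;
>>import org.apache.jackrabbit.core.RepositoryFactoryImpl;
>>
>>@WebServlet("/GCServlet")
>>public class GCServlet extends HttpServlet {
>>
>>    private static org.apache.log4j.Logger logger = org.apache.log4j.Logger.getLogger(GCServlet.class);
>>    private static final long serialVersionUID = 1L;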
>>
>>    /**
>>     * Runs the data store garbage collector for the repository configured
>>     * under /opt/jackrabbit and returns the number of records removed.
>>     *
>>     * @param out the response stream (currently unused)
>>     * @throws RepositoryException if the repository cannot be accessed
>>     */
>>    private int runDataStoreGarbageCollector(OutputStream out)
>>            throws RepositoryException {
>>        JackrabbitRepositoryFactory rf = new RepositoryFactoryImpl();
>>        Properties prop = new Properties();
>>        javax.servlet.ServletContext servletContext = getServletContext();
>>        String path = servletContext.getRealPath(".");
>>        System.out.println("Real Path =>" + path);
>>        prop.setProperty("org.apache.jackrabbit.repository.home", "/opt/jackrabbit");
>>        prop.setProperty("org.apache.jackrabbit.repository.conf", "/opt/jackrabbit/repository.xml");
>>        JackrabbitRepository rep = (JackrabbitRepository) rf.getRepository(prop);
>>        // need to login to start the repository
>>        Session session = rep.login(new SimpleCredentials("admin", "admin".toCharArray()));
>>        RepositoryManager rm = rf.getRepositoryManager(rep);
>>        try {
>>            DataStoreGarbageCollector gc = rm.createDataStoreGarbageCollector();
>>            try {
>>                gc.mark();
>>                // sweep() returns the number of data records it removed
>>                return gc.sweep();
>>            } finally {
>>                gc.close();
>>            }
>>        } finally {
>>            // release the session and stop the repository even if GC fails
>>            session.logout();
>>            rm.stop();
>>        }
>>    }
>>
>>    /**
>>     * @see HttpServlet#doGet(HttpServletRequest request, HttpServletResponse response)
>>     */
>>    protected void doGet(HttpServletRequest request, HttpServletResponse response)
>>            throws ServletException, IOException {
>>        // getParameter returns null when "gcExecute" is absent
>>        String gcEnabled = request.getParameter("gcExecute");
>>        System.out.println("gcEnabled==" + gcEnabled);
>>        response.setContentType("text/html");
>>        OutputStream outputStream = response.getOutputStream();
>>        outputStream.write("<html><body>Starting Garbage Collection <p/>".getBytes());
>>        if ("1".equals(gcEnabled)) {
>>            try {
>>                int result = runDataStoreGarbageCollector(outputStream);
>>                // write the count as text; write(int) would emit a single raw byte
>>                outputStream.write(String.valueOf(result).getBytes());
>>            } catch (RepositoryException e) {
>>                logger.error("Data store garbage collection failed", e);
>>            }
>>        }
>>        // finish the page before the stream is flushed and closed
>>        outputStream.write("</body></html>".getBytes());
>>        outputStream.flush();
>>        outputStream.close();
>>    }
>>}
>>
>>
>>I have also attached the relevant section of the Tomcat log file.
>>
>>
>>Vikram
>>
>>
>>
>>________________________________
>>This email may contain proprietary, privileged and confidential
>>information and is sent for the intended recipient(s) only. If, by an
>>addressing or transmission error, this mail has been misdirected to you,
>>you are requested to notify us immediately by return email message and
>>delete this email and its attachments. You are also hereby notified that
>>any use, any form of reproduction, dissemination, copying, disclosure,
>>modification, distribution and/or publication of this email message,
>>contents or its attachment(s) other than by its intended recipient(s) is
>>strictly prohibited. Any opinions expressed in this email are those of
>>the individual and may not necessarily represent those of LoudCloud
>>Systems. Before opening attachment(s), please scan for viruses. It is
>>further notified that email transmission cannot be guaranteed to be
>>secure or error-free as information could be intercepted, corrupted,
>>lost, destroyed, arrive late or incomplete, or may contain viruses. The
>>sender therefore does not accept liability for any error or omission in
>>the contents of this message, which arise as a result of email
>>transmission. LoudCloud Systems Inc. and its subsidiaries do not accept
>>liability for damage caused by this email or any attachments and may
>>monitor email traffic.
>>________________________________
>
>


