From: Thomas Mueller
To: "users@jackrabbit.apache.org"
Subject: Re: Problem deleting from DB datastore
Date: Thu, 28 Aug 2014 14:55:12 +0000

Hi,

Sorry, but I'm afraid I ran out of options what the problem could be. I
would probably try exporting all data, and then re-creating the
repository, and see if this will save space.

Regards,
Thomas

On 28/08/14 13:45, "Vikram Vaswani" wrote:

>Hi Thomas
>
>We are uploading files to the repository using a DAV client. In this
>case, a Windows client called CarrotDAV. When a file is uploaded through
>the client, we see a new entry appearing in the datastore table. When the
>repository is explored with a tool like Jackrabbit Explorer, the new file
>is seen as a node. However there is no versioning property visible on the
>node.
>
>I checked and the version workspace is quite small and does not seem to
>be increasing as we add new files. I would think this means that
>versioning is not in use - would you agree?
>
>Could you think of any other reason why the unused files would not be
>deleted at garbage collection time?
>
>Thanks,
>Vikram
>________________________________________
>From: Thomas Mueller
>Sent: Thursday, August 28, 2014 1:19 PM
>To: users@jackrabbit.apache.org
>Subject: Re: Problem deleting from DB datastore
>
>Hi,
>
>>How would I disable versioning?
>
>I don't know your application so I can't tell if you are using it or not,
>or how to disable it.
>
>
>>Or could you suggest an easy test I could run to see if this is the
>>problem?
>
>Yes, there is a "version" workspace that contains the old versions
>("schemaObjectPrefix" value="version_" below). You can check the size of
>that. If this workspace is very large, then you are using versioning.
>
>Regards,
>Thomas
>
>
>>
>>I looked in the docs but couldn't find a way to do this. Many thanks for
>>any help you can provide.
>>
>>My repository.xml is below for your reference:
>>
>>class="org.apache.jackrabbit.core.security.SimpleAccessManager">
>>class="org.apache.jackrabbit.core.security.SimpleLoginModule">
>>defaultWorkspace="default" />
>>class="org.apache.jackrabbit.core.persistence.bundle.MySqlPersistenceManager">
>>class="org.apache.jackrabbit.core.query.lucene.SearchIndex">
>>value="org.apache.lucene.analysis.standard.StandardAnalyzer" />
>>value="org.apache.jackrabbit.core.query.QueryImpl" />
>>value="org.apache.jackrabbit.extractor.MsWordTextExtractor,
>> org.apache.jackrabbit.extractor.MsExcelTextExtractor,
>> org.apache.jackrabbit.extractor.MsPowerPointTextExtractor,
>> org.apache.jackrabbit.extractor.PdfTextExtractor,
>> org.apache.jackrabbit.extractor.OpenOfficeTextExtractor,
>> org.apache.jackrabbit.extractor.RTFTextExtractor,
>> org.apache.jackrabbit.extractor.HTMLTextExtractor,
>> org.apache.jackrabbit.extractor.PlainTextExtractor,
>> org.apache.jackrabbit.extractor.XMLTextExtractor" />
>>class="org.apache.jackrabbit.core.persistence.bundle.MySqlPersistenceManager">
>>
>>________________________________________
>>From: Thomas Mueller
>>Sent: Wednesday, August 27, 2014 2:12 PM
>>To: users@jackrabbit.apache.org
>>Subject: Re: Problem deleting from DB datastore
>>
>>Hi,
>>
>>The code looks good to me. Maybe you are using versioning, so the old
>>binaries are still referenced?
>>
>>Regards,
>>Thomas
>>
>>
>>On 27/08/14 09:43, "Vikram Vaswani" wrote:
>>
>>>Hi all
>>>
>>>I have a Jackrabbit 2.8.0 setup and am using MySQL as the backend for
>>>the data store, versioning etc. Our repository is quite active with 100+
>>>documents being added daily. When a new document is added, I can see
>>>that a new record is also added to the datastore table in MySQL
>>>
>>>I am trying to find a way to have Jackrabbit automatically delete unused
>>>(relating to deleted content) records from the datastore table. I have
>>>read about Jackrabbit's garbage collection and I created a servlet to
>>>try and use this.
>>>
>>>Despite my best efforts however, when I run the servlet code there is no
>>>removal of records from the datastore table, even though I know for a
>>>fact that there are a high number of records referencing
>>>previously-deleted docs.
>>>
>>>My servlet code is below, runs on Tomcat, please could you help me
>>>identify what I might be doing wrong?
>>>
>>>@WebServlet("/GCServlet")
>>>public class GCServlet extends HttpServlet {
>>>
>>>    private static org.apache.log4j.Logger logger =
>>>        org.apache.log4j.Logger.getLogger(GCServlet.class);
>>>    private static final long serialVersionUID = 1L;
>>>    private List repositoryManagers;
>>>    /**
>>>     * @see HttpServlet#HttpServlet()
>>>     */
>>>    public GCServlet() {
>>>        super();
>>>        // TODO Auto-generated constructor stub
>>>    }
>>>
>>>    /**
>>>     * Runs the garbage collector for the given RepositoryManagers. If
>>>     * multiple repositories use the same data store, give all
>>>     * RepositoryManagers in the parameter list.
>>>     *
>>>     * @param rms
>>>     * @throws RepositoryException
>>>     */
>>>    @SuppressWarnings(value="DM_GC")
>>>    private int runDataStoreGarbageCollector(OutputStream out)
>>>            throws RepositoryException {
>>>        int result = 0;
>>>        JackrabbitRepositoryFactory rf = new RepositoryFactoryImpl();
>>>        Properties prop = new Properties();
>>>        ServletContext servletContext = getServletContext();
>>>        String path = servletContext.getRealPath(".");
>>>        System.out.println("Real Path =>"+path);
>>>        prop.setProperty("org.apache.jackrabbit.repository.home",
>>>            "/opt/jackrabbit");
>>>        prop.setProperty("org.apache.jackrabbit.repository.conf",
>>>            "/opt/jackrabbit/repository.xml");
>>>        JackrabbitRepository rep = (JackrabbitRepository)
>>>            rf.getRepository(prop);
>>>        // need to login to start the repository
>>>        Session session = rep.login(new SimpleCredentials("admin",
>>>            "admin".toCharArray()));
>>>        RepositoryManager rm = rf.getRepositoryManager(rep);
>>>        DataStoreGarbageCollector gc =
>>>            rm.createDataStoreGarbageCollector();
>>>        try {
>>>            gc.mark();
>>>            gc.sweep();
>>>
>>>        } finally {
>>>            gc.close();
>>>        }
>>>
>>>        session.logout();
>>>        rm.stop();
>>>        return result;
>>>    }
>>>
>>>    /**
>>>     * @see HttpServlet#doGet(HttpServletRequest request,
>>>     *      HttpServletResponse response)
>>>     */
>>>    protected void doGet(HttpServletRequest request, HttpServletResponse
>>>            response) throws ServletException, IOException {
>>>        String gcEnabled = request.getParameter("gcExecute");
>>>        System.out.println("gcEnabled=="+gcEnabled+" length = "
>>>            +gcEnabled.length());
>>>        response.setContentType("text/html");
>>>        OutputStream outputStream = response.getOutputStream();
>>>        outputStream.write("Starting Garbage Collection ".getBytes());
>>>        if (gcEnabled.equals("1")) {
>>>            try {
>>>                int result = runDataStoreGarbageCollector(outputStream);
>>>                outputStream.write(result);
>>>
>>>            } catch (RepositoryException e) {
>>>                // TODO Auto-generated catch block
>>>                e.printStackTrace();
>>>                logger.error(e);
>>>            }finally {
>>>                outputStream.close();
>>>            }
>>>        }
>>>        outputStream.write("".getBytes());
>>>        outputStream.flush();
>>>    }
>>>
>>>
>>>I have also attached the relevant section of the Tomcat log file.
>>>
>>>
>>>Vikram
>>>
>>>
>>>________________________________
>>>This email may contain proprietary, privileged and confidential
>>>information and is sent for the intended recipient(s) only. If, by an
>>>addressing or transmission error, this mail has been misdirected to you,
>>>you are requested to notify us immediately by return email message and
>>>delete this email and its attachments. You are also hereby notified that
>>>any use, any form of reproduction, dissemination, copying, disclosure,
>>>modification, distribution and/or publication of this email message,
>>>contents or its attachment(s) other than by its intended recipient(s) is
>>>strictly prohibited. Any opinions expressed in this email are those of
>>>the individual and may not necessarily represent those of LoudCloud
>>>Systems. Before opening attachment(s), please scan for viruses. It is
>>>further notified that email transmission cannot be guaranteed to be
>>>secure or error-free as information could be intercepted, corrupted,
>>>lost, destroyed, arrive late or incomplete, or may contain viruses. The
>>>sender therefore does not accept liability for any error or omission in
>>>the contents of this message, which arise as a result of email
>>>transmission. LoudCloud Systems Inc. and its subsidiaries do not accept
>>>liability for damage caused by this email or any attachments and may
>>>monitor email traffic.
>>>________________________________
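
The suggestion in the thread that old binaries may still be referenced can be illustrated with a toy mark-and-sweep model. This is only a sketch in plain Java, not Jackrabbit's implementation: the class name MarkSweepSketch, the in-memory map standing in for the datastore table, and the numeric timestamps are all assumptions made for illustration. It also assumes that the sweep removes only records that were not touched since the scan started.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class MarkSweepSketch {
    // Hypothetical stand-in for the datastore table: record id -> last-modified time.
    static final Map<String, Long> store = new HashMap<>();

    // Mark phase: touch every record that is still referenced anywhere,
    // including references held by old versions.
    static void mark(Collection<String> referenced, long now) {
        for (String id : referenced) {
            store.computeIfPresent(id, (key, oldTime) -> now);
        }
    }

    // Sweep phase: delete records not touched since the scan started.
    // Returns the number of deleted records.
    static int sweep(long scanStart) {
        int before = store.size();
        store.values().removeIf(modified -> modified < scanStart);
        return before - store.size();
    }

    public static void main(String[] args) {
        store.put("a", 1L); // referenced by live content
        store.put("b", 1L); // referenced only by an old version
        store.put("c", 1L); // not referenced at all
        long scanStart = 10L;
        mark(List.of("a", "b"), scanStart);
        int deleted = sweep(scanStart);
        System.out.println("deleted=" + deleted
            + " remaining=" + new TreeSet<>(store.keySet()));
    }
}
```

In this model only "c" is removed. Record "b" survives because something still points at it, which is the situation the versioning question in the thread is trying to rule out.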