From: Julián
Reply-To: users@jackrabbit.apache.org
In-Reply-To: <53EF9E4D.4030704@artifact-software.com>
Subject: Re: Help for a new user
Date: Sun, 17 Aug 2014 19:27:45 +0200

Thanks again for your response, Ron. It seems you're the only one on the
mailing list. Perhaps people are on holiday.

I'm beginning to realize that I was wrong. Because of your response, I've
been looking for information and I've found Apache Lucene and Apache Tika.
I have to try both, but it seems that they can work together for extracting
and indexing files, and Tika supports lots of formats.

I'm now thinking that I don't actually need Jackrabbit for my application.
Perhaps I only need those tools to search inside the files I want to store.
I think I don't need a repository. I can save the properties of the files
in a database, and the files in normal folders. I think it'd be pretty easy
for me because I'm used to working with databases, but I've never worked
with a repository.
In fact, I was going to use the repository for its search capabilities, but
I'm realizing that I don't need it.

I'm going to try Lucene and Tika first.

Thanks a lot.

--------------------------------------------------
From: "Ron Wheeler"
Sent: Saturday, August 16, 2014 8:09 PM
To:
Subject: Re: Help for a new user

> Some ideas that may be helpful.
> If you want to search inside Jackrabbit using its internal search engine,
> you are going to have to extract the text on the way in.
> I think that this means using the appropriate tool to read the content
> from the incoming document and creating a document linked to the original
> that can be searched by Jackrabbit and then used to find the original PDF
> or DOC or XLS, etc. to present to the user.
>
> This should be possible for most of the common documents since there are
> Apache tools such as POI that let you read DOC and XLS files and extract
> the content.
> http://pdfbox.apache.org
> http://poi.apache.org/
> http://www.swftools.org/
> https://wiki.openoffice.org/wiki/Xml
>
> This can be a reasonably general solution if you add a facility that
> allows users to manually write a document summary or keyword list for
> documents in formats that you do not support or that do not contain text
> that describes their content or usage - CAD drawings, Quickbooks backups,
> database backups, etc.
>
> I hope that this gives you something to think about until a real
> Jackrabbit expert shows up.
>
> Ron
>
> On 16/08/2014 12:49 PM, Julián wrote:
>>
>> Hello.
>> I've been able to use a repository in my JSF application at last. If
>> someone has a similar problem, I can help them.
>>
>> Now, I would like to insert some files (.doc, .pdf, ...), and search for
>> words in them, like Google.
>> I suppose that I'll have to use text extractors, and I'll have to
>> configure the repository to index the files.
>>
>> Does anybody know where I can find some examples?
>> Can anyone tell me where to look?
>>
>> Thanks
>>
>>
>> --------------------------------------------------
>> From: "Ron Wheeler"
>> Sent: Sunday, August 10, 2014 7:09 PM
>> To:
>> Subject: Re: Help for a new user
>>
>>> Did you get the example from
>>> http://jackrabbit.apache.org/first-hops.html working?
>>>
>>> You probably should get Eclipse working with Maven. That will get rid
>>> of some of the headaches.
>>>
>>> If you want a fast way to get up and running with Eclipse and Maven try
>>> Eclipse STS.
>>> It is an Eclipse distribution that comes out of the box with all the
>>> plug-ins that you need to develop Java applications with Maven.
>>> This gets rid of the need to set up software on classpaths manually.
>>>
>>> Once you have the first hop demo working, you should be able to make
>>> your simple web app.
>>> At least you will have specific log messages to talk about.
>>>
>>> Ron
>>>
>>>
>>> On 10/08/2014 7:54 AM, Julián wrote:
>>>> (sorry for my English)
>>>> I'm very new to the Java, Java EE, and web development world, and, of
>>>> course, to the Jackrabbit environment.
>>>> I'm a student and I'm working on my degree project: an "easy" document
>>>> management system.
>>>> I only need users to get their documents and to be able to search for
>>>> groups of words in them (PDF, DOC, XLS ...) like a Google search.
>>>> I've heard about Jackrabbit's benefits, so I've decided to use it. (I
>>>> suppose Jackrabbit can do those tasks?)
>>>>
>>>> I am developing an "easy" JSF application with PrimeFaces, MySQL... and
>>>> now I'm at the phase where I have to manage the documents.
>>>> I've read the JSR 283 specification, and I understand it more or less.
>>>> My problem is how to begin.
>>>>
>>>> I need someone to show me a simple example of creating and accessing a
>>>> repository. The repository only has to work with my application in a
>>>> Tomcat server.
>>>> I've been looking for information on the Internet and I'm absolutely
>>>> lost. Everyone says different things. I haven't been able to find an
>>>> "easy" example of what I need.
>>>> On Jackrabbit's website, I've been reading about deployment models,
>>>> stand-alone server, Jackrabbit Web application, Jackrabbit JCA
>>>> Resource Adapter ...
>>>> Oh my god! Is what I want to do really so difficult? I don't think
>>>> so, perhaps I'm getting older...
>>>>
>>>> I only need:
>>>> 1. When a client accesses the application for the first time, the
>>>> repository will be created in a specified path.
>>>> 2. Clients will upload files, search for content, and download them.
>>>>
>>>> I'm now at the first point. Can anyone help me?
>>>> I use the Eclipse IDE and I don't use Maven.
>>>> What JARs must I include in my classpath?
>>>> What Java instructions do I need to create and set up the repository?
>>>> In the JSR specification, they use the RepositoryFactory class. Is that
>>>> the way to do it?
>>>>
>>>> Thanks a lot, and sorry for my ignorance.
>>>>
>>>>
>>>> ---
>>>> This message contains no viruses or malware because avast! Antivirus
>>>> protection is active.
>>>> http://www.avast.com
>>>>
>>>
>>>
>>> --
>>> Ron Wheeler
>>> President
>>> Artifact Software Inc
>>> email: rwheeler@artifact-software.com
>>> skype: ronaldmwheeler
>>> phone: 866-970-2435, ext 102
>>>
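
For anyone reading this thread later: the plan in the top message (files in normal folders, properties in a database, a separate index for searching inside the files) can be illustrated with a very small sketch. This is only an illustration under stated assumptions, not how Lucene or Tika work internally and not code from anyone in the thread. It uses only java.util, it assumes the text has already been extracted from each file (the step Tika would handle), and the class and method names (TinyIndex, add, search) are hypothetical. In a real application Lucene would take the place of this whole class.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a real search library: a tiny in-memory inverted index.
class TinyIndex {
    // Maps each lower-cased word to the names of the files that contain it.
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Splits text into lower-cased words on anything that is not a letter or digit.
    private static String[] tokens(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+");
    }

    // Indexes text that was already extracted from a file (assumption: extraction happens elsewhere).
    public void add(String fileName, String text) {
        for (String token : tokens(text)) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(fileName);
            }
        }
    }

    // Returns the files that contain every word of the query (an AND search).
    public Set<String> search(String query) {
        Set<String> result = null;
        for (String token : tokens(query)) {
            if (token.isEmpty()) {
                continue;
            }
            Set<String> hits = postings.getOrDefault(token, Collections.<String>emptySet());
            if (result == null) {
                result = new TreeSet<>(hits);
            } else {
                result.retainAll(hits);
            }
        }
        return result == null ? Collections.<String>emptySet() : result;
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add("report.pdf", "Quarterly sales report");
        index.add("notes.doc", "Meeting notes about sales");
        System.out.println(index.search("sales"));
        System.out.println(index.search("sales report"));
    }
}
```

The file names returned by search would then be used to look up the stored properties in the database and the file itself in its folder. A real library adds what this sketch leaves out, such as persistence on disk, ranking, stemming, and phrase queries.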