oodt-dev mailing list archives

From Lewis John Mcgibbney <lewis.mcgibb...@gmail.com>
Subject Re: Setting up filemanager with SOLR 5.5
Date Fri, 25 Mar 2016 01:06:53 GMT
Hi Kos,
The schema field definitions in the file manager schema sound like they need
a bit of an overhaul.
Are you able to file an issue against master and submit a PR?
Thanks
Lewis

On Thursday, March 24, 2016, Konstantinos Mavrommatis <
kmavrommatis@celgene.com> wrote:

> Hi,
>
> I seem to have solved the issue, although I have not tested the setup
> extensively.
>
> 1. the actual url to solr needs to contain the core name. So the correct
> url that needs to be in the filemgr.properties file is "
> http://localhost:8983/solr/oodt/" where the last part 'oodt' corresponds
> to the name of the core.
> 2. The schema.xml provided with oodt is not compatible with the classes of
> this version of SOLR. Instead I set up the solr core using the
> data_driven_schema_configs method, and copied the definitions of the CAS.*
> fields into the file managed-schema.
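
A minimal sketch of point 1 in Python, assuming (based only on the "Posting message ... to URL" log lines quoted further down) that the client appends "update" to the configured URL. solr_update_url is a hypothetical helper for illustration, not part of OODT.

```python
def solr_update_url(base_url: str, commit: bool = False) -> str:
    """Append the update handler path to the configured Solr URL.

    Hypothetical helper: it mirrors the URLs seen in the file manager
    log, not OODT's actual implementation.
    """
    url = base_url.rstrip("/") + "/update"
    if commit:
        url += "?commit=true"
    return url


# Without the core name, the request goes to /solr/update, which is the
# URL that returned 404 Not Found in this thread.
print(solr_update_url("http://localhost:8983/solr"))

# With the core name 'oodt', the request targets that core's update handler.
print(solr_update_url("http://localhost:8983/solr/oodt/"))
```

The only difference between the two calls is the core name in the base URL, which is what the filemgr.properties change supplies.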
>
> As a result I could now ingest files in SOLR but only partially. The first
> action of adding information about a document (i.e. CAS.* fields) was
> successful.
> But the second action of updating the record with additional information
> failed with errors of the type: HTTP method failed: HTTP/1.1 400 Bad Request
>
> In the solr.log file the error message indicated that it could not index the
> field 'FileLocation'. Note that even simple curl commands that tried to
> update documents by adding a single field at a time were failing with
> the same type of error. Somehow SOLR seemed incapable of adding new fields
> that were not explicitly defined in the managed-schema file.
>
> =================================================================================
> 2016-03-24 23:57:53.990 ERROR (qtp364656927-14) [   x:oodt]
> o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: undefined
> field: "FileLocation"
>         at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:1209)
>         at org.apache.solr.update.processor.AtomicUpdateDocumentMerger.doAdd(AtomicUpdateDocumentMerger.java:133)
>         at org.apache.solr.update.processor.AtomicUpdateDocumentMerger.merge(AtomicUpdateDocumentMerger.java:89)
>         at org.apache.solr.update.processor.DistributedUpdateProcessor.getUpdatedDocument(DistributedUpdateProcessor.java:1121)
>         at org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1018)
>         at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:709)
>         at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
>         at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
>         at org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:93)
>         at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:250)
>         at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:177)
>         at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:94)
>         at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)
>         at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
>         at org.apache.solr.core.SolrCore.execute(SolrCore.java:2082)
>         at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:670)
>         at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:458)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:225)
>         at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:183)
>         at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>         at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>         at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>         at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>         at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>         at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>         at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
>         at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
>         at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
>         at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
>         at org.eclipse.jetty.server.Server.handle(Server.java:499)
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
>         at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
>         at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>         at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>         at java.lang.Thread.run(Thread.java:745)
>
> ===============================================================================
>
> To resolve it I modified the managed-schema file, replacing the line
> <copyField source="*" dest="_text_"/>
> with the following one:
> <dynamicField name="*" type="text_general" indexed="true" stored="true"/>
> Now the ingestion process works without problems.
> Since I am very new to SOLR, this may not be the correct approach, but for
> the time being it works - if somebody has a better idea please chime in.
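
To illustrate why the catch-all dynamicField changes the outcome, here is a deliberately simplified Python model of schema field lookup. It is not Solr's code: resolve_field is a hypothetical name, the "*_s" and "*_i" patterns are only examples of typical dynamic fields, and fnmatch-style globbing is used as a stand-in for Solr's own pattern matching.

```python
from fnmatch import fnmatchcase


def resolve_field(name, explicit_fields, dynamic_patterns):
    """Return how a field name would be resolved, or None if undefined.

    Simplified model: explicit fields win, then the first matching
    dynamic pattern; anything else is treated as an undefined field.
    """
    if name in explicit_fields:
        return "explicit"
    for pattern in dynamic_patterns:
        if fnmatchcase(name, pattern):
            return "dynamic:" + pattern
    return None


# Explicit CAS.* fields copied into managed-schema (subset, for illustration).
explicit = {"id", "CAS.ProductId", "CAS.ProductName"}

# Without a catch-all pattern, FileLocation matches nothing.
print(resolve_field("FileLocation", explicit, ["*_s", "*_i"]))       # None

# With <dynamicField name="*" .../>, every otherwise unknown field resolves.
print(resolve_field("FileLocation", explicit, ["*_s", "*_i", "*"]))  # dynamic:*
```

A catch-all like this indexes every unknown field as text_general, so defining the expected metadata fields explicitly may be preferable once they are known.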
> Thanks
> K
>
> -----Original Message-----
> From: Tom Barber [mailto:tom.barber@meteorite.bi]
> Sent: Thursday, March 24, 2016 12:37 AM
> To: dev@oodt.apache.org
> Subject: Re: Setting up filemanager with SOLR 5.5
>
> Ooh, I see, 5.5! Sorry, not had my morning coffee yet. Yeah, I tried 5.5 a
> while back out of curiosity rather than for a use case and it certainly
> didn't work OOTB.
>
> Tom
>
> On Thu, Mar 24, 2016 at 7:32 AM, Tom Barber <tom.barber@meteorite.bi>
> wrote:
>
> > I appreciate it's cheating, but when you built OODT with RADiX did you
> > use the Solr maven profile? We fixed it up in 0.12 and it seemed to be
> > working fine, although I wasn't the one who tested it fully.
> >
> > Tom
> >
> > On Thu, Mar 24, 2016 at 7:15 AM, Konstantinos Mavrommatis <
> > kmavrommatis@celgene.com> wrote:
> >
> >> Hi,
> >> I am using oodt v 0.12.
> >> Interestingly, the file etc/logging.properties was not in its place,
> >> although this was a clean installation on a clean, newly started
> >> Ubuntu 14.04 server on AWS (!!). I copied the file from
> >> oodt-src/filemgr/target/classes/etc/logging.properties and set the
> >> logging level.
> >> After running the same command, the end of the log file contains the
> >> following error lines:
> >>
> >> INFO: Posting message:<add><doc><field
> >> name="id">93addb3e-4d02-41ca-bee6-fbb321d6c890</field><field
> >> name="CAS.ProductId">93addb3e-4d02-41ca-bee6-fbb321d6c890</field><field
> >> name="CAS.ProductName">test.txt</field><field
> >> name="CAS.ProductTypeName">GenericFile</field><field
> >> name="CAS.ProductReceivedTime">2016-03-24T07:12:01Z</field><field
> >> name="CAS.ProductTypeId">urn:oodt:GenericFile</field><field
> >> name="CAS.ProductStructure">Flat</field><field
> >> name="CAS.ProductTransferStatus">TRANSFERING</field></doc></add>
> >> to URL: http://localhost:8983/solr/update
> >> Mar 24, 2016 7:12:01 AM
> >> org.apache.oodt.cas.filemgr.catalog.solr.SolrClient index
> >> SEVERE: HTTP method failed: HTTP/1.1 404 Not Found
> >> Mar 24, 2016 7:12:01 AM
> >> org.apache.oodt.cas.filemgr.system.XmlRpcFileManager catalogProduct
> >> SEVERE: ingestProduct: CatalogException when adding Product: test.txt
> >> to Catalog: Message: HTTP method failed: HTTP/1.1 404 Not Found
> >> Mar 24, 2016 7:12:01 AM
> >> org.apache.oodt.cas.filemgr.system.XmlRpcFileManager ingestProductCore
> >> SEVERE: HTTP method failed: HTTP/1.1 404 Not Found
> >> Mar 24, 2016 7:12:01 AM
> >> org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient ingestProduct
> >> SEVERE: Failed to ingest product [ name:test.txt] :java.lang.Exception:
> >> org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException:
> >> Error ingesting product
> >> [org.apache.oodt.cas.filemgr.structs.Product@60a79fbb]
> >> : HTTP method failed: HTTP/1.1 404 Not Found -- rolling back ingest
> >> Mar 24, 2016 7:12:01 AM
> >> org.apache.oodt.cas.filemgr.catalog.solr.SolrClient delete
> >> INFO: Posting message:<delete><query>id:null</query></delete>
> >> to URL: http://localhost:8983/solr/update?commit=true
> >> Mar 24, 2016 7:12:01 AM
> >> org.apache.oodt.cas.filemgr.catalog.solr.SolrClient delete
> >> SEVERE: HTTP method failed: HTTP/1.1 404 Not Found
> >> Mar 24, 2016 7:12:01 AM
> >> org.apache.oodt.cas.filemgr.system.XmlRpcFileManager removeProduct
> >> WARNING: Exception modifying product: [null]: Message: HTTP method
> >> failed: HTTP/1.1 404 Not Found
> >> org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException: HTTP
> >> method failed: HTTP/1.1 404 Not Found
> >>         at org.apache.oodt.cas.filemgr.catalog.solr.SolrClient.delete(SolrClient.java:130)
> >>         at org.apache.oodt.cas.filemgr.catalog.solr.SolrCatalog.removeProduct(SolrCatalog.java:165)
> >>         at org.apache.oodt.cas.filemgr.system.XmlRpcFileManager.removeProduct(XmlRpcFileManager.java:1113)
> >>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>         at java.lang.reflect.Method.invoke(Method.java:606)
> >>         at org.apache.xmlrpc.Invoker.execute(Invoker.java:130)
> >>         at org.apache.xmlrpc.XmlRpcWorker.invokeHandler(XmlRpcWorker.java:84)
> >>         at org.apache.xmlrpc.XmlRpcWorker.execute(XmlRpcWorker.java:146)
> >>         at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:139)
> >>         at org.apache.xmlrpc.XmlRpcServer.execute(XmlRpcServer.java:125)
> >>         at org.apache.xmlrpc.WebServer$Connection.run(WebServer.java:761)
> >>         at org.apache.xmlrpc.WebServer$Runner.run(WebServer.java:642)
> >>         at java.lang.Thread.run(Thread.java:745)
> >>
> >> Mar 24, 2016 7:12:01 AM
> >> org.apache.oodt.cas.filemgr.system.XmlRpcFileManagerClient
> >> ingestProduct
> >> SEVERE: Failed to rollback ingest of product
> >> [org.apache.oodt.cas.filemgr.structs.Product@2782f72b] :
> >> java.lang.Exception:
> >> org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException:
> >> Error ingesting product
> >> [org.apache.oodt.cas.filemgr.structs.Product@60a79fbb]
> >> : HTTP method failed: HTTP/1.1 404 Not Found
> >> Mar 24, 2016 7:12:01 AM
> >> org.apache.oodt.cas.filemgr.cli.action.IngestProductCliAction execute
> >> SEVERE: java.lang.Exception:
> >> org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException:
> >> Error ingesting product
> >> [org.apache.oodt.cas.filemgr.structs.Product@60a79fbb]
> >> : HTTP method failed: HTTP/1.1 404 Not Found
> >> ERROR: Failed to ingest product 'test.txt' : java.lang.Exception:
> >> org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException:
> >> Error ingesting product
> >> [org.apache.oodt.cas.filemgr.structs.Product@60a79fbb]
> >> : HTTP method failed: HTTP/1.1 404 Not Found
> >>
> >> -----Original Message-----
> >> From: Lewis John Mcgibbney [mailto:lewis.mcgibbney@gmail.com]
> >> Sent: Wednesday, March 23, 2016 7:25 PM
> >> To: dev@oodt.apache.org
> >> Subject: Re: Setting up filemanager with SOLR 5.5
> >>
> >> Hi K,
> >> Which version of OODT are you using?
> >> Can you set logging to debug, restart your filemanager instance and
> >> tail the log for any more clues? If you get any more clues then post
> >> them here.
> >> Also, if you are running off of master branch then this is an
> >> excellent opportunity for us to improve the error message, printing
> >> the product ID as opposed to the opaque object... the latter is
> >> relatively useless.
> >>
> >>
> >>
> >> On Wed, Mar 23, 2016 at 6:48 PM, Konstantinos Mavrommatis <
> >> kmavrommatis@celgene.com> wrote:
> >>
> >> > Hi,
> >> > I have set up oodt using RADiX.
> >> >
> >> > When I use the default Lucene catalog factory I manage to ingest a
> >> > file with no problem:
> >> > # ./filemgr-client --url http://localhost:9000 --operation
> >> > --ingestProduct --productName test.txt --productStructure Flat
> >> > --productTypeName GenericFile --metadataFile
> >> > file:///tmp/test.txt.met --refs file:///tmp/test.txt
> >> > ingestProduct: Result: afda62a1-f15f-11e5-a7d4-7d8cde2ab6dd
> >> >
> >> >
> >> > Then I installed solr 5.5/jetty and created a new core named oodt:
> >> > #solr create_core -c oodt
> >> > I ran the example command in the solr documentation and verified that
> >> > this instance is able to index documents. I can also access the solr
> >> > dashboard at the following URL:
> >> > http://localhost:8983/solr
> >> > I also modified the filemgr.properties file with the following lines:
> >> >
> >> > filemgr.catalog.factory=org.apache.oodt.cas.filemgr.catalog.solr.SolrCatalogFactory
> >> > org.apache.oodt.cas.filemgr.catalog.solr.url=http://localhost:8983/solr
> >> >
> >> > I have tried to run this command with and without the schema.xml in
> >> > the core directory. In both cases, when I try to ingest a file I now
> >> > get an error:
> >> > # ./filemgr-client --url http://localhost:9000 --operation
> >> > --ingestProduct --productName test.txt --productStructure Flat
> >> > --productTypeName GenericFile --metadataFile
> >> > file:///tmp/test.txt.met --refs file:///tmp/test.txt
> >> > ERROR: Failed to ingest product 'test.txt' : java.lang.Exception:
> >> > org.apache.oodt.cas.filemgr.structs.exceptions.CatalogException:
> >> > Error ingesting product
> >> > [org.apache.oodt.cas.filemgr.structs.Product@5eebbe] :
> >> > HTTP method failed: HTTP/1.1 404 Not Found
> >> >
> >> > Any ideas what is going on?
> >> >
> >> > Thanks
> >> > K
> >> >
> >> > *********************************************************
> >> > THIS ELECTRONIC MAIL MESSAGE AND ANY ATTACHMENT IS CONFIDENTIAL AND
> >> > MAY CONTAIN LEGALLY PRIVILEGED INFORMATION INTENDED ONLY FOR THE
> >> > USE OF THE INDIVIDUAL OR INDIVIDUALS NAMED ABOVE.
> >> > If the reader is not the intended recipient, or the employee or
> >> > agent responsible to deliver it to the intended recipient, you are
> >> > hereby notified that any dissemination, distribution or copying of
> >> > this communication is strictly prohibited. If you have received
> >> > this communication in error, please reply to the sender to notify
> >> > us of the error and delete the original message. Thank You.
> >> >
> >>
> >>
> >>
> >> --
> >> *Lewis*
> >>
> >
> >
>


-- 
*Lewis*
