manifoldcf-user mailing list archives

From c.a....@gmx.de
Subject Re: RE: Beginner's question
Date Mon, 26 Jul 2010 05:45:17 GMT
Hi,

Thanks a lot for fixing it. :)
When I start the job, I get a NullPointerException (NPE) in the LCF log files:
-----------------------------------
[Startup thread] FATAL org.apache.lcf.crawlerthreads - Error tossed: null
java.lang.NullPointerException
    at java.io.StringReader.<init>(StringReader.java:33)
    at org.apache.lcf.crawler.connectors.webcrawler.WebcrawlerConnector.stringToArray(WebcrawlerConnector.java:6681)
    at org.apache.lcf.crawler.connectors.webcrawler.WebcrawlerConnector$DocumentURLFilter.<init>(WebcrawlerConnector.java:7158)
    at org.apache.lcf.crawler.connectors.webcrawler.WebcrawlerConnector.addSeedDocuments(WebcrawlerConnector.java:460)
    at org.apache.lcf.crawler.connectors.BaseRepositoryConnector.addSeedDocuments(BaseRepositoryConnector.java:243)
    at org.apache.lcf.crawler.system.StartupThread.run(StartupThread.java:184)
-----------------------------------
The seed I entered was something like "www.apache.org" or
"http://www.apache.org".
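
For reference, the trace shows `stringToArray` handing a null value to `java.io.StringReader`, whose constructor throws a NullPointerException on null input. Below is a minimal sketch of a null-safe variant. It assumes that the method splits a multi-line field into trimmed, non-empty lines; the real `WebcrawlerConnector` source is not shown in this thread, so the class name `NullSafeLines` and the method `stringToLines` are hypothetical stand-ins, not the actual fix.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

final class NullSafeLines {
    // Hypothetical stand-in for stringToArray: treats a missing (null)
    // field as "no entries" instead of passing null to StringReader.
    static List<String> stringToLines(String value) {
        List<String> lines = new ArrayList<>();
        if (value == null) {
            return lines; // nothing configured, so nothing to parse
        }
        try (BufferedReader reader = new BufferedReader(new StringReader(value))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    lines.add(trimmed); // keep only non-blank lines
                }
            }
        } catch (IOException e) {
            // Not expected for an in-memory reader; surface it if it happens.
            throw new IllegalStateException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(stringToLines(null));                        // prints []
        System.out.println(stringToLines("http://www.apache.org\n\n")); // prints [http://www.apache.org]
    }
}
```

If this guess is right, the null most likely comes from a job field (for example an include or exclude list) that was never saved, which would fit the empty-screen problem on the job tabs.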

There are also some minor problems: after creating a web crawler job, I
still get an empty screen when I click "View" in the "List all Jobs" tab,
or when I click "Save" in the "Edit" dialog.

Carina

> -------- Original Message --------
> Date: Fri, 23 Jul 2010 14:52:23 +0200
> From: karl.wright@nokia.com
> To: connectors-user@incubator.apache.org
> Subject: RE: Beginner's question
> 
> Done.  r967081.
> 
> Karl
> 
> From: Wright Karl (Nokia-MS/Cambridge)
> Sent: Friday, July 23, 2010 8:39 AM
> To: connectors-user@incubator.apache.org
> Subject: RE: Beginner's question
> 
> It appears that work done for the API inadvertently broke the web connector
> UI.  I'll check in a fix shortly.
> 
> Karl
> 
> From: Wright Karl (Nokia-MS/Cambridge)
> Sent: Friday, July 23, 2010 8:32 AM
> To: connectors-user@incubator.apache.org
> Subject: RE: Beginner's question
> 
> Your configuration looks reasonable.  Do you see any stack traces in either
> the LCF log, or the tomcat log?
> 
> I'll try the same thing here and see what happens.
> 
> Karl
> 
> From: ext c.a.r.e@gmx.de [mailto:c.a.r.e@gmx.de]
> Sent: Friday, July 23, 2010 8:27 AM
> To: connectors-user@incubator.apache.org
> Subject: Re: Beginner's question
> 
> Hi,
> 
> I'm still having the problem I explained below:
> 
> When I create a new job choosing a web connector I receive an empty screen
> when clicking on one of the other tabs (Scheduling etc.).
> 
> When selecting a Filesys Connector everything works fine.
> 
> I think I might have an error in my web connector configuration.
> 
> Name: Web Con
> Description:
> Connection type: Web Connector
> Max connections: 10
> Authority: None (global authority)
> 
> Throttling (Bin regular expression | Description | Max avg fetches/min):
>   No throttles
> 
> Email address: mail@example.org
> Robots usage: Obey robots.txt for all fetches
> 
> Bandwidth throttling (Bin regular expression | Case insensitive? | Max connections | Max kbytes/sec | Max fetches/min):
>   No bandwidth throttling
> 
> Page access credentials (URL regular expression | Credential type | Credential domain | User name):
>   No page access credentials
> 
> Session-based access credentials (URL regular expression | Login pages):
>   No session-based access credentials
> 
> Trust certificates (URL regular expression | Certificate):
>   No trust certificates
> 
> Connection status: Connection working
> 
> Any ideas?
>  Carina        
> 
>                           
> >  -------- Original Message --------
> >  Date: Wed, 21 Jul 2010 16:04:10 +0200
> >  From: Marc Emery <marco.emery@gmail.com>
> >  To: connectors-user@incubator.apache.org
> >  Subject: Re: Beginner's question
> > 
> >  Hi,
> >  It works, thanks a lot.
> > 
> >  Cheers
> > 
> >  2010/7/21 <karl.wright@nokia.com>
> > 
> >  Code has just been checked in which fixes this subtle but nasty bug.
> > 
> >  Let me know what happens now. ;-)
> >  Karl
> > 
> >                           
> > 
> >  -----Original Message-----
> >  From: Wright Karl (Nokia-MS/Cambridge)
> >  Sent: Wednesday, July 21, 2010 8:50 AM
> >  To: connectors-user@incubator.apache.org
> >  Subject: RE: Beginner's question
> > 
> >  Well, that explains why your test isn't succeeding.
> > 
> >  I think I've found the cause of the problem, however.  It is *indeed* 
> > the language default used by Derby.  The following code is the 
> > problem:
> > 
> >  >>>>>>
> >   protected LCFException reinterpretException(LCFException theException)
> >   {
> >     if (Logging.db.isDebugEnabled())
> >       Logging.db.debug("Reinterpreting exception '"+theException.getMessage()+"'.  The exception type is "+Integer.toString(theException.getErrorCode()));
> >     if (theException.getErrorCode() != LCFException.DATABASE_CONNECTION_ERROR)
> >       return theException;
> >     Throwable e = theException.getCause();
> >     if (!(e instanceof java.sql.SQLException))
> >       return theException;
> >     if (Logging.db.isDebugEnabled())
> >       Logging.db.debug("Exception "+theException.getMessage()+" is possibly a transaction abort signal");
> >     String message = e.getMessage();
> >     if (message.indexOf("due to a deadlock") != -1)
> >       return new LCFException(message,e,LCFException.DATABASE_TRANSACTION_ABORT);
> >     // Note well: We also have to treat 'duplicate key' as a transaction abort, since this is what you get when two threads attempt to
> >     // insert the same row.  (Everything only works, then, as long as there is a unique constraint corresponding to every bad insert that
> >     // one could make.)
> >     if (message.indexOf("duplicate key") != -1)
> >       return new LCFException(message,e,LCFException.DATABASE_TRANSACTION_ABORT);
> >     if (Logging.db.isDebugEnabled())
> >       Logging.db.debug("Exception "+theException.getMessage()+" is NOT a transaction abort signal");
> >     return theException;
> >   }
> >  <<<<<<
> > 
> >  It looks like Derby throws a specific exception class for these kinds
> > of errors, so I will be able to test for them directly rather than look
> > at the message text.  Stay tuned.
> > 
> >  Karl
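
The idea described above, testing the exception itself instead of its localized message, can be sketched roughly as follows. This is not the change that was checked in. It assumes that Derby reports SQLState `23505` for a duplicate key in a unique constraint or index and `40001` for a deadlock; both values should be verified against the Derby error reference. The class name `AbortClassifier` and the method `isTransactionAbort` are hypothetical. An `instanceof java.sql.SQLIntegrityConstraintViolationException` test (the class visible in the trace) would also be locale-independent, but it would match other constraint violations as well.

```java
import java.sql.SQLException;

final class AbortClassifier {
    // Assumed Derby SQLState values (verify against the Derby documentation).
    private static final String DUPLICATE_KEY = "23505";
    private static final String DEADLOCK = "40001";

    // Hypothetical helper: decides from the SQLState, which does not depend
    // on the JVM locale, whether the error should be treated as a
    // transaction abort and retried.
    static boolean isTransactionAbort(SQLException e) {
        String state = e.getSQLState();
        return DUPLICATE_KEY.equals(state) || DEADLOCK.equals(state);
    }

    public static void main(String[] args) {
        // French message text, as in the report; only the SQLState matters.
        SQLException duplicate = new SQLException("L'instruction a été abandonnée", "23505");
        SQLException deadlock = new SQLException("deadlock", "40001");
        SQLException other = new SQLException("syntax error", "42X01");
        System.out.println(isTransactionAbort(duplicate)); // prints true
        System.out.println(isTransactionAbort(deadlock));  // prints true
        System.out.println(isTransactionAbort(other));     // prints false
    }
}
```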
> > 
> > 
> > 
> > 
> >  -----Original Message-----
> >  From: ext c.a.r.e@gmx.de [mailto:c.a.r.e@gmx.de]
> >  Sent: Wednesday, July 21, 2010 8:25 AM
> >  To: connectors-user@incubator.apache.org
> >  Subject: Re: Beginner's question
> > 
> >  Hi,
> > 
> >  I'm getting the same exception as Marc except that on my machine it's 
> > German text ;o)
> >  I tried it first with jdk 1.6_13, then updated to 1.6_21 based on a 
> > new SVN Update. But I haven't been successful yet.
> > 
> >  Carina
> > 
> > 
> >  -------- Original Message --------
> >  > Date: Wed, 21 Jul 2010 12:13:22 +0200
> >  > From: karl.wright@nokia.com
> >  > To: connectors-user@incubator.apache.org
> >  > Subject: Re: Beginner's question
> > 
> >  > I'm definitely not seeing this behavior here, with sun jdk 1.6.  It's
> >  > worth getting to the bottom of.
> >  >
> >  > Can you do the following:
> >  >
> >  > (1)     Svn co a completely fresh version of LCF
> >  > (2)     Ant, making sure ant is actually using jdk 1.6
> >  >
> >  > If you *still* get this problem, please let me know.  It's not clear what
> >  > the difference is, but there's got to be a difference somewhere.  I hope it
> >  > is not how Derby works on French machines. ;-)
> >  >
> >  > Karl
> >  >
> >  >
> >  > >>>>>>
> >  > Worker thread aborting and restarting due to database connection reset:
> >  > Database exception: Exception doing query: L'instruction a été abandonnée parce qu'elle aurait entraîné la duplication d'une valeur de clé dans une contrainte de clé ou d'index unique identifié par 'I1279701064805' définie sur 'INGESTSTATUS'.
> >  > org.apache.lcf.core.interfaces.LCFException: Database exception: Exception doing query: L'instruction a été abandonnée parce qu'elle aurait entraîné la duplication d'une valeur de clé dans une contrainte de clé ou d'index unique identifié par 'I1279701064805' définie sur 'INGESTSTATUS'.
> >  >     at org.apache.lcf.core.database.Database.executeViaThread(Database.java:421)
> >  >     at org.apache.lcf.core.database.Database.executeUncachedQuery(Database.java:449)
> >  >     at org.apache.lcf.core.database.Database$QueryCacheExecutor.create(Database.java:1072)
> >  >     at org.apache.lcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:144)
> >  >     at org.apache.lcf.core.database.Database.executeQuery(Database.java:167)
> >  >     at org.apache.lcf.core.database.DBInterfaceDerby.performModification(DBInterfaceDerby.java:615)
> >  >     at org.apache.lcf.core.database.DBInterfaceDerby.performInsert(DBInterfaceDerby.java:177)
> >  >     at org.apache.lcf.core.database.BaseTable.performInsert(BaseTable.java:76)
> >  >     at org.apache.lcf.agents.incrementalingest.IncrementalIngester.noteDocumentIngest(IncrementalIngester.java:1267)
> >  >     at org.apache.lcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:410)
> >  >     at org.apache.lcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:304)
> >  >     at org.apache.lcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1586)
> >  >     at org.apache.lcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
> >  >     at org.apache.lcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:516)
> >  >     at org.apache.lcf.crawler.system.WorkerThread.run(WorkerThread.java:585)
> >  > Caused by: java.sql.SQLIntegrityConstraintViolationException: L'instruction a été abandonnée parce qu'elle aurait entraîné la duplication d'une valeur de clé dans une contrainte de clé ou d'index unique identifié par 'I1279701064805' définie sur 'INGESTSTATUS'.
> >  >     at org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown Source)
> >  >     at org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
> >  >     at org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown Source)
> >  >     at org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown Source)
> >  >     at org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
> >  >     at org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
> >  >     at org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
> >  >     at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown Source)
> >  >     at org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeUpdate(Unknown Source)
> >  >     at org.apache.lcf.core.database.Database.execute(Database.java:566)
> >  >     at org.apache.lcf.core.database.Database$ExecuteQueryThread.run(Database.java:381)
> >  > Caused by: java.sql.SQLException: L'instruction a été abandonnée parce qu'elle aurait entraîné la duplication d'une valeur de clé dans une contrainte de clé ou d'index unique identifié par 'I1279701064805' définie sur 'INGESTSTATUS'.
> >  >     at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
> >  >     at org.apache.derby.impl.jdbc.SQLExceptionFactory40.wrapArgsForTransportAcrossDRDA(Unknown Source)
> >  >     ... 11 more
> >  >
> >  > However, I can start Jetty and get the UI working.
> >  >
> >  > Thanks
> >  > marc
> >  > <<<<<<
> >  >
> >  >