lucene-solr-user mailing list archives

From Jason Gerlowski <gerlowsk...@gmail.com>
Subject Re: Making Solr Indexing Errors Visible
Date Sun, 30 Sep 2018 18:54:55 GMT
Hi

Also worth mentioning that bin/post only handles certain file
extensions, and AFAIR it doesn't say so explicitly when it skips
over a file because of its extension. You mentioned you're trying to
index Word docs and PDFs. Are there any other formats in the
directory that might be messing up your counts?
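
If it helps, a quick way to see which formats are actually in the
directory is to tally the files by extension. This is only a rough
sketch: count_extensions is a name I made up, and it assumes you want
subdirectories included in the count.

```python
import sys
from collections import Counter
from pathlib import Path


def count_extensions(directory):
    """Return a Counter mapping lowercased extension -> number of files."""
    counts = Counter()
    # rglob("*") also descends into subdirectories (an assumption about
    # how the directory was posted; use glob("*") for the top level only).
    for path in Path(directory).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Print the most common extensions first.
    for ext, n in count_extensions(sys.argv[1]).most_common():
        print(f"{n:6d}  {ext}")
```

Any extension in that tally that you didn't expect is a candidate for
a file that was skipped.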

I also second Shawn's suggestion that you post the "bin/post" output
and a directory listing.  Additionally, if you're able to clean up the
output a bit, you might be able to diff the two lists of files and see
whether the missing ones have anything in particular in common.
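
A rough sketch of that diff, in case it's useful. It relies on the
"POSTing file" lines Shawn mentions; the rest of the line format in
the regex (an optional "(type)" followed by " to [base]") is my
assumption from memory, so check it against your real output first.
The function names are made up for illustration.

```python
import re
from pathlib import Path

# ASSUMPTION: each posted file produces a line shaped like
#   POSTing file <name> (<content type>) to [base]...
# Only the "POSTing file" prefix is confirmed in this thread.
POSTED = re.compile(r"^POSTing file (.+?)(?: \([^)]*\))? to \[base\]")


def posted_names(log_text):
    """Collect the file names that appear in 'POSTing file' log lines."""
    names = set()
    for line in log_text.splitlines():
        match = POSTED.match(line.strip())
        if match:
            names.add(match.group(1))
    return names


def missing_files(directory, log_text):
    """Return sorted names of files on disk that have no 'POSTing file' line."""
    on_disk = {p.name for p in Path(directory).rglob("*") if p.is_file()}
    return sorted(on_disk - posted_names(log_text))
```

Save the bin/post output to a file, read it in, and pass the text
along with the directory to missing_files. Names are compared without
their directory part, so two files with the same name in different
subdirectories will look like one.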

Good luck,

Jason
On Thu, Sep 27, 2018 at 9:58 AM Shawn Heisey <apache@elyograg.org> wrote:
>
> On 9/26/2018 2:39 PM, Terry Steichen wrote:
> > Let me try to clarify a bit - I'm just using bin/post to index the files
> > in a directory.  That indexing process produces a lengthy screen display
> > of files that were indexed.  (I realize this isn't production-quality,
> > but I'm not ready for production just yet, so that should be OK.)
>
> I see a previous message on the list from you indicating Solr 6.6.0.
> FYI there are five bugfix releases after 6.6.0 -- the latest 6.x release
> is 6.6.5.  I don't see any fixes related to the post tool, but maybe one
> of the problems that did get fixed might help your server behave better.
>
> Switching my source checkout to the 6.6.0 tag and checking that version...
>
> Each time a file is sent, you should get a log line starting with
> "POSTing file".
>
> The error detection in SimplePostTool has a bunch of parts.  It seems
> that *most* errors will abort the tool entirely, skipping any files that
> have not yet been processed, and logging a message with "FATAL" included.
>
> Can you show us a directory listing and all the output that you get from
> bin/post when processing that directory?
>
> Thanks,
> Shawn
>
