manifoldcf-dev mailing list archives

From Salih Sen <sa...@dilisim.com>
Subject Metadata fields get lost in 1.7.2 with Sharepoint 2013 repository and Solr output connection
Date Thu, 08 Jan 2015 16:13:48 GMT
Hi,

We've noticed that the metadata of some documents isn't indexed in Solr.

I tried tracking down the issue in the source code and noticed that the
RepositoryDocument has around 25 fields until it reaches the
RepositoryDocumentFactory. The document returned from
factory.createDocument() has only a single field at
IncrementalIngester.java line 3089.



I couldn't follow the logic behind if (iter.hasNext()) in the code below:
although the document has twenty-something fields, it "iterates" over only
the first one. Is this the expected behaviour?

Similar code also exists in the createDocument() method, so I feel I might
be looking in the wrong place, but as far as I can see this part creates the
difference between the document that comes from the SharePoint repository
and the one posted to Solr.

Thanks.


RepositoryDocumentFactory.java
---------------------------------------------

public RepositoryDocumentFactory(RepositoryDocument document)
  throws ManifoldCFException, IOException
{
  this.original = document;

  try
  {
    this.binaryTracker = new TempFileInput(document.getBinaryStream());
    // Copy all reader streams
    Iterator<String> iter = document.getFields();
    if (iter.hasNext())
    {
      String fieldName = iter.next();
      Object[] objects = document.getField(fieldName);
      if (objects instanceof Reader[])
      {
        CharacterInput[] newValues = new CharacterInput[objects.length];
        metadataReaders.put(fieldName,newValues);
        // Populate newValues
        for (int i = 0; i < newValues.length; i++)
        {
          newValues[i] = new TempFileCharacterInput((Reader)objects[i]);
        }
      }
    }
  }
  catch (Throwable e)
  {
    // Clean up everything we've done so far.
    if (this.binaryTracker != null)
      this.binaryTracker.discard();
    for (String key : metadataReaders.keySet())
    {
      CharacterInput[] rt = metadataReaders.get(key);
      for (CharacterInput r : rt)
      {
        if (r != null)
          r.discard();
      }
    }
    if (e instanceof IOException)
      throw (IOException)e;
    else if (e instanceof RuntimeException)
      throw (RuntimeException)e;
    else if (e instanceof Error)
      throw (Error)e;
    else
      throw new RuntimeException("Unknown exception type: "+e.getClass().getName()+": "+e.getMessage(),e);
  }
}
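
To make the question about if (iter.hasNext()) concrete, here is a minimal
standalone sketch of the difference between testing an Iterator with "if" and
with "while". It is not ManifoldCF code: the class name IterDemo, the method
names, and the field names are made up for illustration. It only shows the
iteration pattern and does not establish that this is the cause of the
missing metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class; none of these names come from ManifoldCF.
public class IterDemo {
  // 'if' tests hasNext() once, so at most the first element is visited.
  public static List<String> visitWithIf(Iterator<String> iter) {
    List<String> seen = new ArrayList<>();
    if (iter.hasNext()) {
      seen.add(iter.next());
    }
    return seen;
  }

  // 'while' keeps testing hasNext(), so every element is visited.
  public static List<String> visitWithWhile(Iterator<String> iter) {
    List<String> seen = new ArrayList<>();
    while (iter.hasNext()) {
      seen.add(iter.next());
    }
    return seen;
  }

  public static void main(String[] args) {
    // Made-up field names standing in for the ~25 metadata fields.
    List<String> fields = Arrays.asList("title", "author", "modified");
    System.out.println(visitWithIf(fields.iterator()));    // prints [title]
    System.out.println(visitWithWhile(fields.iterator())); // prints [title, author, modified]
  }
}
```

If the constructor above is meant to copy every Reader-valued field, the
"while" form is the one that would reach all of them.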



--

Salih Şen

Dilişim Bilgi Bilgisayar ve İletişim Teknolojileri Sanayi ve Ticaret Ltd. Sti.

email: salih@dilisim.com

Tel: 0 222 330 20 21

GSM: 0 507 296 15 51
