lucene-solr-user mailing list archives

From Matias Alonso <matiasgalo...@gmail.com>
Subject Re: email - DIH
Date Tue, 22 Mar 2011 16:08:14 GMT
Thank you very much for your answer, Erick.

My apologies for the previous email; I don't speak English very well and
I'm new to the world of mailing lists.

The problem is that I'm indexing emails through the Data Import Handler,
using Gmail over IMAPS, so that I can search the email list in the future.
The emails are only partially indexed, and I can't find out why not all of
them are indexed.

Below is the configuration of my DIH.


<dataConfig>
  <document>
    <entity
        name="gmail"
        processor="MailEntityProcessor"
        transformer="LogTransformer"
        user="email@gmail.com"
        password="password"
        host="imap.gmail.com"
        protocol="imaps"
        fetchMailsSince="2010-01-01 00:00:00"
        folders="inbox"
        deltaFetch="false"
        processAttachement="false"
        batchSize="100"
        fetchSize="1024"
        recurse="true" />
  </document>
</dataConfig>



The dates of my emails are all later than “2010-01-01 00:00:00”.
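
One way to double-check that claim independently of Solr is to ask the IMAP server itself how many messages in the folder match the date. The following is only a sketch: the function names are hypothetical, it uses Python's standard imaplib rather than anything from DIH, and IMAP's SINCE criterion compares the server's internal date, which may not be the same date that MailEntityProcessor compares against.

```python
import imaplib
from datetime import date

# English month abbreviations, as IMAP date strings require (locale independent).
_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def imap_since_criterion(since: date) -> str:
    """Build an IMAP SEARCH criterion such as 'SINCE 01-Jan-2010'."""
    return f"SINCE {since.day:02d}-{_MONTHS[since.month - 1]}-{since.year}"


def count_messages_since(host, user, password, folder, since):
    """Count messages in `folder` whose internal date is on or after `since`.

    Hypothetical helper for cross-checking; it opens the folder read-only
    so that no message flags are changed.
    """
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(user, password)
        status, _ = conn.select(folder, readonly=True)
        if status != "OK":
            raise RuntimeError(f"could not open folder {folder!r}")
        status, data = conn.search(None, imap_since_criterion(since))
        if status != "OK":
            raise RuntimeError("IMAP SEARCH failed")
        # data[0] is a space-separated list of matching message numbers.
        return len(data[0].split())
    finally:
        conn.logout()


# Example call, using the placeholder values from the configuration above:
# count_messages_since("imap.gmail.com", "email@gmail.com", "password",
#                      "inbox", date(2010, 1, 1))
```

If the number returned here differs from the number of documents Solr reports, the date filter is a candidate explanation; if it matches the folder total, the difference has to come from somewhere else.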




I ran a full import and no errors were reported, but the status page shows
that 28 documents were added, while the console log reports 35 messages.

Below are the status response, first, and then part of the console
output.



Status:

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </lst>
  <str name="command">status</str>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages">
    <str name="Total Requests made to DataSource">0</str>
    <str name="Total Rows Fetched">28</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2011-03-22 15:55:12</str>
    <str name="">
      Indexing completed. Added/Updated: 28 documents. Deleted 0 documents.
    </str>
    <str name="Committed">2011-03-22 15:55:20</str>
    <str name="Optimized">2011-03-22 15:55:20</str>
    <str name="Total Documents Processed">28</str>
    <str name="Time taken ">0:0:8.520</str>
  </lst>
  <str name="WARNING">
    This response format is experimental.  It is likely to change in the future.
  </str>
</response>
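
To avoid reading these counters by eye after every import, the status response can be parsed with a few lines of Python. This is a minimal sketch using the standard xml.etree.ElementTree module; the function name status_messages is hypothetical, and SAMPLE is a trimmed copy of the response above, not new output.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the status response shown above.
SAMPLE = """<response>
  <str name="command">status</str>
  <str name="status">idle</str>
  <lst name="statusMessages">
    <str name="Total Rows Fetched">28</str>
    <str name="Total Documents Skipped">0</str>
    <str name="">
      Indexing completed. Added/Updated: 28 documents. Deleted 0 documents.
    </str>
    <str name="Total Documents Processed">28</str>
  </lst>
</response>"""


def status_messages(xml_text):
    """Return the entries of the statusMessages list as a {name: text} dict."""
    root = ET.fromstring(xml_text)
    lst = root.find("./lst[@name='statusMessages']")
    if lst is None:
        return {}
    # Each child is a <str name="...">value</str> element.
    return {child.get("name", ""): (child.text or "").strip() for child in lst}


messages = status_messages(SAMPLE)
print(messages["Total Rows Fetched"])         # rows fetched by the import
print(messages["Total Documents Processed"])  # documents processed
```

The summary sentence is stored under the empty name, so it is available as messages[""].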



“…
Mar 22, 2011 3:55:14 PM org.apache.solr.handler.dataimport.MailEntityProcessor connectToMailBox
INFO: Connected to mailbox
Mar 22, 2011 3:55:15 PM org.apache.solr.handler.dataimport.MailEntityProcessor$FolderIterator next
INFO: Opened folder : inbox
Mar 22, 2011 3:55:15 PM org.apache.solr.handler.dataimport.MailEntityProcessor$FolderIterator next
INFO: Added its children to list  :
Mar 22, 2011 3:55:15 PM org.apache.solr.handler.dataimport.MailEntityProcessor$FolderIterator next
INFO: NO children :
Mar 22, 2011 3:55:16 PM org.apache.solr.handler.dataimport.MailEntityProcessor$MessageIterator <init>
INFO: Total messages : 35
Mar 22, 2011 3:55:16 PM org.apache.solr.handler.dataimport.MailEntityProcessor$MessageIterator <init>
INFO: Search criteria applied. Batching disabled
Mar 22, 2011 3:55:19 PM org.apache.solr.handler.dataimport.DocBuilder finish
INFO: Import completed successfully
…”
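
The gap between the two numbers can also be computed directly from the log and the status summary. The sketch below is only an illustration: the function names are hypothetical and the two input strings are copied from the output above. The log does not say whether the 35 is counted before or after the search criteria are applied, so the result only quantifies the gap; it does not explain it.

```python
import re

# Lines copied from the console output and the status response above.
LOG = """INFO: Opened folder : inbox
INFO: Total messages : 35
INFO: Search criteria applied. Batching disabled
INFO: Import completed successfully"""

SUMMARY = "Indexing completed. Added/Updated: 28 documents. Deleted 0 documents."


def total_messages(log_text):
    """Return the number from the 'Total messages : N' log line, or None."""
    match = re.search(r"Total messages\s*:\s*(\d+)", log_text)
    return int(match.group(1)) if match else None


def added_documents(summary_text):
    """Return the number from 'Added/Updated: N documents', or None."""
    match = re.search(r"Added/Updated:\s*(\d+)\s+documents", summary_text)
    return int(match.group(1)) if match else None


def unindexed_count(log_text, summary_text):
    """Messages reported in the log minus documents added, or None if unknown."""
    total = total_messages(log_text)
    added = added_documents(summary_text)
    if total is None or added is None:
        return None
    return total - added


print(unindexed_count(LOG, SUMMARY))  # 35 - 28 = 7
```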



Regards,

Matias.





2011/3/22 Erick Erickson <erickerickson@gmail.com>

> Not unless you provide a lot more data. Have you
> inspected the Solr logs and seen any anomalies?
>
> Please review:
> http://wiki.apache.org/solr/UsingMailingLists
>
> Best
> Erick
>
> On Mon, Mar 21, 2011 at 3:56 PM, Matias Alonso <matiasgalonso@gmail.com>
> wrote:
> > Hi,
> >
> >
> > I’m using the Data Import Handler to index emails.
> >
> > The problem is that not all the emails were indexed when I do a full
> > import.
> >
> > Does anyone have any idea?
> >
> >
> > Regards,
> >
> > --
> > Matias.
> >
>
