manifoldcf-user mailing list archives

From Jitu <abj...@gmail.com>
Subject Re: schedule information
Date Mon, 22 Dec 2014 13:14:58 GMT
Hi Karl,

Thanks for the quick reply and support. This is exactly what I was looking
for. Thank you so much. If I modify WorkerThread.java, do I need to submit a
patch for the same?

Thanks,
Jitu

On Mon, Dec 22, 2014 at 4:12 PM, Karl Wright <daddywri@gmail.com> wrote:

> Hi Jitu,
>
> I'm sorry for the miscommunication.  What I meant is that without any
> modifications, you can add the job's name as metadata for all documents
> indexed with the job.
>
> If you need to index hard-wired metadata for every job run, you will need
> to modify WorkerThread.java.  The IJobDescription object is readily
> available there, but you will also need to write a SQL query to obtain the
> job's start time.
>
> Karl
>
>
> On Mon, Dec 22, 2014 at 4:33 AM, Jitu <abjitu@gmail.com> wrote:
>
>> Hi Karl,
>> Thanks for the quick reply and support. I have gone through the
>> source code of "ForcedMetadataConnector.java" as well as the end-user
>> documentation at
>> http://manifoldcf.apache.org/release/trunk/en_US/end-user-documentation.html#metadataadjuster
>> It says we can add a string constant for every job run. However, my client
>> wants to know which files were crawled in each run of the job. To search
>> for that, I need to send a unique ID for every job run as part of the
>> metadata. This unique ID changes with every job run, so I cannot use
>> ForcedMetadataConnector. You advised, "It's certainly possible to add the
>> current job's start time field as hard-wired metadata". Please let me know
>> how to achieve it.
>>
>> Thanks,
>> Jitu
>>
>> On Fri, Dec 19, 2014 at 1:09 PM, Karl Wright <daddywri@gmail.com> wrote:
>>
>>> Hi Jitu,
>>>
>>> You can certainly add a unique string associated with a job to every
>>> document using the Metadata Adjuster transformation connector (which of
>>> course can be the job name).  The time of indexing is already sent as a
>>> metadata field (can't remember which one off the top of my head, but I'm
>>> sure you can find it).  What you can't get, mainly because it basically has
>>> little meaning in MCF, is the time the job was started.  It's certainly
>>> possible to add the current job's start time field as hard-wired metadata,
>>> but I bet your client would prefer the actual time of indexing of the
>>> document anyhow.
>>>
>>> Thanks,
>>> Karl
>>>
>>>
>>> On Fri, Dec 19, 2014 at 2:30 AM, Jitu <abjitu@gmail.com> wrote:
>>>>
>>>> Hi Karl,
>>>> Thanks for all your support. One of our customers needs the job
>>>> schedule information to be sent to the output connector. Basically, my
>>>> customer wants to find out, using Solr search, which files were indexed
>>>> in one job run.
>>>>
>>>> For example, if my job ran on 17 Dec 2014 at 11:23 AM, I would send
>>>> a unique string, say "JobName 17-12-2014 11:23", as part of the file
>>>> metadata to the Solr output connector. A Solr search would then use this
>>>> string to find all files indexed as part of that job run.
>>>>
>>>> Please correct me if I am wrong, or suggest how to achieve it.
>>>>
>>>> Thanks,
>>>> Jitu
>>>>
>>>
>>
>
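
The per-run identifier proposed at the start of the thread ("JobName 17-12-2014 11:23") could be built roughly as follows. This is only a sketch: the class JobRunTag and its build method are hypothetical names, not part of ManifoldCF, and it assumes the job name and start time have already been obtained (for example from IJobDescription and the SQL query Karl mentions). How the resulting string is attached to a document inside WorkerThread.java is not shown.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of ManifoldCF: builds a per-run tag
// from a job name and that run's start time.
final class JobRunTag {
    // Pattern chosen to match the example in the thread: "17-12-2014 11:23"
    private static final DateTimeFormatter FORMAT =
        DateTimeFormatter.ofPattern("dd-MM-yyyy HH:mm");

    private JobRunTag() {
    }

    // Returns "<jobName> <dd-MM-yyyy HH:mm>", e.g. "JobName 17-12-2014 11:23".
    static String build(String jobName, LocalDateTime startTime) {
        return jobName + " " + startTime.format(FORMAT);
    }

    public static void main(String[] args) {
        // Values taken from the example in the thread.
        System.out.println(build("JobName", LocalDateTime.of(2014, 12, 17, 11, 23)));
    }
}
```

Because the tag contains a space, it would most likely need to go into a non-tokenized Solr field and be quoted in the search so that it matches as a whole; the field name is left to the schema in use.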
