uima-user mailing list archives

From "reshu.agarwal" <reshu.agar...@orkash.com>
Subject Re: status Lost=1 in DUCC
Date Fri, 28 Mar 2014 12:35:49 GMT
On 03/28/2014 05:54 PM, Lou DeGenaro wrote:
> Hi Reshu,
>
> Very good.  It would be helpful if you could supply a small sample data
> comprising "invalid XML characters" as a test case, to motivate DUCC to
> detect and handle this situation more elegantly in terms of allowing the
> user to recognize what's wrong.
>
> Lou.
>
>
> On Fri, Mar 28, 2014 at 12:00 AM, reshu.agarwal <reshu.agarwal@orkash.com> wrote:
>
>> On 03/27/2014 08:13 PM, Lou DeGenaro wrote:
>>
>>> the data being sent are "values" rather than "keys" in your
>>> CAS?  If so, this is not really a "best practice" for DUCC use.
>>>
>> Hi Lou,
>>
>> This is not the problem of how I send the data. My document contains some
>> invalid XML characters. So, problem resolved after I applied filter for
>> that.
>>
>> Reshu.
>>
Yes, sure.

Here is a sample document:

"About the Human Rights House Network ( www.humanrightshouse.org ) The 
Human Rights House Network (HRHN) unites 87 human rights NGOs joining 
forces in 18 independent Human Rights Houses in 15 countries in Western 
Balkans, Eastern Europe and South Caucasus, East and Horn of Africa, and 
Western Europe. HRHN???s aim is to protect, empower and support human 
rights organisations locally and unite them in an international network 
of Human Rights Houses. The Human Rights House Foundation (HRHF), based 
in Oslo (Norway) with an office in Geneva (Switzerland), is HRHN???s 
secretariat. HRHF is international partner of the South Caucasus Network 
of Human Rights Defenders and the emerging Balkan Network of Human 
Rights Defenders. HRHF has consultative status with the United Nations 
and HRHN has participatory status with the Council of Europe.
All applicants are requested to e-mail a motivation letter and 
curriculum vitae to: Anna Innocenti, International Advocacy Officer at 
the Human Rights House Foundation (HRHF), at 
ae;e;a.innf;centi@humae;rightshouse.f;rg ."


Specifically, this passage contains some invalid characters:

"All applicants are requested to e-mail a motivation letter and 
curriculum vitae to: Anna Innocenti, International Advocacy Officer at 
the Human Rights House Foundation (HRHF), at 
ae;e;a.innf;centi@humae;rightshouse.f;rg ."

The problem can be detected by processing the same document with UIMA AS.
The invalid characters were not only in the document text: they were also
in other objects that are passed in the CAS.
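
For illustration, a filter of the kind mentioned above might look like
the following minimal sketch. It is not the exact filter applied here:
the class and method names (XmlCharFilter, stripInvalidXmlChars) are
hypothetical, and it assumes that simply dropping every code point
outside the XML 1.0 "Char" range is acceptable for the data.

```java
public class XmlCharFilter {

    // True if the code point is allowed by the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    public static boolean isValidXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Hypothetical helper: returns a copy of the text with every code
    // point that is not a valid XML 1.0 character removed.
    public static String stripInvalidXmlChars(String text) {
        if (text == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); ) {
            // codePointAt returns the bare surrogate value for an unpaired
            // surrogate, which falls outside the valid ranges and is dropped.
            int cp = text.codePointAt(i);
            if (isValidXmlChar(cp)) {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}
```

Such a filter would need to be applied to the document text and to any
other string values placed in the CAS before the CAS is sent.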

-- 
Thanks,
Reshu Agarwal

