nutch-dev mailing list archives

From Andrzej Bialecki
Subject Re: CrawlDatum.metaData should never be null
Date Tue, 25 Apr 2006 21:34:01 GMT
Stefan Groschupf wrote:
> Hi Andrzej,
> this was specifically requested by Doug: do not instantiate the object by
> default, since that consumes too many resources.
> So I changed it to work the way it does today.

Hmm... I understand his point. But it means that I always have to add an
"if (datum.getMetaData() == null)" check, which pollutes the code in every
place that deals with metadata. Currently that is just CrawlDbReducer
(where it already looks ugly), but the same will be needed in any place
that wants to use metadata.
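
To make the complaint concrete, here is a minimal sketch of the guard that every caller ends up carrying. NullableDatum is a hypothetical stand-in, not the real CrawlDatum, and the plain Map<String, String> is an assumption used only to keep the example self-contained; lookup() and store() are illustrative helpers.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for CrawlDatum: metadata stays null until a caller sets it.
class NullableDatum {
    private Map<String, String> metaData;

    Map<String, String> getMetaData() { return metaData; }
    void setMetaData(Map<String, String> metaData) { this.metaData = metaData; }
}

class NullCheckDemo {
    // Every reader needs its own null guard before touching the map.
    static String lookup(NullableDatum datum, String key) {
        Map<String, String> meta = datum.getMetaData();
        if (meta == null) {
            return null;
        }
        return meta.get(key);
    }

    // Every writer needs one too, and must allocate the map itself.
    static void store(NullableDatum datum, String key, String value) {
        if (datum.getMetaData() == null) {
            datum.setMetaData(new HashMap<>());
        }
        datum.getMetaData().put(key, value);
    }

    public static void main(String[] args) {
        NullableDatum datum = new NullableDatum();
        System.out.println(lookup(datum, "score")); // no metadata yet
        store(datum, "score", "1.0");
        System.out.println(lookup(datum, "score"));
    }
}
```

The same two guards would have to be repeated in each class that reads or writes metadata.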

If that's really such a big concern, then perhaps we should also set 
ParseData.contentMeta and parseMeta to null, as well as Content.metadata ...

Or perhaps CrawlDatum.getMetaData() should instantiate it lazily: that way,
if you never call the getter, nothing gets allocated.
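
A minimal sketch of that second option, under the same caveats: LazyDatum is a hypothetical stand-in rather than the real CrawlDatum, the Map<String, String> type is an assumption, and isMetaDataAllocated() is an invented helper that exists only so the example can show when allocation happens.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for CrawlDatum with a lazily created metadata map.
class LazyDatum {
    private Map<String, String> metaData; // null until the getter is first called

    // Allocates on first access, so callers never see null.
    Map<String, String> getMetaData() {
        if (metaData == null) {
            metaData = new HashMap<>();
        }
        return metaData;
    }

    // Invented helper: reports allocation state without triggering it.
    boolean isMetaDataAllocated() {
        return metaData != null;
    }
}

class LazyMetaDataDemo {
    public static void main(String[] args) {
        LazyDatum datum = new LazyDatum();
        System.out.println(datum.isMetaDataAllocated()); // nothing allocated yet
        datum.getMetaData().put("score", "1.0");         // no null check at the call site
        System.out.println(datum.isMetaDataAllocated());
    }
}
```

One trade-off to keep in mind: code that only wants to test for metadata (serialization, for instance) would have to avoid the getter, or it would force the allocation anyway.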

Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com
