nutch-dev mailing list archives

From Andrzej Bialecki
Subject CrawlDatum.metaData should never be null
Date Tue, 25 Apr 2006 19:40:35 GMT

Per the subject line, I think CrawlDatum.metaData should follow the same
pattern as the other metadata maps in ParseData and Content. Currently,
when we allocate a new CrawlDatum, metaData is null, which complicates
the logic everywhere that needs to handle metaData.

When a CrawlDatum is serialized, we already check whether
metaData.size() > 0, and if not, nothing is written out. So using null
here doesn't buy us much: an empty map adds nothing to the serialized
form, and the savings on object creation are minimal.

If there are no objections, I'll make the change to always allocate
metaData = new MapWritable() whenever we create a CrawlDatum.
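
To illustrate the idea, here is a minimal sketch rather than the actual
patch. DatumSketch is a hypothetical stand-in for CrawlDatum, a plain
HashMap stands in for MapWritable, and the one-byte presence flag is an
assumption about the on-disk layout made only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for CrawlDatum; not the real Nutch class.
class DatumSketch {
    // Always allocated, so callers never need a null check.
    private final Map<String, String> metaData = new HashMap<>();

    Map<String, String> getMetaData() {
        return metaData;
    }

    // Writes a presence flag (an assumption of this sketch), and the
    // entries themselves only when the map is non-empty.
    void write(DataOutputStream out) throws IOException {
        boolean hasMeta = metaData.size() > 0;
        out.writeBoolean(hasMeta);
        if (hasMeta) {
            out.writeInt(metaData.size());
            for (Map.Entry<String, String> e : metaData.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        }
    }

    // Helper for the sketch: number of bytes write() produces.
    static int serializedSize(DatumSketch d) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            d.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.size();
    }
}
```

With this shape, code that wants to handle metaData can call
getMetaData().put(...) or iterate over it directly, and an empty map
costs only the flag in the serialized form.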

Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration  Contact: info at sigram dot com
