nutch-dev mailing list archives

From "Anton Potehin" <an...@orbita1.ru>
Subject question about crawldb
Date Tue, 18 Apr 2006 11:36:11 GMT
1.	We have found these flags in the CrawlDatum class:

  public static final byte STATUS_SIGNATURE = 0;
  public static final byte STATUS_DB_UNFETCHED = 1;
  public static final byte STATUS_DB_FETCHED = 2;
  public static final byte STATUS_DB_GONE = 3;
  public static final byte STATUS_LINKED = 4;
  public static final byte STATUS_FETCH_SUCCESS = 5;
  public static final byte STATUS_FETCH_RETRY = 6;
  public static final byte STATUS_FETCH_GONE = 7;

Although the names of these flags hint at their purpose, it is not
entirely clear what each one means. For example, what is the
difference between STATUS_DB_FETCHED and STATUS_FETCH_SUCCESS?
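
To make the question concrete, here is a minimal sketch that only restates the constant values quoted above and groups them by name prefix. The class StatusNames and its methods nameOf and prefixGroup are hypothetical helpers, not part of Nutch. The grouping reflects the naming only and assumes nothing about how Nutch actually uses each status.

```java
// Hypothetical helper, not part of Nutch: it mirrors the byte values
// quoted from CrawlDatum so that a status can be printed by name.
final class StatusNames {

    // Index in this array == status byte value quoted in the message.
    private static final String[] NAMES = {
        "STATUS_SIGNATURE",      // 0
        "STATUS_DB_UNFETCHED",   // 1
        "STATUS_DB_FETCHED",     // 2
        "STATUS_DB_GONE",        // 3
        "STATUS_LINKED",         // 4
        "STATUS_FETCH_SUCCESS",  // 5
        "STATUS_FETCH_RETRY",    // 6
        "STATUS_FETCH_GONE"      // 7
    };

    private StatusNames() {
    }

    /** Returns the constant name for a status byte, or "UNKNOWN(n)". */
    static String nameOf(byte status) {
        if (status >= 0 && status < NAMES.length) {
            return NAMES[status];
        }
        return "UNKNOWN(" + status + ")";
    }

    /**
     * Groups a status by its name prefix only ("DB", "FETCH" or "OTHER").
     * This is an observation about naming, not about Nutch behaviour.
     */
    static String prefixGroup(byte status) {
        String name = nameOf(status);
        if (name.startsWith("STATUS_DB_")) {
            return "DB";
        }
        if (name.startsWith("STATUS_FETCH_")) {
            return "FETCH";
        }
        return "OTHER";
    }
}
```

With this helper, STATUS_DB_FETCHED (2) falls in the "DB" group and STATUS_FETCH_SUCCESS (5) in the "FETCH" group, which is exactly the pair whose difference is unclear to us.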

 

 

2.	Where are new links added to the CrawlDB?

 

