Project Description
This is a web crawler program written in C#.

Features:
  • Configurable: thread count, waiting time, connection timeout, allowed MIME types and their priorities, download folders.
  • Statistics: URL count, total downloaded files, total downloaded bytes, CPU utilization and available memory.
  • Preferential crawler: users can set a priority for each MIME type (high, above, normal, below, low).
  • Robust: 10+ URL normalization rules and crawler-trap avoidance rules.
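
As an illustration of how the preferential and robustness features could fit together, here is a minimal sketch of a URL frontier that orders pending URLs by MIME-type priority and skips URLs it has already seen after normalization. It is not taken from the project's source: the names PriorityFrontier, Normalize, Enqueue and TryDequeue are hypothetical, the caller is assumed to supply the expected MIME type of each URL, and only a few normalization rules are shown (those that System.Uri applies by itself, plus dropping the fragment) rather than the 10+ rules the crawler implements.

```csharp
using System;
using System.Collections.Generic;

// The five priority levels named in the feature list, highest first.
public enum Priority { High = 0, Above = 1, Normal = 2, Below = 3, Low = 4 }

// Hypothetical frontier: one FIFO queue per priority level.
public class PriorityFrontier
{
    private readonly Queue<string>[] queues = new Queue<string>[5];
    private readonly HashSet<string> seen = new HashSet<string>();
    private readonly Dictionary<string, Priority> mimePriorities;

    public PriorityFrontier(Dictionary<string, Priority> mimePriorities)
    {
        this.mimePriorities = mimePriorities;
        for (int i = 0; i < queues.Length; i++)
        {
            queues[i] = new Queue<string>();
        }
    }

    // Illustrative normalization only. System.Uri lower-cases the scheme
    // and host and omits a default port; GetLeftPart(UriPartial.Query)
    // then drops the fragment. Throws UriFormatException if the input is
    // not a valid absolute URL.
    public static string Normalize(string url)
    {
        Uri uri = new Uri(url);
        return uri.GetLeftPart(UriPartial.Query);
    }

    // Adds a URL unless its normalized form was already seen.
    // MIME types without a configured priority fall back to Normal
    // (an assumption of this sketch).
    public bool Enqueue(string url, string mimeType)
    {
        string key = Normalize(url);
        if (!seen.Add(key))
        {
            return false; // duplicate after normalization
        }

        Priority priority;
        if (!mimePriorities.TryGetValue(mimeType, out priority))
        {
            priority = Priority.Normal;
        }
        queues[(int)priority].Enqueue(key);
        return true;
    }

    // Returns the oldest URL from the highest non-empty priority level.
    public bool TryDequeue(out string url)
    {
        foreach (Queue<string> queue in queues)
        {
            if (queue.Count > 0)
            {
                url = queue.Dequeue();
                return true;
            }
        }
        url = null;
        return false;
    }
}
```

With application/pdf mapped to High and image/png mapped to Low, a PDF link enqueued last is still dequeued before an image link enqueued first, and a MIME type with no mapping is treated as Normal. Keeping one queue per level preserves first-in, first-out order within a level, which a single sorted collection would not guarantee without an extra sequence number.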

Screenshot:
crawler.png

Last edited Jan 4, 2010 at 5:08 PM by foamliu, version 5