JoBo: a Java tool for downloading entire websites
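JoBo is a simple tool for downloading entire websites. It is essentially a web spider. Its main advantages over other download tools are that it can fill in forms automatically (for example, to log in) and that it uses cookies to handle sessions. JoBo also has flexible download rules that restrict what gets downloaded, for example by a page's URL, size, or MIME type.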
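To make the idea of download rules concrete, here is a minimal sketch of a rule list that filters by URL pattern, MIME type, and document size. It is not JoBo's actual API or configuration format: the class `DownloadRuleSketch`, the nested `Rule` type, the "first matching rule wins" policy, and the example thresholds are all assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical rule list; these names are illustrative and not taken from JoBo. */
final class DownloadRuleSketch {

    /** One allow/deny rule. A null pattern or prefix means "matches anything". */
    static final class Rule {
        final boolean allow;
        final Pattern urlPattern;  // null = any URL
        final String mimePrefix;   // null = any MIME type
        final long minBytes;       // rule applies only to documents at least this large

        Rule(boolean allow, String urlRegex, String mimePrefix, long minBytes) {
            this.allow = allow;
            this.urlPattern = urlRegex == null ? null : Pattern.compile(urlRegex);
            this.mimePrefix = mimePrefix;
            this.minBytes = minBytes;
        }

        boolean matches(String url, String mimeType, long sizeBytes) {
            boolean urlOk = urlPattern == null || urlPattern.matcher(url).find();
            boolean mimeOk = mimePrefix == null
                    || (mimeType != null && mimeType.startsWith(mimePrefix));
            return urlOk && mimeOk && sizeBytes >= minBytes;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    DownloadRuleSketch add(Rule rule) {
        rules.add(rule);
        return this;
    }

    /** The first matching rule decides; if no rule matches, the download is allowed. */
    boolean isAllowed(String url, String mimeType, long sizeBytes) {
        for (Rule rule : rules) {
            if (rule.matches(url, mimeType, sizeBytes)) {
                return rule.allow;
            }
        }
        return true;
    }

    /** Example rule set (assumed values): skip /cgi-bin/, all images, and large files. */
    static DownloadRuleSketch example() {
        return new DownloadRuleSketch()
                .add(new Rule(false, "/cgi-bin/", null, 0))
                .add(new Rule(false, null, "image/", 0))
                .add(new Rule(false, null, null, 1_000_000));
    }

    public static void main(String[] args) {
        DownloadRuleSketch rules = example();
        System.out.println(rules.isAllowed("http://example.com/index.html", "text/html", 2048));
        System.out.println(rules.isAllowed("http://example.com/logo.png", "image/png", 512));
    }
}
```

Evaluating rules in order keeps the behaviour easy to predict: a specific allow rule placed before a general deny rule acts as an exception to it.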
Features
- command-line and graphical versions (the command-line version needs a major update; the GUI version currently has many more features)
- recursive search of all documents starting from a given start document
- support of tags (with fault tolerance)
- support of the robot exclusion protocol
- user-controlled maximum search depth
- user agent name can be defined
- support of referrer headers
- support of automated form handling (JoBo can fill fields with predefined values)
- cookie support
- XML configuration
- used bandwidth can be limited
- allow/deny downloads by mime type and document size (e.g. ignore all image/* files)
- allow/deny downloads by regular expressions (e.g. don't download /cgi-bin)
- can convert absolute links to relative
- download only files newer than a given age
- resume job
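
Two of the features above, the recursive search from a start document and the user-controlled search depth, can be sketched as a breadth-first traversal that stops after a fixed number of link hops. This is only an illustration and not JoBo's implementation: the class `DepthLimitedCrawlSketch`, the in-memory link map that stands in for real HTTP fetching and HTML parsing, and the page names are all assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical depth-limited spider over an in-memory link graph (no network access). */
final class DepthLimitedCrawlSketch {

    /**
     * Returns every document reachable from {@code start} within {@code maxDepth}
     * link hops, in the order it was first discovered. Each document is visited once.
     */
    static Set<String> crawl(String start, int maxDepth, Map<String, List<String>> links) {
        Set<String> visited = new LinkedHashSet<>();
        visited.add(start);
        List<String> frontier = List.of(start);

        // Expand one level of links per iteration until the depth limit is reached.
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String url : frontier) {
                for (String target : links.getOrDefault(url, List.of())) {
                    if (visited.add(target)) {  // add() is false for already-seen documents
                        next.add(target);
                    }
                }
            }
            frontier = next;
        }
        return visited;
    }

    /** Made-up site structure: each page maps to the pages it links to. */
    static Map<String, List<String>> exampleSite() {
        return Map.of(
                "/index.html", List.of("/about.html", "/docs.html"),
                "/about.html", List.of("/index.html", "/team.html"),
                "/docs.html", List.of(),
                "/team.html", List.of("/contact.html"));
    }

    public static void main(String[] args) {
        System.out.println(crawl("/index.html", 1, exampleSite()));
        System.out.println(crawl("/index.html", 2, exampleSite()));
    }
}
```

A real spider would replace the map lookup with fetching and parsing each document, and would apply robots exclusion and the download rules before adding a link to the next level.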
Project homepage:
http://www.open-open.com/lib/view/home/1349861484181