Web-Harvest: An Open-Source Java Tool for Web Data Extraction
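Web-Harvest is an open-source web data extraction tool written in Java. It collects specified web pages and extracts useful data from them. To operate on text and XML content, Web-Harvest relies mainly on technologies such as XSLT, XQuery, and regular expressions.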
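To make the idea of combining XML queries with regular expressions concrete, here is a minimal sketch that uses only the Java standard library. It is not Web-Harvest's own API or configuration format. The class name `ExtractionSketch`, the sample XML, and both helper methods are hypothetical and exist only for illustration. It assumes the page has already been downloaded and cleaned into well-formed XML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ExtractionSketch {

    // Hypothetical sample input standing in for a page already cleaned into XML.
    static final String SAMPLE_XML =
            "<items>"
            + "<item><name>Widget</name><price>Price: 19.99 USD</price></item>"
            + "<item><name>Gadget</name><price>Price: 5.50 USD</price></item>"
            + "</items>";

    /** Evaluates an XPath expression against XML text and returns the text of each matched node. */
    static List<String> selectText(String xml, String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            NodeList nodes = (NodeList) xpath.evaluate(
                    expression,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent());
            }
            return result;
        } catch (XPathExpressionException e) {
            // Covers both an invalid expression and XML that cannot be parsed.
            throw new IllegalArgumentException("Could not evaluate: " + expression, e);
        }
    }

    /** Returns the first capture group of the first regex match, or null if nothing matches. */
    static String firstGroup(String text, String regex) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        return matcher.find() ? matcher.group(1) : null;
    }

    public static void main(String[] args) {
        // Step 1: select the price elements with XPath.
        // Step 2: pull the numeric part out of each element's text with a regex.
        for (String price : selectText(SAMPLE_XML, "/items/item/price")) {
            System.out.println(firstGroup(price, "(\\d+\\.\\d+)"));
        }
    }
}
```

The two steps are kept separate on purpose: a structural query narrows the document down to the relevant nodes, and a regular expression then cleans up the text inside them.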
1. Welcome screen with quick links
![Web-Harvest: open-source Java web data extraction tool](https://simg.open-open.com/show/8c81ac5784a2f1bc8f58ab2d8fd12407.jpg)
2. Web-Harvest XML editing with auto-completion support (Ctrl + Space)
![Web-Harvest: open-source Java web data extraction tool](https://simg.open-open.com/show/f1f88830accc36b3a55d8c2cc04ec464.jpg)
3. Defining initial variables that are pushed to the Web-Harvest context before execution starts
![Web-Harvest: open-source Java web data extraction tool](https://simg.open-open.com/show/875fe2125f3f33321a970a9ce72a0e5e.jpg)
4. Settings dialog
![Web-Harvest: open-source Java web data extraction tool](https://simg.open-open.com/show/2e33e2eab6e7c389adf65c90d04ed4c2.jpg)
5. Viewing the execution result as XML and testing an XPath expression against it
![Web-Harvest: open-source Java web data extraction tool](https://simg.open-open.com/show/850dc732379b3dd9f4113391c128caf2.jpg)
6. Viewing downloaded images while execution is in progress
![Web-Harvest: open-source Java web data extraction tool](https://simg.open-open.com/show/8ae81631e3c5a04800bbb0435cd55982.jpg)
7. Checking attributes of HTTP execution
![Web-Harvest: open-source Java web data extraction tool](https://simg.open-open.com/show/d8b31392267c008b04aed6bf41b999ea.jpg)
8. Debugging
![Web-Harvest: open-source Java web data extraction tool](https://simg.open-open.com/show/be34600a2e4f9e7a336151fa6e2ec848.jpg)