HtmlCleaner, an HTML Document Parser

     <a href="/misc/goto?guid=4959499543027097061"> <img border="0" alt="HtmlCleaner, an HTML document parser" src="https://simg.open-open.com/show/8f7be556bdf74662b8e12a915d9deeb6.jpg" width="198" height="53" /> </a>    <br /> HtmlCleaner is an open-source HTML parser written in Java. It rearranges the individual elements of an HTML document and produces a well-formed HTML document. By default it follows rules similar to those most web browsers use to build the Document Object Model. Users can, however, supply their own tag and rule sets for filtering and matching.    <br />
    <h3>Features:</h3>
    <ul>
     <li>HtmlCleaner parses input HTML and generates a tree structure suitable for programmatic manipulation.</li>
     <li>Serializers are responsible for outputting the DOM structure to XML, HTML, DOM or JDom.</li>
     <li>The parsing phase relies on tag descriptions, which can be customized by the user.</li>
     <li>HtmlCleaner's behaviour can be configured through a number of parameters.</li>
     <li>HtmlCleaner is thread safe, meaning that a single instance can clean multiple HTML sources at the same time.</li>
     <li>HtmlCleaner can be used from Java code, from the command line or as an Ant task.</li>
     <li>HtmlCleaner requires JRE 1.5+.</li>
    </ul>
    <p><strong>Project home page:</strong> <a href="http://www.open-open.com/lib/view/home/1324371733999" target="_blank">http://www.open-open.com/lib/view/home/1324371733999</a></p>
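To make the idea of "reordering elements into a well-formed document" concrete, here is a minimal, self-contained sketch of stack-based tag balancing. It is only an illustration of the general technique and uses nothing but the Java standard library; it is not HtmlCleaner's API or its actual algorithm. The class name `TagBalancerSketch`, the method `balance`, and the short list of void tags are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: NOT part of HtmlCleaner.
class TagBalancerSketch {
    // Elements that never take a closing tag (a deliberately short, assumed list).
    private static final Set<String> VOID_TAGS =
            new HashSet<>(Arrays.asList("br", "img", "hr", "input", "meta", "link"));

    // group 1: "/" for a closing tag, group 2: tag name, group 3: attributes (and optional "/").
    // Simplification: a ">" inside a quoted attribute value is not handled.
    private static final Pattern TAG = Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)([^>]*)>");

    static String balance(String html) {
        StringBuilder out = new StringBuilder();
        Deque<String> open = new ArrayDeque<>(); // stack of currently open tag names
        Matcher m = TAG.matcher(html);
        int pos = 0;
        while (m.find()) {
            out.append(html, pos, m.start()); // copy the text between tags unchanged
            pos = m.end();
            String name = m.group(2).toLowerCase();
            boolean closing = !m.group(1).isEmpty();
            String attrs = m.group(3);
            boolean selfClosed = attrs.trim().endsWith("/");
            if (!closing && (VOID_TAGS.contains(name) || selfClosed)) {
                // Emit void or self-closed elements in XML style and do not track them.
                out.append('<').append(name)
                   .append(attrs.replaceAll("\\s*/?\\s*$", "")).append(" />");
            } else if (!closing) {
                open.push(name);
                out.append('<').append(name).append(attrs).append('>');
            } else if (open.contains(name)) {
                // Close everything opened after the matching tag, then the tag itself.
                String top;
                do {
                    top = open.pop();
                    out.append("</").append(top).append('>');
                } while (!top.equals(name));
            }
            // Otherwise: a closing tag with no matching open tag is dropped.
        }
        out.append(html.substring(pos));
        while (!open.isEmpty()) {
            out.append("</").append(open.pop()).append('>'); // close whatever is left open
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints <p>Hello <b>world<br /></b></p>
        System.out.println(balance("<p>Hello <b>world<br></p>"));
    }
}
```

This sketch knows nothing about individual tags beyond the void list, so an input such as `<ul><li>one<li>two` comes out well-formed but with the second `li` nested inside the first. Per-tag rules of the kind the feature list calls "tag descriptions" are what let a real parser decide that a new `li` should close the previous one.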

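This article was uploaded and shared by user jopen, for study and exchange among readers only. Ownership remains with the original author; if your rights have been infringed, please contact the administrator.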


Article URL: https://www.open-open.com/lib/view/open1324371733999.html
