HTMLParser: an HTML Document Parser

     <p><img style="width:95px;height:70px;" title="htmlparserlogo.jpg" border="0" alt="htmlparserlogo.jpg" src="https://simg.open-open.com/show/e0b5404848e225b79cdc40bd85420ec0.jpg" width="157" height="132" /><br /> HTML Parser is a fast, real-time parser for analyzing HTML. The latest release is 1.6; at the time of writing, the 2.0 development version had seen no progress for two years. It can be used for:</p>
    <ul>
     <li>URL rewriting, modifying some or all links on a page</li>
     <li>site capture, moving content from the web to local disk</li>
     <li>censorship, removing offending words and phrases from pages</li>
     <li>HTML cleanup, correcting erroneous pages</li>
     <li>ad removal, excising URLs referencing advertising</li>
     <li>conversion to XML, moving existing web pages to XML</li>
    </ul>
    <p>Sample code:<br /> </p>
    <pre class="brush:java; toolbar: true; auto-links: false;">import org.htmlparser.*;       // Parser, Node
import org.htmlparser.util.*;  // NodeList, ParserException

// Both the constructor and parse() throw ParserException.
Parser parser = new Parser ("http://whatever");
NodeList list = parser.parse (null);     // null filter: keep every top-level node
Node node = list.elementAt (0);
NodeList sublist = node.getChildren ();  // null when the node has no children
System.out.println (sublist == null ? 0 : sublist.size ());</pre>
    <br />
    <p><strong>Project home page:</strong> <a href="http://www.open-open.com/lib/view/home/1324372600983" target="_blank">http://www.open-open.com/lib/view/home/1324372600983</a></p>
    <p></p>
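Several of the uses listed above (URL rewriting, ad removal, site capture) start by locating the links on a page. The following is a minimal sketch of that first step, assuming HTMLParser 1.6 is on the classpath. The class `LinkExtractor`, its method `extractLinks`, and the `example.com` URL are hypothetical names chosen for illustration; only `Parser`, `NodeClassFilter`, `LinkTag`, `NodeList`, and `ParserException` come from the library.

```java
import java.util.ArrayList;
import java.util.List;

import org.htmlparser.Parser;
import org.htmlparser.filters.NodeClassFilter;
import org.htmlparser.tags.LinkTag;
import org.htmlparser.util.NodeList;
import org.htmlparser.util.ParserException;

/** Hypothetical helper (not part of HTMLParser): collects link targets from an HTML string. */
public class LinkExtractor {

    public static List<String> extractLinks(String html) throws ParserException {
        // Parse an in-memory string instead of fetching a URL.
        Parser parser = Parser.createParser(html, "UTF-8");

        // Keep only the nodes that were parsed as <a> tags, at any depth.
        NodeList anchors = parser.extractAllNodesThatMatch(new NodeClassFilter(LinkTag.class));

        List<String> links = new ArrayList<String>();
        for (int i = 0; i < anchors.size(); i++) {
            LinkTag tag = (LinkTag) anchors.elementAt(i);
            links.add(tag.getLink());
        }
        return links;
    }

    public static void main(String[] args) throws ParserException {
        // Illustrative input only.
        String html = "<html><body><a href=\"http://example.com/a\">A</a></body></html>";
        for (String link : extractLinks(html)) {
            System.out.println(link);
        }
    }
}
```

Collecting the targets into a plain `List<String>` keeps the sketch independent of what happens next: the same loop could filter out advertising hosts or queue pages for download.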

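This article was uploaded and shared by user jopen for study and discussion only. All rights belong to the original author; if your rights have been infringed, please contact the administrator.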

Article URL: https://www.open-open.com/lib/view/open1324372600983.html
