HTML Parsing in Python: mechanize
<p>When you need fairly complex interaction with the content of a web page, such as following links or filling in and submitting forms, use the <strong>mechanize</strong> library.</p>
<p>Example code:</p>
<pre class="brush:python; toolbar: true; auto-links: false;">import re
from mechanize import Browser

br = Browser()
br.open("http://www.example.com/")
# follow second link with element text matching regular expression
response1 = br.follow_link(text_regex=r"cheese\s*shop", nr=1)
assert br.viewing_html()
print(br.title())
print(response1.geturl())
print(response1.info())  # headers
print(response1.read())  # body
response1.close()  # (shown for clarity; in fact Browser does this for you)

br.select_form(name="order")
# Browser passes through unknown attributes (including methods)
# to the selected HTMLForm (from ClientForm).
br["cheeses"] = ["mozzarella", "caerphilly"]  # (the method here is __setitem__)
response2 = br.submit()  # submit current form

# print currently selected form (don't call .submit() on this, use br.submit())
print(br.form)</pre>
<p><strong>Project homepage:</strong> <a href="http://www.open-open.com/lib/view/home/1324371010780" target="_blank">http://www.open-open.com/lib/view/home/1324371010780</a></p>
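<p>The <code>text_regex</code> and <code>nr</code> arguments passed to <code>follow_link</code> in the example can be easier to picture with a small standalone sketch. The code below uses only the standard library and is not mechanize's implementation. The names <code>LinkCollector</code>, <code>find_link</code> and <code>SAMPLE_HTML</code> are hypothetical, introduced here for illustration. It assumes the regular expression is searched against each link's text and that <code>nr</code> is a zero-based index into the matching links, which is consistent with the example's comment that <code>nr=1</code> follows the second match.</p>

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> in document order."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text)))
            self._href = None


def find_link(html, text_regex, nr=0):
    """Return the nr-th (zero-based) link whose text matches text_regex.

    Hypothetical helper: an illustration of the selection rule only.
    """
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    matches = [link for link in collector.links
               if re.search(text_regex, link[1])]
    return matches[nr]


# Made-up sample page: two of the three links match the pattern.
SAMPLE_HTML = (
    '<a href="/a">cheese shop</a> '
    '<a href="/b">spam</a> '
    '<a href="/c">cheese  shop annex</a>'
)

# nr=1 skips the first match ("/a") and selects the second one.
print(find_link(SAMPLE_HTML, r"cheese\s*shop", nr=1))
```

<p>In this sketch an out-of-range <code>nr</code> simply raises <code>IndexError</code>. How mechanize itself reports a missing link is not covered here.</p>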
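This article was uploaded and shared by user jopen for study and discussion only. All rights belong to the original author; if your rights have been infringed, please contact the site administrator.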