Web crawler: scrape

scrape is a simple, high-level web scraper written in Go.

Example code:

package main

import (
    "fmt"
    "net/http"

    "github.com/yhat/scrape"
    "golang.org/x/net/html"
    "golang.org/x/net/html/atom"
)

func main() {
    // request and parse the front page
    resp, err := http.Get("https://news.ycombinator.com/")
    if err != nil {
        panic(err)
    }
    // close the response body once parsing is done
    defer resp.Body.Close()

    root, err := html.Parse(resp.Body)
    if err != nil {
        panic(err)
    }

    // define a matcher: an <a> element whose grandparent has class "athing"
    matcher := func(n *html.Node) bool {
        // must check for nil values
        if n.DataAtom == atom.A && n.Parent != nil && n.Parent.Parent != nil {
            return scrape.Attr(n.Parent.Parent, "class") == "athing"
        }
        return false
    }

    // grab all articles and print them
    articles := scrape.FindAll(root, matcher)
    for i, article := range articles {
        fmt.Printf("%2d %s (%s)\n", i, scrape.Text(article), scrape.Attr(article, "href"))
    }
}
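The example above relies on two ideas: a matcher, which is a function that decides whether a node is wanted, and a tree walk that applies the matcher to every node. The following is a minimal sketch of that pattern using only the standard library. It is not the library's implementation: Node, Matcher, findAll, buildSample, and isArticleLink are hypothetical names invented for illustration, Node is a simplified stand-in for html.Node, and the choice to stop descending once a node matches is an assumption of this sketch.

```go
package main

import "fmt"

// Node is a hypothetical, simplified stand-in for html.Node.
type Node struct {
	Tag      string
	Attrs    map[string]string
	Text     string
	Parent   *Node
	Children []*Node
}

// Matcher reports whether a node should be selected.
type Matcher func(n *Node) bool

// add attaches child to n, sets its parent pointer, and returns child.
func (n *Node) add(child *Node) *Node {
	child.Parent = n
	n.Children = append(n.Children, child)
	return child
}

// findAll walks the tree depth-first in document order and collects
// every node accepted by m. Assumption of this sketch: once a node
// matches, its subtree is not searched further.
func findAll(root *Node, m Matcher) []*Node {
	var out []*Node
	var walk func(n *Node)
	walk = func(n *Node) {
		if m(n) {
			out = append(out, n)
			return
		}
		for _, c := range n.Children {
			walk(c)
		}
	}
	walk(root)
	return out
}

// isArticleLink mirrors the matcher in the example: an "a" node whose
// grandparent carries class "athing". The nil checks guard nodes that
// sit near the root and have no parent or grandparent.
func isArticleLink(n *Node) bool {
	if n.Tag == "a" && n.Parent != nil && n.Parent.Parent != nil {
		return n.Parent.Parent.Attrs["class"] == "athing"
	}
	return false
}

// buildSample builds a tiny made-up tree: one row with class "athing"
// and one row without it, each holding a link inside a cell.
func buildSample() *Node {
	root := &Node{Tag: "table"}
	row := root.add(&Node{Tag: "tr", Attrs: map[string]string{"class": "athing"}})
	cell := row.add(&Node{Tag: "td"})
	cell.add(&Node{Tag: "a", Attrs: map[string]string{"href": "https://example.com/story"}, Text: "Story"})
	other := root.add(&Node{Tag: "tr"})
	otherCell := other.add(&Node{Tag: "td"})
	otherCell.add(&Node{Tag: "a", Attrs: map[string]string{"href": "/login"}, Text: "login"})
	return root
}

func main() {
	for i, a := range findAll(buildSample(), isArticleLink) {
		fmt.Printf("%2d %s (%s)\n", i, a.Text, a.Attrs["href"])
	}
}
```

Only the link inside the "athing" row is selected; the second link is skipped because its grandparent has no class attribute. Keeping the selection rule in a separate function means the traversal can be reused with any other rule.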

Project homepage: http://www.open-open.com/lib/view/home/1432522312785

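This article was uploaded and shared by user jopen, for study and discussion only. Ownership belongs to the original author; if your rights have been infringed, please contact the administrator.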


Article URL: https://www.open-open.com/lib/view/open1432522312785.html
