[jsoup] HTML parsing
Java HTML Parser
Parse a string as an XML document. The effect: whatever shape the input fragment has, the output has the same shape.
Document doc = Jsoup.parse(html, "", Parser.xmlParser());
System.out.println(doc.html());
Fragment
hello
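Document doc = Jsoup.parse(html, "", Parser.xmlParser());
Result (XML parser, fragment kept as given): hello
Document doc = Jsoup.parse(html);
Result (HTML parser, input normalized into an html/head/body document): hello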
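The difference between the two parsers is easier to see on a fragment that still has its markup. The following is a minimal sketch, not the original example: the `<item>` sample and the class and method names are assumptions made for illustration. It counts `body` elements to check whether a wrapper was added.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.parser.Parser;

class XmlVsHtmlParse {
    // XML parser: builds the tree as written, without adding html/head/body.
    static Document parseAsXml(String fragment) {
        return Jsoup.parse(fragment, "", Parser.xmlParser());
    }

    // Default HTML parser: normalizes the input into a full HTML document.
    static Document parseAsHtml(String fragment) {
        return Jsoup.parse(fragment);
    }

    public static void main(String[] args) {
        String fragment = "<item>hello</item>"; // assumed sample markup

        Document xml = parseAsXml(fragment);
        Document html = parseAsHtml(fragment);

        // Count <body> elements to see whether a wrapper was added.
        System.out.println("XML parser body count:  " + xml.select("body").size());
        System.out.println("HTML parser body count: " + html.select("body").size());

        // The element itself is reachable in both trees.
        System.out.println(xml.select("item").text());
        System.out.println(html.select("item").text());
    }
}
```

The XML parser is the one to reach for when the input is not HTML at all (for example a feed or a config snippet) and should not be wrapped.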
Parse a string into a document
String html = "First html parse Parsed HTML into a doc.";
Document doc = Jsoup.parse(html);
System.out.println(doc.html());
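Once a string has been parsed, the resulting Document can be queried instead of only printed. A small sketch under stated assumptions: the markup in the sample string is assumed (the tags are not visible in the snippet above), and the class and helper names are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class ParseStringExample {
    // Returns the text of the <title> element (empty string if there is none).
    static String titleOf(String html) {
        Document doc = Jsoup.parse(html);
        return doc.title();
    }

    // Returns the text of the first <p> element, or an empty string if absent.
    static String firstParagraphText(String html) {
        Document doc = Jsoup.parse(html);
        Element p = doc.select("p").first();
        return p == null ? "" : p.text();
    }

    public static void main(String[] args) {
        // Assumed sample markup for illustration.
        String html = "<html><head><title>First parse</title></head>"
                + "<body><p>Parsed HTML into a doc.</p></body></html>";

        System.out.println(titleOf(html));
        System.out.println(firstParagraphText(html));
    }
}
```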
Parse a string into a fragment
String html = "Lorem ipsum.";
// parseBodyFragment places the parsed input inside the body of a new document
Document doc = Jsoup.parseBodyFragment(html);
Element body = doc.body();
System.out.println(body.html());
Load a document from a URL
Document doc = Jsoup.connect("http://www.lianhu.gov.cn/").get();
String title = doc.title();
System.out.println(title);
Build a custom request
Document doc = Jsoup.connect("http://www.lianhu.gov.cn/")
    .data("query", "Java")      // request parameter
    .userAgent("Mozilla")
    .cookie("auth", "token")
    .timeout(3000)              // timeout in milliseconds
    .post();
Load a document from a file
File input = new File("D:/deya/vhost/zizhou/index.html");
// arguments: file, charset name, base URI used to resolve relative links
Document doc = Jsoup.parse(input, "UTF-8", "http://example.com/");
System.out.println(doc.html());
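The third argument of Jsoup.parse(File, String, String) is the base URI, which is what lets relative links in a local file be resolved to absolute URLs. The following is a rough sketch of that behavior, not the original example: it writes an assumed sample page to a temporary file instead of reading the path above, and the class and helper names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class ParseFileExample {
    // Writes the HTML to a temporary file, parses it with the given base URI,
    // and returns the absolute URL of the first link (empty string if none).
    static String firstLinkAbsUrl(String html, String baseUri) {
        try {
            File input = File.createTempFile("jsoup-demo", ".html");
            input.deleteOnExit();
            Files.write(input.toPath(), html.getBytes(StandardCharsets.UTF_8));

            Document doc = Jsoup.parse(input, "UTF-8", baseUri);
            Element link = doc.select("a[href]").first();
            return link == null ? "" : link.absUrl("href");
        } catch (IOException e) {
            // Rethrow unchecked so callers need no throws clause.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed sample page with one relative link.
        String html = "<a href=\"/news/index.html\">News</a>";
        System.out.println(firstLinkAbsUrl(html, "http://example.com/"));
    }
}
```

With an empty base URI, absUrl has nothing to resolve a relative link against, so passing the site's real address matters when links from the file will be followed later.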