.contents and .children
A tag's .contents attribute returns the tag's child nodes as a list:
head_tag = soup.head
head_tag
# <head><title>The Dormouse's story</title></head>

head_tag.contents
# [<title>The Dormouse's story</title>]

title_tag = head_tag.contents[0]
title_tag
# <title>The Dormouse's story</title>
title_tag.contents
# [u'The Dormouse's story']
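The snippets in this section assume that a `soup` object already exists. As a minimal self-contained sketch, assuming the `bs4` package is installed, the following builds one from a shortened, hypothetical document (`html_doc` is made up for illustration) with Python's built-in `html.parser`, then reads `.contents` the same way:

```python
from bs4 import BeautifulSoup

# Hypothetical, shortened document used only for illustration.
html_doc = (
    "<html><head><title>The Dormouse's story</title></head>"
    "<body><p>Once upon a time</p></body></html>"
)

# "html.parser" ships with Python, so no extra parser is needed.
soup = BeautifulSoup(html_doc, "html.parser")

head_tag = soup.head
children = head_tag.contents       # a plain Python list of direct children
title_tag = children[0]            # the <title> tag

print(len(children))               # number of direct children of <head>
print(title_tag.name)              # name of the first child tag
print(title_tag.contents[0])       # the text node inside <title>
```

Because `.contents` is an ordinary list, indexing, slicing, and `len()` all work on it as usual.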
The BeautifulSoup object itself always has child nodes. Here, the <html> tag is a child of the BeautifulSoup object:
len(soup.contents)
# 1
soup.contents[0].name
# u'html'
A string has no .contents attribute, because a string cannot contain child nodes:
text = title_tag.contents[0]
text.contents
# AttributeError: 'NavigableString' object has no attribute 'contents'
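Since only tags have children, code that walks a mixed list of children may want to check each node's type first. A minimal sketch of one way to do that, assuming `bs4` is installed (the helper `describe_children` and the sample markup are hypothetical, for illustration only):

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Hypothetical markup mixing text nodes and a nested tag.
sample = BeautifulSoup("<p>Hello <b>world</b>!</p>", "html.parser")


def describe_children(tag):
    """Return (kind, value) pairs for the direct children of a tag."""
    result = []
    for child in tag.contents:
        if isinstance(child, Tag):
            # Tags have a name and their own .contents.
            result.append(("tag", child.name))
        elif isinstance(child, NavigableString):
            # Strings are leaf nodes, so they are recorded as text.
            result.append(("string", str(child)))
    return result


summary = describe_children(sample.p)
print(summary)
```

Checking the type up front avoids touching `.contents` on a string node at all.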
A tag's .children generator lets you loop over the tag's child nodes:
for child in title_tag.children:
    print(child)
# The Dormouse's story
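To relate the two attributes, the sketch below iterates `.children` and compares the result with `.contents`. It assumes `bs4` is installed and uses a hypothetical list fragment as input:

```python
from bs4 import BeautifulSoup

# Hypothetical fragment with two direct children under <ul>.
fragment = BeautifulSoup("<ul><li>one</li><li>two</li></ul>", "html.parser")
ul_tag = fragment.ul

# Iterating .children visits the direct children one at a time.
names = [child.name for child in ul_tag.children]

# Materializing the iterator gives the same nodes that .contents holds.
same_nodes = list(ul_tag.children) == ul_tag.contents

print(names)
print(same_nodes)
```

Use `.contents` when indexing or a length is needed, and `.children` when a single pass over the child nodes is enough.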