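What the example code does:
Pick one novel on a novel website and save the content of each chapter as a txt file whose file name matches the chapter title.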
import requests
from lxml import etree

# Link to one novel (its catalog page)
Anovellink = 'https://www.hongxiu.com/book/18899519001291804#Catalog'
# HTML source of the catalog page
ContentsPageCode = requests.get(Anovellink).text
# Parsed catalog page
ContentsPage = etree.HTML(ContentsPageCode)
# Relative links to each chapter
href = ContentsPage.xpath('//*[@id="j-catalogWrap"]/div[2]/div/ul/li/a/@href')
for link in href:
    # Absolute chapter URL
    linkaddress = 'https://www.hongxiu.com' + link
    # HTML source of the chapter page
    Chapterpagecode = requests.get(linkaddress).text
    # Parsed chapter page
    Chapterpage = etree.HTML(Chapterpagecode)
    # List of paragraph texts
    Literallist = Chapterpage.xpath('//div[@class="ywskythunderfont"]/p/text()')
    # Chapter title
    title = Chapterpage.xpath('//h1[@class ="j_chapterName"]/text()')[0]
    # Write one paragraph per line; the folder E:/novelpython/ must already exist.
    # The with block closes the file even if a write fails.
    with open('E:/novelpython/' + title + '.txt', 'w', encoding='utf-8') as file:
        for paragraph in Literallist:
            file.write(paragraph + '\n')
    print(title + ' Chapter crawling is complete')
print('The novel pulling is complete')
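
Because each file is named after its chapter title, a title containing characters such as ? or : would make open() fail on Windows. The following is a minimal sketch of one way to guard against that. The helper name safe_filename and the replacement character are assumptions for illustration and are not part of the original script.

```python
import re

# Characters that Windows rejects in file names, plus ASCII control characters.
_INVALID_CHARS = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_filename(title, replacement='_'):
    """Hypothetical helper: turn a chapter title into a usable file name."""
    # Replace each forbidden character, then drop surrounding whitespace
    # and any trailing dots or spaces, which Windows also disallows.
    cleaned = _INVALID_CHARS.sub(replacement, title).strip().rstrip(' .')
    # Fall back to a placeholder if nothing usable is left.
    return cleaned or 'untitled'
```

In the loop, the file path could then be built as 'E:/novelpython/' + safe_filename(title) + '.txt', leaving the printed title unchanged.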
Example result:


