Zhong Mingyu's Assignment 1

Code


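import re
import requests
from bs4 import BeautifulSoup

# Fetch the homepage of the JUFE website
r = requests.get('http://www.jxufe.edu.cn')
html = r.text

# Save the raw HTML to a local file; the with block closes the file afterwards
with open('jxufeedu.html', 'w', encoding='utf-8') as f:
    f.write(html)

# Parse the HTML with an explicit parser and extract the text content
soup = BeautifulSoup(html, 'html.parser')
text = soup.get_text()

# Replace each newline with a space so the text prints on a single line
text = re.sub(r"\n", " ", text)
print(text)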

Result

Screenshot of the output

Explanation

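First, the BeautifulSoup, requests, and re modules are imported. The script then fetches the source of the Jiangxi University of Finance and Economics (JUFE) homepage and stores it in html, uses BeautifulSoup to extract the page's text content into text, uses re.sub to replace every newline in text with a space, and finally prints text.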
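
If the goal is to collapse runs of consecutive line breaks into a single line break, rather than to remove line breaks altogether, a variant of the re.sub step might look like the sketch below. The helper name collapse_newlines and the sample string are hypothetical and used only for illustration; the pattern also treats whitespace-only lines as part of the run, which is an assumption about the desired output.

```python
import re


def collapse_newlines(text: str) -> str:
    """Collapse each run of line breaks (plus surrounding whitespace) into one newline.

    Hypothetical helper, not part of the original script.
    """
    # \s*\n\s* matches one or more line breaks together with any spaces,
    # tabs, or blank lines around them; each such run becomes a single "\n".
    collapsed = re.sub(r"\s*\n\s*", "\n", text)
    # Drop the leading and trailing newline that may remain at the edges.
    return collapsed.strip()


if __name__ == "__main__":
    # Made-up sample imitating the blank lines that get_text() often returns.
    sample = "Home\n\n\n  News \n \nContact\n"
    print(collapse_newlines(sample))
```

In the script above, this would be called as text = collapse_newlines(soup.get_text()) in place of the existing re.sub line, so each piece of page text ends up on its own line.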