Wu Fei's Assignment 2

Code


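  import re
  import requests
  from bs4 import BeautifulSoup
  # Fetch the raw HTML of the page (the timeout keeps the request from hanging).
  x = requests.get('http://www.jxufe.edu.cn/', timeout=10)
  txt = x.text
  # Name the parser explicitly so BeautifulSoup does not have to guess one.
  soup = BeautifulSoup(txt, 'html.parser')
  # Keep only the text content and drop the tags.
  txt_text = soup.get_text()
  # Collapse each run of consecutive newlines into a single newline.
  y = re.sub('\n+', '\n', txt_text)
  print(y)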

Result

Screenshot of the result

Explanation

requests fetches the page's HTML source, BeautifulSoup extracts the text from it, and a regular expression then removes the redundant blank lines.
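
The blank-line cleanup step can be tried without any network access. The sketch below is only an illustration: the function name `collapse_blank_lines` and the sample string are made up for this example and are not part of the assignment. It uses a slightly broader pattern than `'\n+'`, on the assumption that text extracted from a web page may also contain lines holding nothing but spaces or tabs, which `'\n+'` alone would leave in place.

```python
import re


def collapse_blank_lines(text: str) -> str:
    """Remove blank and whitespace-only lines from a block of text.

    Hypothetical helper for illustration; not part of the original script.
    """
    # '\n\s*\n' matches a newline, any whitespace (including further
    # newlines), and a closing newline, so a whole run of empty or
    # whitespace-only lines is replaced by one newline.
    collapsed = re.sub(r'\n\s*\n', '\n', text)
    # Drop leading and trailing whitespace left at the edges.
    return collapsed.strip()


# Made-up sample standing in for the output of soup.get_text().
sample = "Title\n\n\n  \nBody\n\t\n\nFooter\n"
print(collapse_blank_lines(sample))
```

In the script above, this function could be applied to `txt_text` in place of the `re.sub('\n+', '\n', txt_text)` call if whitespace-only lines turn out to be a problem on the fetched page.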