
Automatically Pushing Grade-Website Notices with a Python Crawler

Preface

How it works

Libraries

import time
import datetime
import urllib.request
import requests
import json
from bs4 import BeautifulSoup
import smtplib
from email import header
from email.mime import text, multipart

Installation

pip install --upgrade pip
pip install beautifulsoup4
pip install requests

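Functions

1. Send a browser-like request to the site and download the page

def getTitle(url):
    # Send a browser-like User-Agent so the site serves the normal page
    headers = ("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36")
    opener = urllib.request.build_opener()
    opener.addheaders = [headers]
    urllib.request.install_opener(opener)
    # Download the page and decode it, skipping bytes that are not valid UTF-8
    html = urllib.request.urlopen(url).read().decode("utf-8", "ignore")
    bs = BeautifulSoup(html, "html.parser")
    # Placeholder: replace with the CSS selector of the tags you want
    Title_links = bs.select("specific-tag")
    return Title_links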

2. Get the current date

def getNowDate():
    now_time = datetime.datetime.now()
    # Note: despite the name, this shifts the date back by three days
    yes_time = now_time + datetime.timedelta(days=-3)
    current_time = yes_time.strftime("%Y-%m-%d")
    return current_time
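
The helper above subtracts a fixed three days, so the date it returns is not today's. A minimal sketch that makes the offset an explicit argument, assuming the site prints dates as YYYY-MM-DD; `get_date_string` and its parameters are hypothetical names, not part of the script:

```python
import datetime

def get_date_string(days_ago=0, today=None):
    """Return the date `days_ago` days before `today` as YYYY-MM-DD."""
    # `today` can be passed in so the function is checkable with a fixed date.
    if today is None:
        today = datetime.date.today()
    target = today - datetime.timedelta(days=days_ago)
    return target.strftime("%Y-%m-%d")

print(get_date_string(3, datetime.date(2021, 4, 10)))  # 2021-04-07
```

Calling `get_date_string()` with no arguments gives today's date, which matches what the step title describes.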

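3. Combine the filtered data

# Collect the title text and relative link of each notice
for link in linklist_Title:
    contents.append(link.text.strip())
    links.append(link.get("href"))

# Collect the publication date of each notice
for date in linklist_Date:
    dates.append(date.text.strip())

# Keep only the articles published on the target date
for date, text, link in zip(dates, contents, links):
    data = date + " " + text + ":http://xxx.xxx.com" + link
    if date == Now_Date:
        # The separator was lost in the source; a newline is assumed here
        send_data = send_data + data + "\n"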
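
The loops above rely on names defined elsewhere in the script (linklist_Title, linklist_Date, Now_Date, send_data). As a sketch, the same filtering step can be written as a standalone function over plain lists; `build_digest`, the `base_url` parameter and the newline separator are assumptions for illustration:

```python
def build_digest(dates, titles, links, target_date, base_url="http://xxx.xxx.com"):
    """Join the entries published on `target_date` into one text block."""
    lines = []
    for date, title, link in zip(dates, titles, links):
        # Skip notices that were not published on the requested date.
        if date == target_date:
            lines.append(f"{date} {title}:{base_url}{link}")
    return "\n".join(lines)

sample = build_digest(
    ["2021-04-07", "2021-04-06"],
    ["Exam schedule", "Holiday notice"],
    ["/notice/1.html", "/notice/2.html"],
    "2021-04-07",
)
print(sample)
```

Returning an empty string when nothing matches makes it easy to skip the push on days without new notices.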

4. Send the email to multiple recipients
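
No code is shown for this step here. A minimal sketch using the smtplib and email modules imported earlier, assuming an SMTP server that accepts SSL connections on port 465; the host, addresses, password and both function names are placeholders, not values from the script:

```python
import smtplib
from email.header import Header
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def build_message(sender, receivers, subject, body):
    """Build a plain-text, UTF-8 notification email."""
    msg = MIMEMultipart()
    msg["From"] = sender
    msg["To"] = ", ".join(receivers)
    msg["Subject"] = Header(subject, "utf-8")
    msg.attach(MIMEText(body, "plain", "utf-8"))
    return msg

def send_group_mail(host, user, password, receivers, subject, body, port=465):
    """Send one message to every address in `receivers` (assumes SMTP over SSL)."""
    msg = build_message(user, receivers, subject, body)
    with smtplib.SMTP_SSL(host, port) as server:
        server.login(user, password)
        server.sendmail(user, receivers, msg.as_string())

# Building the message needs no network access, so it can be checked offline.
msg = build_message(
    "sender@example.com",
    ["a@example.com", "b@example.com"],
    "Grade site update",
    "2021-04-07 Exam schedule",
)
print(msg["To"])
```

Only `send_group_mail` touches the network; many mail providers require an app-specific authorization code in place of the account password.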

5. Push the articles to pushplus (push+) as JSON

token = "4bxxxxxxxxxxxxxxxxxxxxxxx5"
title = "Today's grade-site update notice"
content = send_data
url = "http://pushplus.hxtrip.com/send"
data = {
    "token": token,
    "title": title,
    "content": content,
}
# Serialize the payload to JSON and send it as UTF-8 bytes
body = json.dumps(data).encode(encoding="UTF-8")
headers = {"Content-Type": "application/json"}
requests.post(url, data=body, headers=headers)

Source code

GitHub repository:

Scheduled task on the cloud server

Shell command to run on a schedule

/usr/bin/python /www/server/panel/class/Notice_Spider.py

Blog:

Result
