Want a daily digest email of articles from Solidot, IT之家, 抽屉·挨踢1024, and Linux中国?

March 19, 2020

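Today I'm sharing four Python scripts. Each one fetches the RSS feed of one of four sites (Solidot, IT之家, Linux中国, and 抽屉·挨踢1024) on a schedule and emails a digest of that day's articles.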

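Notes:

  1. The scripts are deployed as Tencent Cloud Functions (腾讯云函数). For the deployment steps, see the deployment article on this site.
  2. When you set the timer trigger for each cloud function: for Solidot, around 22:00 is a good choice, because it usually stops updating a little after 21:00. For IT之家, use 23:59, because it publishes almost around the clock; that way, even if you go to bed early, the email is waiting for you the next morning. Linux中国 can also be set to around 22:00, since it usually publishes three to five articles around 3 or 4 pm. The same goes for 抽屉·挨踢1024, which I usually read in the evening.
  3. For the Linux中国 script below, when you run pip install as described in the deployment article, also install the user-agent package (imported in the code as user_agent).
  4. When deploying the 抽屉·挨踢1024 code as a Tencent Cloud Function, be sure to pick an overseas region such as Silicon Valley. The script uses an RSSHub feed URL, and rsshub.app now seems to be reachable from mainland China only through a proxy.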
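One thing worth checking in connection with notes 2 and 4: the scripts below decide what counts as "today" with time.localtime(), so the result depends on the timezone of the machine that runs the function, and I have not verified what timezone a cloud function in a given region uses. If it is not Beijing time, a 23:59 trigger can end up comparing against the wrong date. The following is only a sketch of one way to pin both sides of the comparison to UTC+8. The names BEIJING, today_in_beijing, and pub_date_in_beijing are hypothetical helpers, not part of the original scripts, and treating a pubDate without a timezone as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Assumption: the reader wants "today" to mean the calendar day in UTC+8.
BEIJING = timezone(timedelta(hours=8))


def today_in_beijing(now=None):
    """Return today's date as YYYY-MM-DD in UTC+8, whatever the host timezone is."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(BEIJING).strftime('%Y-%m-%d')


def pub_date_in_beijing(pub_date):
    """Convert an RFC 822 style pubDate string to a YYYY-MM-DD date in UTC+8."""
    dt = parsedate_to_datetime(pub_date)
    if dt.tzinfo is None:
        # Assumption: a pubDate without timezone information is treated as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(BEIJING).strftime('%Y-%m-%d')
```

With helpers like these, the comparison in get_mail_body() would become pub_date_in_beijing(pub_date) == today_in_beijing().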

solidot

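The hands-on Tencent Cloud Functions article on this site actually used Solidot as its example, but the code there has no check for whether an article was published today. Because the change is fairly large, I'm pasting the new code in full below; otherwise I would simply have told you to copy the code from that article.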

#!/usr/bin/env python3
# coding=utf-8

import re
import time
import smtplib
import requests
import datetime
from bs4 import BeautifulSoup
from email.mime.text import MIMEText

HOST = 'smtp.126.com'
PORT = 25
SENDER = '@126.com'
RECEIVER = '@qq.com'
PWD = ''

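# Today's date as YYYY-MM-DD, in the local timezone of the machine running the script.
current_time = time.strftime("%Y-%m-%d", time.localtime())


def english_time_to_num(time_str):
    # Pull the 'day month year' part (e.g. '19 Mar 2020') out of the pubDate
    # string and reformat it as YYYY-MM-DD so it can be compared with current_time.
    result = re.search(r'(\d+ \w+ \d+)', time_str).group(1)
    time_format = datetime.datetime.strptime(result, '%d %b %Y')
    time_format = time_format.strftime('%Y-%m-%d')
    return time_format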


def mail_send(subject, mail_body):
    # Send a plain-text UTF-8 email from SENDER to RECEIVER.
    try:
        msg = MIMEText(mail_body, 'plain', 'utf-8')
        msg['Subject'] = subject
        msg['From'] = SENDER
        msg['To'] = RECEIVER
        s = smtplib.SMTP(HOST, PORT)
        s.debuglevel = 0
        s.login(SENDER, PWD)
        s.sendmail(SENDER, RECEIVER, msg.as_string())
        s.quit()
    except smtplib.SMTPException as e:
        print(str(e))
        # Same effect as exit(1), without relying on the interactive exit() helper.
        raise SystemExit(1)


def get_soup():
    # Download the feed and parse it as XML.
    url = 'https://www.solidot.org/index.rss'
    rss_xml = requests.get(url).text
    soup = BeautifulSoup(rss_xml, 'xml')
    return soup


def get_mail_body():
    # Build the mail body: one 'title + link' pair per article published today.
    contents = get_soup().select('item')
    contents_list = []
    for c in contents:
        pub_date = c.select_one('pubDate').get_text()
        pub_date_to_num = english_time_to_num(pub_date)
        # Skip anything that was not published today.
        if pub_date_to_num == current_time:
            title = c.select_one('title').get_text()
            link = c.select_one('link').get_text()
            contents_list.append(title + '\n' + link)
    return '\n'.join(contents_list)


def main(arg1, arg2):
    # Entry point called by the cloud function; the two arguments it passes in are not used.
    mail_send(subject=current_time + ' Solidot今日文章',
              mail_body=get_mail_body())
    print('成功发送了一封邮件!')
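Two details of mail_send() may be worth adapting. It connects on port 25 without encryption, and it sends a mail even when nothing was published today, in which case the body is empty. The following is only a sketch of a variant, not the code I deployed. send_digest and build_message are hypothetical names, and port 465 with SMTP over SSL is an assumption that you should check against your mail provider's documentation.

```python
import smtplib
from email.mime.text import MIMEText


def build_message(subject, body, sender, receiver):
    """Build the same plain-text UTF-8 message as mail_send() above."""
    msg = MIMEText(body, 'plain', 'utf-8')
    msg['Subject'] = subject
    msg['From'] = sender
    msg['To'] = receiver
    return msg


def send_digest(subject, body, host, sender, receiver, password, port=465):
    """Send the digest over SSL; return False without connecting if body is empty."""
    if not body.strip():
        # Nothing was published today, so there is nothing worth mailing.
        return False
    msg = build_message(subject, body, sender, receiver)
    # Assumption: the provider accepts SMTP over SSL on this port.
    with smtplib.SMTP_SSL(host, port, timeout=30) as s:
        s.login(sender, password)
        s.sendmail(sender, [receiver], msg.as_string())
    return True
```

In main(), the call would then look like send_digest(current_time + ' Solidot今日文章', get_mail_body(), HOST, SENDER, RECEIVER, PWD).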

 

IT之家

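Just swap the url and the email subject string in the Solidot code above. Both sites publish the same kind of RSS feed (the code only relies on the item, title, link, and pubDate elements), so the code can be reused unchanged.

Specifically:

Replace the url below with https://www.ithome.com/rss/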

def get_soup():
    url = 'https://www.solidot.org/index.rss'

Replace the email subject below with ' IT之家今日文章'

def main(arg1, arg2):
    mail_send(subject=current_time + ' Solidot今日文章',
              mail_body=get_mail_body())
    print('成功发送了一封邮件!')

抽屉·挨踢1024

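Same as for IT之家 above: swap the url and the email subject string in the Solidot code. The RSSHub feed has the same RSS structure (item, title, link, and pubDate elements), so the code can be reused.

Specifically:

Replace the url below with https://rsshub.app/chouti/tec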

def get_soup():
    url = 'https://www.solidot.org/index.rss'

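Replace the email subject below with ' 抽屉挨踢1024今日文章' (keep the leading space, so the date and the title do not run together)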

def main(arg1, arg2):
    mail_send(subject=current_time + ' Solidot今日文章',
              mail_body=get_mail_body())
    print('成功发送了一封邮件!')
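Since the Solidot, IT之家, and 抽屉·挨踢1024 scripts differ only in the feed URL and the subject line, those two values could also live in a small table instead of three copies of the script. The following is a sketch of that idea and not the code I actually deployed. FEEDS, todays_items, and subject_for are hypothetical names, and the sketch parses the feed with the standard library's xml.etree.ElementTree and email.utils rather than BeautifulSoup, purely to keep the example self-contained.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Display name -> feed URL (the URLs are the ones used in this article).
FEEDS = {
    'Solidot': 'https://www.solidot.org/index.rss',
    'IT之家': 'https://www.ithome.com/rss/',
    '抽屉挨踢1024': 'https://rsshub.app/chouti/tec',
}


def todays_items(rss_xml, today):
    """Return 'title + link' pairs for items whose pubDate falls on `today`.

    `today` is a YYYY-MM-DD string; the date is taken as written in the feed.
    """
    root = ET.fromstring(rss_xml)
    lines = []
    for item in root.iter('item'):
        pub = parsedate_to_datetime(item.findtext('pubDate'))
        if pub.date().isoformat() == today:
            lines.append(item.findtext('title') + '\n' + item.findtext('link'))
    return '\n'.join(lines)


def subject_for(name, today):
    """Build a subject line in the same style as the scripts above."""
    return today + ' ' + name + '今日文章'
```

Each cloud function would then fetch FEEDS[name], pass the response text to todays_items(), and mail the result with subject_for(name, current_time) as the subject.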

 

Linux中国

Because Linux中国 needs HTTP headers to be set, there are more places to change, so instead of describing the edits I'm pasting the complete code below.

#!/usr/bin/env python3
# coding=utf-8

import re
import time
import smtplib
import requests
import datetime
from bs4 import BeautifulSoup
from email.mime.text import MIMEText
from user_agent import generate_user_agent

HOST = 'smtp.126.com'
PORT = 25
SENDER = '@126.com'
RECEIVER = '@qq.com'
PWD = ''
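# Browser-like headers sent with the feed request.
HEADERS = {
    'accept': "text/html,application/xhtml+xml,application/xml"
              ";q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
    'accept-language': 'zh-CN,zh;q=0.9,en-US;q=0.8,en;q=0.7',
    'upgrade-insecure-requests': '1',
    # 'br' is left out on purpose: requests can only decode Brotli responses
    # when an optional Brotli package is installed.
    'accept-encoding': 'gzip, deflate',
    'user-agent': generate_user_agent(os='win')}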

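# Today's date as YYYY-MM-DD, in the local timezone of the machine running the script.
current_time = time.strftime("%Y-%m-%d", time.localtime())


def english_time_to_num(time_str):
    # Pull the 'day month year' part (e.g. '19 Mar 2020') out of the pubDate
    # string and reformat it as YYYY-MM-DD so it can be compared with current_time.
    result = re.search(r'(\d+ \w+ \d+)', time_str).group(1)
    time_format = datetime.datetime.strptime(result, '%d %b %Y')
    time_format = time_format.strftime('%Y-%m-%d')
    return time_format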


def mail_send(subject, mail_body):
    # Send a plain-text UTF-8 email from SENDER to RECEIVER.
    try:
        msg = MIMEText(mail_body, 'plain', 'utf-8')
        msg['Subject'] = subject
        msg['From'] = SENDER
        msg['To'] = RECEIVER
        s = smtplib.SMTP(HOST, PORT)
        s.debuglevel = 0
        s.login(SENDER, PWD)
        s.sendmail(SENDER, RECEIVER, msg.as_string())
        s.quit()
    except smtplib.SMTPException as e:
        print(str(e))
        # Same effect as exit(1), without relying on the interactive exit() helper.
        raise SystemExit(1)


def get_soup():
    # Download the feed with the browser-like HEADERS and parse it as XML.
    url = 'https://linux.cn/rss.xml'
    rss_xml = requests.get(url, headers=HEADERS).text
    soup = BeautifulSoup(rss_xml, 'xml')
    return soup


def get_mail_body():
    # Build the mail body: one 'title + link' pair per article published today.
    contents = get_soup().select('item')
    contents_list = []
    for c in contents:
        pub_date = c.select_one('pubDate').get_text()
        pub_date_to_num = english_time_to_num(pub_date)
        # Skip anything that was not published today.
        if pub_date_to_num == current_time:
            title = c.select_one('title').get_text()
            link = c.select_one('link').get_text()
            contents_list.append(title + '\n' + link)
    return '\n'.join(contents_list)


def main(arg1, arg2):
    # Entry point called by the cloud function; the two arguments it passes in are not used.
    mail_send(subject=current_time + ' Linux中国今日文章',
              mail_body=get_mail_body())
    print('成功发送了一封邮件!')
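If you would rather not install the extra user-agent package just for this one header, one possible alternative is to send a fixed browser-like User-Agent using only the standard library. This is a sketch under assumptions: FIXED_UA is just an example string, build_request and fetch are hypothetical names, and I have not verified that linux.cn accepts this particular header set.

```python
import urllib.request

FEED_URL = 'https://linux.cn/rss.xml'

# Assumption: any ordinary browser-like string is enough; this one is only an example.
FIXED_UA = ('Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
            '(KHTML, like Gecko) Chrome/80.0.3987.132 Safari/537.36')


def build_request(url=FEED_URL, user_agent=FIXED_UA):
    """Create a GET request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={
        'User-Agent': user_agent,
        'Accept': 'application/xml,text/xml;q=0.9,*/*;q=0.8',
    })


def fetch(url=FEED_URL, timeout=10):
    """Download the feed and return the raw response bytes."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return resp.read()
```

The bytes returned by fetch() can be passed to BeautifulSoup(..., 'xml') in place of requests.get(url, headers=HEADERS).text.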

Below is a screenshot of the Linux中国 script during debugging:

Finally, to stress it once more: for how to deploy this code, be sure to follow the deployment article on this site.

 

Sharp

"A Linux user and a Python{}".format('er')

Comments

  • cici

    It really works!

    March 20, 2020