解决python ThreadPoolExecutor 线程池中的异常捕获问题


Posted in Python onApril 08, 2020

问题

最近写了涉及线程池及线程的 python 脚本,运行过程中发现一个有趣的现象,线程池中的工作线程出现问题,引发了异常,但是主线程没有捕获异常,还在发现 BUG 之前一度以为线程池代码正常返回。

先说重点

这里主要想介绍 python concurrent.futuresthread.ThreadPoolExecutor 线程池中的 worker 引发异常的时候,并不会直接向上抛起异常,而是需要主线程通过调用concurrent.futures.Future.exception(timeout=None) 方法主动获取 worker 的异常。

问题重现及解决

引子

问题主要由这样一段代码引起的:

def thread_executor():
 logger.info("I am slave. I am working. I am going to sleep 3s")
 sleep(3)
 logger.info("Exit thread executor")


def main():
 thread_obj = threading.Thread(target=thread_executor)
 while True:
  logger.info("Master starts thread worker")

  try:
   # 工作线程由于某种异常而结束并退出了,想重启工作线程的工作,但又不想重复创建线程
   thread_obj.start() # 这一行会报错,同一线程不能重复启动
  except Exception as e:
   logger.error("Master start thread error", exc_info=True)
   raise e

  logger.info("Master is going to sleep 5s")
  sleep(5)

上面这段代码的功能如注释中解释的,主要要实现类似生产者消费者的功能,工作线程一直去生产资源,主线程去消费工作线程生产的资源。但是工作线程由于异常推出了,想重新启动生产工作。显然,这个代码会报错。

运行结果:

thread: MainThread [INFO] Master starts thread worker
thread: Thread-1 [INFO] I am slave. I am working. I am going to sleep 3s
thread: MainThread [INFO] Master is going to sleep 5s
thread: Thread-1 [INFO] Exit thread executor because of some exception
thread: MainThread [INFO] Master starts thread worker
thread: MainThread [ERROR] Master start thread error
Traceback (most recent call last):
File "xxx.py", line 47, in main
 thread_obj.start()
File "E:\anaconda\lib\threading.py", line 843, in start
 raise RuntimeError("threads can only be started once")
RuntimeError: threads can only be started once
Traceback (most recent call last):
File "xxx.py", line 56, in <module>
 main()
File "xxx.py", line 50, in main
 raise e
File "xxx.py", line 47, in main
 thread_obj.start()
File "E:\anaconda\lib\threading.py", line 843, in start
 raise RuntimeError("threads can only be started once")
RuntimeError: threads can only be started once

切入正题

然而脚本还有其他业务代码要运行,所以需要把上面的资源生产和消费的代码放到一个线程里完成,所以引入线程池来执行这段代码:

def thread_executor():
 while True:
  logger.info("I am slave. I am working. I am going to sleep 3s")
  sleep(3)
  logger.info("Exit thread executor because of some exception")
  break


def main():
 thread_obj = threading.Thread(target=thread_executor)
 while True:
  logger.info("Master starts thread worker")

  # 工作线程由于某种异常而结束并退出了,想重启工作线程的工作,但又不想重复创建线程
  # 没有想到这里会有异常
  thread_obj.start() # 这一行会报错,同一线程不能重复启动

  logger.info("Master is going to sleep 5s")
  sleep(5)


def thread_pool_main():
 thread_obj = ThreadPoolExecutor(max_workers=1, thread_name_prefix="WorkExecutor")
 logger.info("Master ThreadPool Executor starts thread worker")
 thread_obj.submit(main)

 while True:
  logger.info("Master ThreadPool Executor is going to sleep 5s")
  sleep(5)

if __name__ == '__main__':
 thread_pool_main()

代码运行结果如下:

INFO [thread: MainThread] Master ThreadPool Executor starts thread worker
INFO [thread: WorkExecutor_0] Master starts thread worker
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: Thread-1] I am slave. I am working. I am going to sleep 3s
INFO [thread: WorkExecutor_0] Master is going to sleep 5s
INFO [thread: Thread-1] Exit thread executor because of some exception
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: WorkExecutor_0] Master starts thread worker
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s

... ...

显然,由上面的结果,在线程池 worker 执行到 INFO [thread: WorkExecutor_0] Master starts thread worker 的时候,是会有异常产生的,但是整个代码并没有抛弃任何异常。

解决方法

发现上面的 bug 后,想在线程池 worker 出错的时候,把异常记录到日志。查阅资料,要获取线程池的异常信息,需要调用 concurrent.futures.Future.exception(timeout=None) 方法,为了记录日志,这里加了线程池执行结束的回调函数。同时,日志中记录异常信息,用了 logging.exception() 方法。

def thread_executor():
 while True:
  logger.info("I am slave. I am working. I am going to sleep 3s")
  sleep(3)
  logger.info("Exit thread executor because of some exception")
  break


def main():
 thread_obj = threading.Thread(target=thread_executor)
 while True:
  logger.info("Master starts thread worker")

  # 工作线程由于某种异常而结束并退出了,想重启工作线程的工作,但又不想重复创建线程
  # 没有想到这里会有异常
  thread_obj.start() # 这一行会报错,同一线程不能重复启动

  logger.info("Master is going to sleep 5s")
  sleep(5)


def thread_pool_callback(worker):
 logger.info("called thread pool executor callback function")
 worker_exception = worker.exception()
 if worker_exception:
  logger.exception("Worker return exception: {}".format(worker_exception))


def thread_pool_main():
 thread_obj = ThreadPoolExecutor(max_workers=1, thread_name_prefix="WorkExecutor")
 logger.info("Master ThreadPool Executor starts thread worker")
 thread_pool_exc = thread_obj.submit(main)
 thread_pool_exc.add_done_callback(thread_pool_callback)
 # logger.info("thread pool exception: {}".format(thread_pool_exc.exception()))

 while True:
  logger.info("Master ThreadPool Executor is going to sleep 5s")
  sleep(5)


if __name__ == '__main__':
 thread_pool_main()

代码运行结果:

INFO [thread: MainThread] Master ThreadPool Executor starts thread worker
INFO [thread: WorkExecutor_0] Master starts thread worker
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: Thread-1] I am slave. I am working. I am going to sleep 3s
INFO [thread: WorkExecutor_0] Master is going to sleep 5s
INFO [thread: Thread-1] Exit thread executor because of some exception
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: WorkExecutor_0] Master starts thread worker
INFO [thread: WorkExecutor_0] called thread pool executor callback function
ERROR [thread: WorkExecutor_0] Worker return exception: threads can only be started once
Traceback (most recent call last):
File "E:\anaconda\lib\concurrent\futures\thread.py", line 57, in run
 result = self.fn(*self.args, **self.kwargs)
File "xxxx.py", line 46, in main
 thread_obj.start() # 这一行会报错,同一线程不能重复启动
File "E:\anaconda\lib\threading.py", line 843, in start
 raise RuntimeError("threads can only be started once")
RuntimeError: threads can only be started once
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
... ...

最终的写法

其实,上面写法中,想重复利用一个线程去实现生产者线程的实现方法是有问题的,在此处,一般情况下,线程执行结束后,线程资源会被会被操作系统,所以线程不能被重复调用 start() 。

解决python ThreadPoolExecutor 线程池中的异常捕获问题

解决python ThreadPoolExecutor 线程池中的异常捕获问题

一种可行的实现方式就是,用线程池替代。当然,这样做得注意上面提到的线程池执行体的异常捕获问题。

def thread_executor():
 while True:
  logger.info("I am slave. I am working. I am going to sleep 3s")
  sleep(3)
  logger.info("Exit thread executor because of some exception")
  break

def executor_callback(worker):
 logger.info("called worker callback function")
 worker_exception = worker.exception()
 if worker_exception:
  logger.exception("Worker return exception: {}".format(worker_exception))
  # raise worker_exception


def main():
 slave_thread_pool = ThreadPoolExecutor(max_workers=1, thread_name_prefix="SlaveExecutor")
 restart_flag = False
 while True:
  logger.info("Master starts thread worker")

  if not restart_flag:
   restart_flag = not restart_flag
   logger.info("Restart Slave work")
  slave_thread_pool.submit(thread_executor).add_done_callback(executor_callback)

  logger.info("Master is going to sleep 5s")
  sleep(5)

总结

这个问题主要还是因为对 Python 的 concurrent.futuresthread.ThreadPoolExecutor 不够了解导致的,接触这个包是在书本上,但是书本没完全介绍包的全部 API 及用法,所以代码产生异常情况后,DEBUG 了许久在真正找到问题所在。查阅 python docs 后才对其完整用法有所认识,所以,以后学习新的 python 包的时候还是可以查一查官方文档的。

参考资料

英文版: docs of python concurrent.futures

中文版: python docs concurrent.futures — 启动并行任务

exception(timeout=None)

返回由调用引发的异常。如果调用还没完成那么这个方法将等待 timeout 秒。如果在 timeout 秒内没有执行完成,concurrent.futures.TimeoutError 将会被触发。timeout 可以是整数或浮点数。如果 timeout 没有指定或为 None,那么等待时间就没有限制。

如果 futrue 在完成前被取消则 CancelledError 将被触发。

如果调用正常完成那么返回 None。

add_done_callback(fn)

附加可调用 fn 到期程。当期程被取消或完成运行时,将会调用 fn,而这个期程将作为它唯一的参数。

加入的可调用对象总被属于添加它们的进程中的线程按加入的顺序调用。如果可调用对象引发一个 Exception 子类,它会被记录下来并被忽略掉。如果可调用对象引发一个 BaseException 子类,这个行为没有定义。

如果期程已经完成或已取消,fn 会被立即调用。

以上这篇解决python ThreadPoolExecutor 线程池中的异常捕获问题就是小编分享给大家的全部内容了,希望能给大家一个参考,也希望大家多多支持三水点靠木。

Python 相关文章推荐
使用Python的Scrapy框架编写web爬虫的简单示例
Apr 17 Python
python同时给两个收件人发送邮件的方法
Apr 30 Python
python爬取NUS-WIDE数据库图片
Oct 05 Python
利用python模拟sql语句对员工表格进行增删改查
Jul 05 Python
Python3.5 + sklearn利用SVM自动识别字母验证码方法示例
May 10 Python
python django框架中使用FastDFS分布式文件系统的安装方法
Jun 10 Python
python爬虫爬取笔趣网小说网站过程图解
Nov 18 Python
python使用pandas抽样训练数据中某个类别实例
Feb 28 Python
浅谈pymysql查询语句中带有in时传递参数的问题
Jun 05 Python
Python抓包并解析json爬虫的完整实例代码
Nov 03 Python
pycharm无法导入lxml的解决办法
Mar 31 Python
python3 实现mysql数据库连接池的示例代码
Apr 17 Python
使用Python将Exception异常错误堆栈信息写入日志文件
Apr 08 #Python
TensorFlow2.X结合OpenCV 实现手势识别功能
Apr 08 #Python
python 安装库几种方法之cmd,anaconda,pycharm详解
Apr 08 #Python
TensorFlow2.1.0最新版本安装详细教程
Apr 08 #Python
解决python多线程报错:AttributeError: Can't pickle local object问题
Apr 08 #Python
解决Python 异常TypeError: cannot concatenate 'str' and 'int' objects
Apr 08 #Python
TensorFlow2.1.0安装过程中setuptools、wrapt等相关错误指南
Apr 08 #Python
You might like
PHP fopen 读取带中文URL地址的一点见解
2012/09/25 PHP
smarty 缓存控制前的页面静态化原理
2013/03/15 PHP
php实现计数器方法小结
2015/01/05 PHP
PHP的命令行命令使用指南
2015/08/18 PHP
PHP多维数组元素操作类的方法
2016/11/14 PHP
php如何获取Http请求
2020/04/30 PHP
非常漂亮的JS代码经典广告
2007/10/21 Javascript
Web跨浏览器进程通信(Web跨域)
2013/04/17 Javascript
jQuery弹出(alert)select选择的值
2013/04/21 Javascript
JS 退出系统并跳转到登录界面的实现代码
2013/06/29 Javascript
jQuery 常用代码集锦(必看篇)
2016/05/16 Javascript
Google Maps基础及实例解析
2016/08/06 Javascript
jquery 校验中国身份证号码实例详解
2017/04/11 jQuery
Vue.js学习笔记之常用模板语法详解
2017/07/25 Javascript
如何让你的JS代码更好看易读
2017/12/01 Javascript
jQuery提示框插件SweetAlert用法分析
2019/08/05 jQuery
Vue实现购物车详情页面的方法
2019/08/20 Javascript
layui自定义工具栏的方法
2019/09/19 Javascript
[01:05:32]DOTA2上海特级锦标赛主赛事日 - 3 败者组第三轮#1COL VS Alliance第一局
2016/03/04 DOTA
[50:02]完美世界DOTA2联赛循环赛 Magma vs IO BO2第一场 11.01
2020/11/02 DOTA
Python命名空间详解
2014/08/18 Python
python使用正则表达式替换匹配成功的组并输出替换的次数
2017/11/22 Python
python中subprocess批量执行linux命令
2018/04/27 Python
Python 操作mysql数据库查询之fetchone(), fetchmany(), fetchall()用法示例
2019/10/17 Python
Python 实现训练集、测试集随机划分
2020/01/08 Python
解决ROC曲线画出来只有一个点的问题
2020/02/28 Python
使用python处理题库表格并转化为word形式的实现
2020/04/14 Python
python中threading开启关闭线程操作
2020/05/02 Python
Python基于pip实现离线打包过程详解
2020/05/15 Python
matplotlib 多个图像共用一个colorbar的实现示例
2020/09/10 Python
Python监听键盘和鼠标事件的示例代码
2020/11/18 Python
美国最大的骑马用品零售商:HorseLoverZ
2017/01/12 全球购物
Timberland俄罗斯官方网上商店:全球领先的户外品牌
2020/03/15 全球购物
什么是TCP/IP
2014/07/27 面试题
教你用python实现12306余票查询
2021/06/30 Python
Golang日志包的使用
2022/04/20 Golang