解决python ThreadPoolExecutor 线程池中的异常捕获问题


Posted in Python onApril 08, 2020

问题

最近写了涉及线程池及线程的 python 脚本,运行过程中发现一个有趣的现象,线程池中的工作线程出现问题,引发了异常,但是主线程没有捕获异常,还在发现 BUG 之前一度以为线程池代码正常返回。

先说重点

这里主要想介绍 python concurrent.futuresthread.ThreadPoolExecutor 线程池中的 worker 引发异常的时候,并不会直接向上抛起异常,而是需要主线程通过调用concurrent.futures.Future.exception(timeout=None) 方法主动获取 worker 的异常。

问题重现及解决

引子

问题主要由这样一段代码引起的:

def thread_executor():
 logger.info("I am slave. I am working. I am going to sleep 3s")
 sleep(3)
 logger.info("Exit thread executor")


def main():
 thread_obj = threading.Thread(target=thread_executor)
 while True:
  logger.info("Master starts thread worker")

  try:
   # 工作线程由于某种异常而结束并退出了,想重启工作线程的工作,但又不想重复创建线程
   thread_obj.start() # 这一行会报错,同一线程不能重复启动
  except Exception as e:
   logger.error("Master start thread error", exc_info=True)
   raise e

  logger.info("Master is going to sleep 5s")
  sleep(5)

上面这段代码的功能如注释中解释的,主要要实现类似生产者消费者的功能,工作线程一直去生产资源,主线程去消费工作线程生产的资源。但是工作线程由于异常推出了,想重新启动生产工作。显然,这个代码会报错。

运行结果:

thread: MainThread [INFO] Master starts thread worker
thread: Thread-1 [INFO] I am slave. I am working. I am going to sleep 3s
thread: MainThread [INFO] Master is going to sleep 5s
thread: Thread-1 [INFO] Exit thread executor because of some exception
thread: MainThread [INFO] Master starts thread worker
thread: MainThread [ERROR] Master start thread error
Traceback (most recent call last):
File "xxx.py", line 47, in main
 thread_obj.start()
File "E:\anaconda\lib\threading.py", line 843, in start
 raise RuntimeError("threads can only be started once")
RuntimeError: threads can only be started once
Traceback (most recent call last):
File "xxx.py", line 56, in <module>
 main()
File "xxx.py", line 50, in main
 raise e
File "xxx.py", line 47, in main
 thread_obj.start()
File "E:\anaconda\lib\threading.py", line 843, in start
 raise RuntimeError("threads can only be started once")
RuntimeError: threads can only be started once

切入正题

然而脚本还有其他业务代码要运行,所以需要把上面的资源生产和消费的代码放到一个线程里完成,所以引入线程池来执行这段代码:

def thread_executor():
 while True:
  logger.info("I am slave. I am working. I am going to sleep 3s")
  sleep(3)
  logger.info("Exit thread executor because of some exception")
  break


def main():
 thread_obj = threading.Thread(target=thread_executor)
 while True:
  logger.info("Master starts thread worker")

  # 工作线程由于某种异常而结束并退出了,想重启工作线程的工作,但又不想重复创建线程
  # 没有想到这里会有异常
  thread_obj.start() # 这一行会报错,同一线程不能重复启动

  logger.info("Master is going to sleep 5s")
  sleep(5)


def thread_pool_main():
 thread_obj = ThreadPoolExecutor(max_workers=1, thread_name_prefix="WorkExecutor")
 logger.info("Master ThreadPool Executor starts thread worker")
 thread_obj.submit(main)

 while True:
  logger.info("Master ThreadPool Executor is going to sleep 5s")
  sleep(5)

if __name__ == '__main__':
 thread_pool_main()

代码运行结果如下:

INFO [thread: MainThread] Master ThreadPool Executor starts thread worker
INFO [thread: WorkExecutor_0] Master starts thread worker
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: Thread-1] I am slave. I am working. I am going to sleep 3s
INFO [thread: WorkExecutor_0] Master is going to sleep 5s
INFO [thread: Thread-1] Exit thread executor because of some exception
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: WorkExecutor_0] Master starts thread worker
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s

... ...

显然,由上面的结果,在线程池 worker 执行到 INFO [thread: WorkExecutor_0] Master starts thread worker 的时候,是会有异常产生的,但是整个代码并没有抛弃任何异常。

解决方法

发现上面的 bug 后,想在线程池 worker 出错的时候,把异常记录到日志。查阅资料,要获取线程池的异常信息,需要调用 concurrent.futures.Future.exception(timeout=None) 方法,为了记录日志,这里加了线程池执行结束的回调函数。同时,日志中记录异常信息,用了 logging.exception() 方法。

def thread_executor():
 while True:
  logger.info("I am slave. I am working. I am going to sleep 3s")
  sleep(3)
  logger.info("Exit thread executor because of some exception")
  break


def main():
 thread_obj = threading.Thread(target=thread_executor)
 while True:
  logger.info("Master starts thread worker")

  # 工作线程由于某种异常而结束并退出了,想重启工作线程的工作,但又不想重复创建线程
  # 没有想到这里会有异常
  thread_obj.start() # 这一行会报错,同一线程不能重复启动

  logger.info("Master is going to sleep 5s")
  sleep(5)


def thread_pool_callback(worker):
 logger.info("called thread pool executor callback function")
 worker_exception = worker.exception()
 if worker_exception:
  logger.exception("Worker return exception: {}".format(worker_exception))


def thread_pool_main():
 thread_obj = ThreadPoolExecutor(max_workers=1, thread_name_prefix="WorkExecutor")
 logger.info("Master ThreadPool Executor starts thread worker")
 thread_pool_exc = thread_obj.submit(main)
 thread_pool_exc.add_done_callback(thread_pool_callback)
 # logger.info("thread pool exception: {}".format(thread_pool_exc.exception()))

 while True:
  logger.info("Master ThreadPool Executor is going to sleep 5s")
  sleep(5)


if __name__ == '__main__':
 thread_pool_main()

代码运行结果:

INFO [thread: MainThread] Master ThreadPool Executor starts thread worker
INFO [thread: WorkExecutor_0] Master starts thread worker
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: Thread-1] I am slave. I am working. I am going to sleep 3s
INFO [thread: WorkExecutor_0] Master is going to sleep 5s
INFO [thread: Thread-1] Exit thread executor because of some exception
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: WorkExecutor_0] Master starts thread worker
INFO [thread: WorkExecutor_0] called thread pool executor callback function
ERROR [thread: WorkExecutor_0] Worker return exception: threads can only be started once
Traceback (most recent call last):
File "E:\anaconda\lib\concurrent\futures\thread.py", line 57, in run
 result = self.fn(*self.args, **self.kwargs)
File "xxxx.py", line 46, in main
 thread_obj.start() # 这一行会报错,同一线程不能重复启动
File "E:\anaconda\lib\threading.py", line 843, in start
 raise RuntimeError("threads can only be started once")
RuntimeError: threads can only be started once
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
INFO [thread: MainThread] Master ThreadPool Executor is going to sleep 5s
... ...

最终的写法

其实,上面写法中,想重复利用一个线程去实现生产者线程的实现方法是有问题的,在此处,一般情况下,线程执行结束后,线程资源会被会被操作系统,所以线程不能被重复调用 start() 。

解决python ThreadPoolExecutor 线程池中的异常捕获问题

解决python ThreadPoolExecutor 线程池中的异常捕获问题

一种可行的实现方式就是,用线程池替代。当然,这样做得注意上面提到的线程池执行体的异常捕获问题。

def thread_executor():
 while True:
  logger.info("I am slave. I am working. I am going to sleep 3s")
  sleep(3)
  logger.info("Exit thread executor because of some exception")
  break

def executor_callback(worker):
 logger.info("called worker callback function")
 worker_exception = worker.exception()
 if worker_exception:
  logger.exception("Worker return exception: {}".format(worker_exception))
  # raise worker_exception


def main():
 slave_thread_pool = ThreadPoolExecutor(max_workers=1, thread_name_prefix="SlaveExecutor")
 restart_flag = False
 while True:
  logger.info("Master starts thread worker")

  if not restart_flag:
   restart_flag = not restart_flag
   logger.info("Restart Slave work")
  slave_thread_pool.submit(thread_executor).add_done_callback(executor_callback)

  logger.info("Master is going to sleep 5s")
  sleep(5)

总结

这个问题主要还是因为对 Python 的 concurrent.futuresthread.ThreadPoolExecutor 不够了解导致的,接触这个包是在书本上,但是书本没完全介绍包的全部 API 及用法,所以代码产生异常情况后,DEBUG 了许久在真正找到问题所在。查阅 python docs 后才对其完整用法有所认识,所以,以后学习新的 python 包的时候还是可以查一查官方文档的。

参考资料

英文版: docs of python concurrent.futures

中文版: python docs concurrent.futures — 启动并行任务

exception(timeout=None)

返回由调用引发的异常。如果调用还没完成那么这个方法将等待 timeout 秒。如果在 timeout 秒内没有执行完成,concurrent.futures.TimeoutError 将会被触发。timeout 可以是整数或浮点数。如果 timeout 没有指定或为 None,那么等待时间就没有限制。

如果 futrue 在完成前被取消则 CancelledError 将被触发。

如果调用正常完成那么返回 None。

add_done_callback(fn)

附加可调用 fn 到期程。当期程被取消或完成运行时,将会调用 fn,而这个期程将作为它唯一的参数。

加入的可调用对象总被属于添加它们的进程中的线程按加入的顺序调用。如果可调用对象引发一个 Exception 子类,它会被记录下来并被忽略掉。如果可调用对象引发一个 BaseException 子类,这个行为没有定义。

如果期程已经完成或已取消,fn 会被立即调用。

以上这篇解决python ThreadPoolExecutor 线程池中的异常捕获问题就是小编分享给大家的全部内容了,希望能给大家一个参考,也希望大家多多支持三水点靠木。

Python 相关文章推荐
python封装对象实现时间效果
Apr 23 Python
Python获取单个程序CPU使用情况趋势图
Mar 10 Python
python处理两种分隔符的数据集方法
Dec 12 Python
Django 对象关系映射(ORM)源码详解
Aug 06 Python
Python多线程爬取豆瓣影评API接口
Oct 22 Python
python函数装饰器之带参数的函数和带参数的装饰器用法示例
Nov 06 Python
Python批量将图片灰度化的实现代码
Apr 11 Python
Python优秀开源项目Rich源码解析的流程分析
Jul 06 Python
Python3爬虫mitmproxy的安装步骤
Jul 29 Python
关于python中remove的一些坑小结
Jan 04 Python
使paramiko库执行命令时在给定的时间强制退出功能的实现
Mar 03 Python
python APScheduler执行定时任务介绍
Apr 19 Python
使用Python将Exception异常错误堆栈信息写入日志文件
Apr 08 #Python
TensorFlow2.X结合OpenCV 实现手势识别功能
Apr 08 #Python
python 安装库几种方法之cmd,anaconda,pycharm详解
Apr 08 #Python
TensorFlow2.1.0最新版本安装详细教程
Apr 08 #Python
解决python多线程报错:AttributeError: Can't pickle local object问题
Apr 08 #Python
解决Python 异常TypeError: cannot concatenate 'str' and 'int' objects
Apr 08 #Python
TensorFlow2.1.0安装过程中setuptools、wrapt等相关错误指南
Apr 08 #Python
You might like
php 清除网页病毒的方法
2008/12/05 PHP
php设计模式 Proxy (代理模式)
2011/06/26 PHP
PHP多个文件上传到服务器实例
2014/10/29 PHP
ThinkPHP结合AjaxFileUploader实现无刷新文件上传的方法
2014/10/29 PHP
PHP+Ajax 检测网络是否正常实例详解
2016/12/16 PHP
PHP使用imagick扩展实现合并图像的方法
2017/04/25 PHP
thinkPHP5框架实现基于ajax的分页功能示例
2018/06/12 PHP
七种PHP开发环境搭建工具
2020/06/28 PHP
JavaScript 反科里化 this [译]
2012/09/20 Javascript
JS字符串处理实例代码
2013/08/05 Javascript
js+css 实现遮罩居中弹出层(随浏览器窗口滚动条滚动)
2013/12/11 Javascript
js实现简单的购物车有图有代码
2014/05/26 Javascript
JavaScript如何获取数组最大值和最小值
2015/11/18 Javascript
JavaScript基于DOM操作实现简单的数学运算功能示例
2017/01/16 Javascript
Node.js中用D3.js的方法示例
2017/01/16 Javascript
JavaScript实现AOP详解(面向切面编程,装饰者模式)
2017/12/19 Javascript
Angular使用过滤器uppercase/lowercase实现字母大小写转换功能示例
2018/03/27 Javascript
详解Vue.js中.native修饰符
2018/04/24 Javascript
vue的style绑定background-image的方式和其他变量数据的区别详解
2018/09/03 Javascript
[36:16]完美世界DOTA2联赛PWL S3 access vs Rebirth 第一场 12.19
2020/12/24 DOTA
python中split方法用法分析
2015/04/17 Python
Python函数any()和all()的用法及区别介绍
2018/09/14 Python
pyqt5 实现在别的窗口弹出进度条
2019/06/18 Python
细数nn.BCELoss与nn.CrossEntropyLoss的区别
2020/02/29 Python
H5 meta小结(前端必看篇)
2016/08/24 HTML / CSS
将时尚融入珠宝:Adornmonde
2019/10/17 全球购物
日语专业毕业生自荐信
2013/11/11 职场文书
酒店公关部经理岗位职责
2013/11/24 职场文书
后勤工作职责
2013/12/22 职场文书
公司企业表扬信
2014/01/11 职场文书
培训自我鉴定
2014/01/31 职场文书
领导干部学习十八届五中全会精神心得体会
2016/01/05 职场文书
2016暑期校本培训心得体会
2016/01/08 职场文书
磁贴还没死, 微软Win11可修改注册表找回Win10开始菜单
2021/11/21 数码科技
mysql全面解析json/数组
2022/07/07 MySQL
python计算列表元素与乘积详情
2022/08/05 Python