Code examples for writing non-blocking programs with Python's Twisted framework


Posted in Python on May 25, 2016

Let's start with a piece of code:

# ~*~ Twisted - A Python tale ~*~

from time import sleep

# Hello, I'm a developer and I mainly setup Wordpress.
def install_wordpress(customer):
  # Our hosting company Threads Ltd. is bad. I start installation and...
  print "Start installation for", customer
  # ...then wait till the installation finishes successfully. It is
  # boring and I'm spending most of my time waiting while consuming
  # resources (memory and some CPU cycles). It's because the process
  # is *blocking*.
  sleep(3)
  print "All done for", customer

# I do this all day long for our customers
def developer_day(customers):
  for customer in customers:
    install_wordpress(customer)

developer_day(["Bill", "Elon", "Steve", "Mark"])

Running it gives the following output:

$ ./deferreds.py 1
------ Running example 1 ------
Start installation for Bill
All done for Bill
Start installation
...
* Elapsed time: 12.03 seconds

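This code runs sequentially. There are four customers and each installation takes 3 seconds, so four of them take 12 seconds. That is not very satisfying, so let's look at a second example that uses threads: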

import threading

# The company grew. We now have many customers and I can't handle the
# workload. We are now 5 developers doing exactly the same thing.
def developers_day(customers):
  # But we now have to synchronize... a.k.a. bureaucracy
  lock = threading.Lock()
  #
  def dev_day(id):
    print "Goodmorning from developer", id
    # Yuck - I hate locks...
    lock.acquire()
    while customers:
      customer = customers.pop(0)
      lock.release()
      # My Python is less readable
      install_wordpress(customer)
      lock.acquire()
    lock.release()
    print "Bye from developer", id
  # We go to work in the morning
  devs = [threading.Thread(target=dev_day, args=(i,)) for i in range(5)]
  for dev in devs:
    dev.start()
  # We leave for the evening
  for dev in devs:
    dev.join()

# We now get more done in the same time but our dev process got more
# complex. As we grew we spend more time managing queues than doing dev
# work. We even had occasional deadlocks when processes got extremely
# complex. The fact is that we are still mostly pressing buttons and
# waiting but now we also spend some time in meetings.
developers_day(["Customer %d" % i for i in xrange(15)])

Running it:

$ ./deferreds.py 2
------ Running example 2 ------
Goodmorning from developer 0Goodmorning from developer
1Start installation forGoodmorning from developer 2
Goodmorning from developer 3Customer 0
...
from developerCustomer 13 3Bye from developer 2
* Elapsed time: 9.02 seconds

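This time the code runs in parallel, using 5 worker threads. 15 customers at 3 s each would mean 45 s in total, but with 5 threads working in parallel the run takes only 9 s. The code is somewhat complex: a large part of it manages concurrency instead of focusing on the algorithm or the business logic. The output is also jumbled and hard to read. Even simple multithreaded code is hard to write well, so we turn to Twisted: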

# For years we thought this was all there was... We kept hiring more
# developers, more managers and buying servers. We were trying harder
# optimising processes and fire-fighting while getting mediocre
# performance in return. Till luckily one day our hosting
# company decided to increase their fees and we decided to
# switch to Twisted Ltd.!

from twisted.internet import reactor
from twisted.internet import defer
from twisted.internet import task

# Twisted has a slightly different approach
def schedule_install(customer):
  # They are calling us back when a Wordpress installation completes.
  # They connected the caller recognition system with our CRM and
  # we know exactly what a call is about and what has to be done next.
  #
  # We now design processes of what has to happen on certain events.
  def schedule_install_wordpress():
    def on_done():
      print "Callback: Finished installation for", customer
    print "Scheduling: Installation for", customer
    return task.deferLater(reactor, 3, on_done)
  #
  def all_done(_):
    print "All done for", customer
  #
  # For each customer, we schedule these processes on the CRM and that
  # is all our chief-Twisted developer has to do
  d = schedule_install_wordpress()
  d.addCallback(all_done)
  #
  return d

# Yes, we don't need many developers anymore or any synchronization.
# ~~ Super-powered Twisted developer ~~
def twisted_developer_day(customers):
  print "Goodmorning from Twisted developer"
  #
  # Here's what has to be done today
  work = [schedule_install(customer) for customer in customers]
  # Turn off the lights when done
  join = defer.DeferredList(work)
  join.addCallback(lambda _: reactor.stop())
  #
  print "Bye from Twisted developer!"
# Even his day is particularly short!
twisted_developer_day(["Customer %d" % i for i in xrange(15)])

# Reactor, our secretary uses the CRM and follows-up on events!
reactor.run()

Output:

------ Running example 3 ------
Goodmorning from Twisted developer
Scheduling: Installation for Customer 0
....
Scheduling: Installation for Customer 14
Bye from Twisted developer!
Callback: Finished installation for Customer 0
All done for Customer 0
Callback: Finished installation for Customer 1
All done for Customer 1
...
All done for Customer 14
* Elapsed time: 3.18 seconds

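This time we get code that executes cleanly and output that is easy to read, without using threads. We process all 15 customers in parallel, which means that 45 s of work completes in about 3 s. The trick is that we replaced every blocking sleep() call with its Twisted counterpart, task.deferLater(), plus callbacks. Because the work now happens somewhere else, we can effortlessly serve 15 customers at once.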
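We just said that the work happens somewhere else. To explain: arithmetic still happens on the CPU, but today's CPUs are extremely fast compared with disk and network operations. Most of the time is therefore spent getting data to the CPU, or sending data from the CPU to memory or to another CPU. Non-blocking operations save us that waiting time. task.deferLater(), for example, takes a callback that is fired once the delay (which stands in for a data transfer here) has elapsed.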
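Another important point is the Goodmorning from Twisted developer and Bye from Twisted developer! messages in the output. Both are printed right at the start of the run. If the code reaches that point so early, when does our application actually run? The answer is that a Twisted application (Scrapy included) runs inside reactor.run(). Before calling this method, every Deferred chain the application might use has to be set up. reactor.run() then monitors events and fires the callbacks.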
Note that the main rule of the reactor is: you can do anything, as long as it is fast enough and non-blocking.
The code is now free of thread-management logic, but the callbacks still look a bit messy. We can rewrite it like this:

# Twisted gave us utilities that make our code way more readable!
@defer.inlineCallbacks
def inline_install(customer):
  print "Scheduling: Installation for", customer
  yield task.deferLater(reactor, 3, lambda: None)
  print "Callback: Finished installation for", customer
  print "All done for", customer

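def twisted_developer_day(customers):
  # Same as previously, but using inline_install() instead of
  # schedule_install()
  print "Goodmorning from Twisted developer"
  work = [inline_install(customer) for customer in customers]
  join = defer.DeferredList(work)
  join.addCallback(lambda _: reactor.stop())
  print "Bye from Twisted developer!"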

twisted_developer_day(["Customer %d" % i for i in xrange(15)])
reactor.run()

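The output is the same as in the previous example. The code does the same job but is more concise and easier to follow. The inlineCallbacks decorator uses Python's generator mechanism to suspend and resume inline_install(). Each call to inline_install() returns a Deferred, and the calls for all customers run concurrently. At every yield, the current inline_install() instance pauses, and it resumes once the yielded Deferred has fired.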
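The only remaining problem: what if we had not 15 customers but, say, 10,000? This code would start 10,000 simultaneous sequences of operations (HTTP requests, database writes and so on). That might be fine, but it might also cause all kinds of failures. In applications with massive concurrency, such as Scrapy, we often need to limit the number of concurrent operations to an acceptable level. In the following example we use task.Cooperator() to do this. Scrapy uses the same mechanism in its Item Pipeline to limit concurrency (the CONCURRENT_ITEMS setting):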

@defer.inlineCallbacks
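def inline_install(customer):
  # Same as above
  print "Scheduling: Installation for", customer
  yield task.deferLater(reactor, 3, lambda: None)
  print "Callback: Finished installation for", customer
  print "All done for", customer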

# The new "problem" is that we have to manage all this concurrency to
# avoid causing problems to others, but this is a nice problem to have.
def twisted_developer_day(customers):
  print "Goodmorning from Twisted developer"
  work = (inline_install(customer) for customer in customers)
  #
  # We use the Cooperator mechanism to make the secretary not
  # service more than 5 customers simultaneously.
  coop = task.Cooperator()
  join = defer.DeferredList([coop.coiterate(work) for i in xrange(5)])
  #
  join.addCallback(lambda _: reactor.stop())
  print "Bye from Twisted developer!"

twisted_developer_day(["Customer %d" % i for i in xrange(15)])
reactor.run()

# We are now more lean than ever, our customers happy, our hosting
# bills ridiculously low and our performance stellar.
# ~*~ THE END ~*~

Output:

$ ./deferreds.py 5
------ Running example 5 ------
Goodmorning from Twisted developer
Bye from Twisted developer!
Scheduling: Installation for Customer 0
...
Callback: Finished installation for Customer 4
All done for Customer 4
Scheduling: Installation for Customer 5
...
Callback: Finished installation for Customer 14
All done for Customer 14
* Elapsed time: 9.19 seconds

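As the output shows, the program behaves as if it had 5 slots for processing customers. It does not start on the next customer until a slot frees up. In this example every job takes 3 seconds, so it looks as if customers are processed in batches of 5. The final performance matches the threaded version, but this time there is only one thread, and the code is simpler and easier to get right.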

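PS: Using deferToThread to make a synchronous function non-blocking
Twisted's defer.Deferred (from twisted.internet import defer) lets a function return a Deferred object.

Note: deferToThread is implemented with threads, so it should not be overused.
***Turning a synchronous function into an asynchronous one (returning a Deferred)***
Twisted's deferToThread (from twisted.internet.threads import deferToThread) also returns a Deferred object, but the wrapped function runs in a separate thread from the reactor's thread pool, and its callbacks then fire back in the reactor thread. It is mainly used for blocking operations such as database access or file reads.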


# Code snippet

  def dataReceived(self, data):
    now = int(time.time())

    for ftype, data in self.fpcodec.feed(data):
      if ftype == 'oob':
        self.msg('OOB:', repr(data))
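      elif ftype == 0x81: # heartbeat reply to a server request (parsed from the anti-fatigue driving monitor, which sends it to the GPS host unit, which forwards it to the server)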
        self.msg('FP.PONG:', repr(data))
      else:
        self.msg('TODO:', (ftype, data))
      d = deferToThread(self.redis.zadd, "beier:fpstat:fps", now, self.devid)
      d.addCallback(self._doResult, extra)

Here is a complete example for reference:

# -*- coding: utf-8 -*-

from twisted.internet import defer, reactor
from twisted.internet.threads import deferToThread

import functools
import time

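# A slow operation: this is a synchronous, blocking function
def mySleep(timeout):
  time.sleep(timeout)

  # The return value is what gets passed to the callback
  return 3

def say(result):
  print "The slow operation has finished and handed me its result", result

# Wrap it with functools.partial to pass the argument in
cb = functools.partial(mySleep, 3)
d = deferToThread(cb)
d.addCallback(say)
# Stop the reactor once the result has been handled, so the script exits
d.addCallback(lambda _: reactor.stop())

print "I ran before you finished, haha"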

reactor.run()