
Django large-file downloads: an implementation walkthrough

Posted in Python on August 01, 2019

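When Django serves a file download and the file is small, the usual approach is to build the entire content in memory and then pass it to the response object in one go: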

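from django.http import HttpResponse
def simple_file_download(request):
  # do something...
  with open("simplefile", "rb") as f:
    content = f.read()  # the whole file is held in memory
  return HttpResponse(content)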

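When the file is very large, the simplest option is to let a static file server such as Apache or Nginx handle the download. Sometimes that is not possible: we may need to restrict access by user permission, we may not want to expose the file's real location, or the content may be generated on the fly (for example, by merging several files). In those cases a static file server alone is not enough.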

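The Django documentation notes that a response can be given an iterator so that data is streamed to the client. Since Django 1.5 a plain HttpResponse consumes the iterator immediately, so actual streaming requires StreamingHttpResponse.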

To write the iterator yourself, you can use yield:

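from django.http import StreamingHttpResponse

def read_file(filename, buf_size=8192):
  """Yield the file's content in blocks of at most buf_size bytes."""
  with open(filename, "rb") as f:
    while True:
      content = f.read(buf_size)
      if content:
        yield content
      else:
        break

def big_file_download(request):
  filename = "filename"
  # StreamingHttpResponse sends each block as it is produced;
  # HttpResponse would read the whole iterator into memory first.
  response = StreamingHttpResponse(read_file(filename))
  return response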
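
The chunked-read loop can also be written with the two-argument form of the built-in iter(). The following is a minimal sketch that runs outside Django; the helper name read_in_chunks and the in-memory sample data are assumptions made for illustration only.

```python
import io
from functools import partial


def read_in_chunks(fileobj, buf_size=8192):
    """Return an iterator over blocks of at most buf_size bytes."""
    # iter(callable, sentinel) keeps calling fileobj.read(buf_size)
    # until it returns the sentinel b"" (end of file).
    return iter(partial(fileobj.read, buf_size), b"")


# Hypothetical in-memory "file" of 20000 bytes, used only for illustration.
sample = io.BytesIO(b"x" * 20000)
chunk_sizes = [len(chunk) for chunk in read_in_chunks(sample)]
print(chunk_sizes)  # [8192, 8192, 3616]
```

Only one block is held in memory at a time, which is the property that makes this pattern suitable for large files.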

Alternatively, use a generator expression. The following example from the Django documentation streams a large CSV file:

import csv

# django.utils.six was removed in Django 3.0; the built-in range is used instead.
from django.http import StreamingHttpResponse

class Echo(object):
  """An object that implements just the write method of the file-like
  interface.
  """
  def write(self, value):
    """Write the value by returning it, instead of storing in a buffer."""
    return value
 
def some_streaming_csv_view(request):
  """A view that streams a large CSV file."""
  # Generate a sequence of rows. The range is based on the maximum number of
  # rows that can be handled by a single sheet in most spreadsheet
  # applications.
  rows = (["Row {0}".format(idx), str(idx)] for idx in range(65536))
  pseudo_buffer = Echo()
  writer = csv.writer(pseudo_buffer)
  response = StreamingHttpResponse((writer.writerow(row) for row in rows),
                   content_type="text/csv")
  response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
  return response
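
The Echo trick works because csv.writer's writerow() returns whatever the underlying write() call returned. Here is a small sketch of that behaviour on its own, without Django; the three-row data set is an arbitrary choice for illustration.

```python
import csv


class Echo:
    """File-like object whose write() hands the value straight back."""

    def write(self, value):
        return value


writer = csv.writer(Echo())
# writerow() returns the result of Echo.write(), so each call
# produces one formatted CSV line instead of storing it anywhere.
lines = (writer.writerow([f"Row {idx}", str(idx)]) for idx in range(3))
first = next(lines)
print(repr(first))  # 'Row 0,0\r\n'
```

Because lines is a generator expression, each row is formatted only when the response asks for the next piece of content.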

Python also provides a file wrapper (FileWrapper in wsgiref.util) that wraps a file-like object as an iterator:

class FileWrapper:
  """Wrapper to convert file-like objects to iterables""" 
  def __init__(self, filelike, blksize=8192):
    self.filelike = filelike
    self.blksize = blksize
    if hasattr(filelike,'close'):
      self.close = filelike.close 
  def __getitem__(self,key):
    data = self.filelike.read(self.blksize)
    if data:
      return data
    raise IndexError 
  def __iter__(self):
    return self 
  def __next__(self):  # named `next` in Python 2
    data = self.filelike.read(self.blksize)
    if data:
      return data
    raise StopIteration

Usage:

import os
from wsgiref.util import FileWrapper

from django.http import StreamingHttpResponse

def file_download(request, filename):
  # django.core.servers.basehttp.FileWrapper was removed in Django 1.9;
  # the same class lives in the standard library's wsgiref.util.
  wrapper = FileWrapper(open(filename, 'rb'))
  response = StreamingHttpResponse(wrapper, content_type='application/octet-stream')
  response['Content-Length'] = os.path.getsize(filename)
  response['Content-Disposition'] = 'attachment; filename="%s"' % os.path.basename(filename)
  return response
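
The wrapper's behaviour can be checked outside a view. This sketch uses the standard library's wsgiref.util.FileWrapper with an in-memory file; the 25-byte payload and the block size of 10 are arbitrary values chosen to keep the output readable.

```python
import io
from wsgiref.util import FileWrapper

# Hypothetical in-memory file standing in for a real file on disk.
source = io.BytesIO(b"a" * 10 + b"b" * 10 + b"c" * 5)
wrapper = FileWrapper(source, blksize=10)

# Iterating the wrapper reads blksize bytes per step until EOF.
blocks = list(wrapper)
print(blocks)  # [b'aaaaaaaaaa', b'bbbbbbbbbb', b'ccccc']

# The wrapper exposes the underlying file's close() method.
wrapper.close()
print(source.closed)  # True
```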

Django also provides the StreamingHttpResponse class, which takes the place of HttpResponse when the body is streamed.

Downloading files compressed into a zip archive:

import tempfile
import zipfile
from wsgiref.util import FileWrapper
from django.http import StreamingHttpResponse

def send_zipfile(request):
  """
  Create a ZIP file on disk and transmit it in chunks of 8KB,
  without loading the whole file into memory. A similar approach can
  be used for large dynamic PDF files.
  """
  temp = tempfile.TemporaryFile()
  archive = zipfile.ZipFile(temp, 'w', zipfile.ZIP_DEFLATED)
  for index in range(10):
    filename = __file__ # Select your files here.
    archive.write(filename, 'file%d.txt' % index)
  archive.close()
  wrapper = FileWrapper(temp)
  response = StreamingHttpResponse(wrapper, content_type='application/zip')
  response['Content-Disposition'] = 'attachment; filename=test.zip'
  response['Content-Length'] = temp.tell()  # position after close() = archive size
  temp.seek(0)
  return response
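
The temp-file part of this view can be exercised without Django. The sketch below builds an archive in a temporary file, records its size, rewinds it, and reads it back in 8KB blocks. The helper names build_zip and iter_chunks and the member contents are assumptions for illustration, not part of the original view.

```python
import io
import tempfile
import zipfile


def build_zip(members):
    """Write members (name -> bytes) into a ZIP held in a temporary file.

    Returns the open file, rewound to the start, and its size in bytes.
    """
    temp = tempfile.TemporaryFile()
    with zipfile.ZipFile(temp, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in members.items():
            archive.writestr(name, data)
    size = temp.tell()  # the archive is finalized, so this is its total size
    temp.seek(0)        # rewind so the reader starts from the first byte
    return temp, size


def iter_chunks(fileobj, blksize=8192):
    """Yield blocks of at most blksize bytes until EOF."""
    while True:
        data = fileobj.read(blksize)
        if not data:
            break
        yield data


temp, size = build_zip({"file%d.txt" % i: b"hello " * 1000 for i in range(10)})
received = b"".join(iter_chunks(temp))
temp.close()

# Reopen the received bytes to confirm the archive arrived intact.
names = zipfile.ZipFile(io.BytesIO(received)).namelist()
print(size == len(received), names[:2])
```

Forgetting the seek(0) is the usual mistake with this approach: the reader would start at the end of the file and the client would receive an empty body.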

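In any case, having Django itself serve large downloads is not a great idea. The better approach is to let Django make the permission decision and then hand the download off to the static file server.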

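This relies on the sendfile mechanism: "When a traditional web server handles a file download, it first reads the file content into application memory and then sends that in-memory content to the client browser. On today's high-load sites this approach consumes more server resources. sendfile is a high-performance network I/O method supported by modern operating systems: the kernel's sendfile call can push file content directly into the network card's buffer, which avoids the web server's cost of reading and writing the file and achieves 'zero-copy' transfer."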
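
Python exposes the same operating-system facility through socket.sendfile(). The following is only a sketch of the system call in isolation, assuming a platform where socket.socketpair() returns stream sockets (as on Linux and macOS); it is not how Nginx or Apache are implemented, and the payload is made up for the demonstration.

```python
import socket
import tempfile

payload = b"0123456789" * 100  # 1000 bytes of sample data

with tempfile.TemporaryFile() as f:
    f.write(payload)
    f.flush()   # make sure the bytes are in the file, not a Python buffer
    f.seek(0)
    sender, receiver = socket.socketpair()
    with sender, receiver:
        # socket.sendfile() uses os.sendfile() where the platform supports
        # it and falls back to plain send() otherwise.
        sent = sender.sendfile(f)
        sender.shutdown(socket.SHUT_WR)  # signal end of data to the reader
        received = b""
        while True:
            data = receiver.recv(4096)
            if not data:
                break
            received += data

print(sent, received == payload)
```

With the sendfile path, the file content never passes through a Python bytes object on the sending side.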

Apache needs the mod_xsendfile module for this, while Nginx provides it through a feature called X-Accel-Redirect.

Nginx configuration:

# Will serve /var/www/files/myfile.tar.gz
# When passed URI /protected_files/myfile.tar.gz
location /protected_files {
  internal;
  alias /var/www/files;
}

or

# Will serve /var/www/protected_files/myfile.tar.gz
# When passed URI /protected_files/myfile.tar.gz
location /protected_files {
  internal;
  root /var/www;
}

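Note the difference between alias and root: alias replaces the matched location prefix with the given path, while root appends the full request URI to the given path.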
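
The two mappings can be modelled with a pair of small functions. This is a simplified sketch of the path arithmetic for plain prefix locations only, not Nginx's actual implementation; the function names are made up for illustration.

```python
def resolve_root(root, uri):
    """`root`: the full request URI is appended to the root directory."""
    return root.rstrip("/") + uri


def resolve_alias(location, alias, uri):
    """`alias`: the matched location prefix is replaced by the alias path."""
    if not uri.startswith(location):
        raise ValueError("URI does not match the location prefix")
    return alias + uri[len(location):]


uri = "/protected_files/myfile.tar.gz"
print(resolve_alias("/protected_files", "/var/www/files", uri))
# /var/www/files/myfile.tar.gz
print(resolve_root("/var/www", uri))
# /var/www/protected_files/myfile.tar.gz
```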

In Django:

response['X-Accel-Redirect']='/protected_files/%s'%filename

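With this in place, when a request reaches the Django view, Django checks the user's permissions or does whatever else is needed, then returns a response whose X-Accel-Redirect header points at /protected_files/filename. Nginx performs an internal redirect to that URI and serves the file /var/www/protected_files/filename itself: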

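import os

from django.contrib.auth.decorators import login_required
from django.http import HttpResponse
from .models import Book  # assumption: Book is defined in this app's models.py

@login_required
def document_view(request, document_id):
  book = Book.objects.get(id=document_id)
  response = HttpResponse()
  name = os.path.basename(book.myBook.name)
  response['Content-Type'] = 'application/octet-stream'
  response['Content-Disposition'] = 'attachment; filename="{0}"'.format(name)
  response['Content-Length'] = os.path.getsize(book.myBook.path)
  # The prefix must match the internal nginx location configured above.
  response['X-Accel-Redirect'] = '/protected_files/{0}'.format(book.myBook.name)
  return response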
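
File names that contain non-ASCII characters need extra care in the Content-Disposition header. One common approach is the extended filename* parameter from RFC 5987 and RFC 6266. The helper below is a sketch under that assumption; its name content_disposition is hypothetical and it is not part of Django's API.

```python
from urllib.parse import quote, unquote


def content_disposition(filename):
    """Build an attachment Content-Disposition value for `filename`."""
    try:
        filename.encode("ascii")
    except UnicodeEncodeError:
        # Non-ASCII name: extended form with percent-encoded UTF-8.
        return "attachment; filename*=UTF-8''{}".format(quote(filename, safe=""))
    # ASCII name: quoted-string form, escaping backslashes and double quotes.
    escaped = filename.replace("\\", "\\\\").replace('"', '\\"')
    return 'attachment; filename="{}"'.format(escaped)


ascii_value = content_disposition("report.pdf")
utf8_value = content_disposition("报告.pdf")
print(ascii_value)  # attachment; filename="report.pdf"
print(utf8_value)
```

The returned string can be assigned to response['Content-Disposition'] in a view such as the one above.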



- Author -

再见紫罗兰
