Python中Cookies导出某站用户数据的方法


Posted in Python onMay 17, 2021

应朋友需要,想将某客户的数据从某站里导出,先去某站搞个账号,建几条数据观察一番,心里有底后开搞。

1.Python环境搭建

之前电脑有安装过PyCharm Community 2019.1,具体安装过程就不写了,先跑个HelloWorld,输出正常后正式开整。

2.利用抓包工具或者Google浏览器调试模式拿到请求参数

Cookies参数如下:

cookies = {    
    'JSESSIONID': 'XXX',
    'phone': 'XXX',    
    'password': 'XXX',    
    'isAuto': '0',    '
    loginAccess': 'XXX'
}

headers请求头信息构造:

headers = {    
'Connection': 'keep-alive',    
'sec-ch-ua': '"Google Chrome";v="89", "Chromium";v="89", ";Not A Brand";v="99"',   
'Accept': 'application/json, text/javascript, */*; q=0.01',    'X-Requested-With': 'XMLHttpRequest',    'sec-ch-ua-mobile': '?0',    
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) 
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 
Safari/537.36',    
'Content-Type': 'application/json',   
'Sec-Fetch-Site': 'same-origin',    
'Sec-Fetch-Mode': 'cors',    
'Sec-Fetch-Dest': 'empty',    
'Referer': 'https://xxx.xxx.xxx',    
'Accept-Language': 'zh-CN,zh;q=0.9',}

请求路径:

params = ( ('method', 'getGoodsList'))

请求参数组装:

data = '{
"pageSize":1000,
"pageNumber":1,
"searchText":"",
"sortOrder":"asc",
"isAdvancedSearch":false}'

pageSize:每页数据数量
pageNumber:页码
searchText:搜索条件
sortOrder:排序

3.利用Requests模拟请求并获取数据

response = requests.post(
   'https://xxx.xxx.xxx', 
    headers=headers,
    params=params, 
    cookies=cookies,
    data=data
)
print(response.text.encode('utf8'))
res = json.loads(response.text)

4.创建Excel表格

t = time.time()
randStr = int(round(t * 1000))
tSheetName = "a_" + str(randStr) + ".xlsx"
workbook = xlsxwriter.Workbook(tSheetName)
worksheet = workbook.add_worksheet()

5.表头及数据组装

cursor = 0
picurl = ''
writeExcel(row=cursor)
for obj in res["rows"]:    
    cursor += 1    
    picurl = ''   
    if obj['ImageKID']:        
        picurl = 'https://xxx.xxx.xxx? imageKid='+obj['ImageKID']    
    writeExcel(row=cursor,Description=obj['Description'], Category=obj['Category'], 		 		  Series=obj['Series'],BaseUnit=obj['BaseUnit'],Qty=obj['Qty'],
    CostPrice=obj['CostPrice'],SalePrice=obj['SalePrice'],                    
   RetailPrice=obj['RetailPrice'],Barcode=obj['Barcode'],
   Remark=obj['Remark'], ImageKID=picurl)

6.将数据写入Excel表格中

def writeExcel(row=0, Description='', Category='', Series='', BaseUnit='', Qty='', CostPrice='', SalePrice='', RetailPrice='', Barcode='', Remark='',ImageKID=''):   
	if row == 0:        
		worksheet.write(row, 0, '名称')        
		worksheet.write(row, 1, '货号')        
		worksheet.write(row, 2, '规格')        
		worksheet.write(row, 3, '单位')        
		worksheet.write(row, 4, '库存')        
		worksheet.write(row, 5, '成本')        
		worksheet.write(row, 6, '批发价')        
		worksheet.write(row, 7, '零售价')       
		worksheet.write(row, 8, '条码')        
		worksheet.write(row, 9, '备注')        
		worksheet.write(row, 10, '图片')        
	else:        
   		 if ImageKID!='':            
        		image_data = io.BytesIO(urllib.urlopen(ImageKID).read())            
        		worksheet.insert_image(row, 10, ImageKID, {'image_data': image_data})        
		worksheet.write(row, 0, Description)        
		worksheet.write(row, 1, Category)        
		worksheet.write(row, 2, Series)       
		worksheet.write(row, 3, BaseUnit)        
		worksheet.write(row, 4, Qty)        
		worksheet.write(row, 5, CostPrice)       
		worksheet.write(row, 6, SalePrice)       
		worksheet.write(row, 7, RetailPrice)       
		worksheet.write(row, 8, Barcode)        
		worksheet.write(row, 9, Remark)        
		worksheet.set_column(10, 10, 23)        
		worksheet.set_row(row, 150)

注意图片路径不存在的情况,否则会执行异常

write方法:

def write(self, row, col, *args):
        """
        Write data to a worksheet cell by calling the appropriate write_*()
        method based on the type of data being passed.

        Args:
            row:   The cell row (zero indexed).
            col:   The cell column (zero indexed).
            *args: Args to pass to sub functions.

        Returns:
             0:    Success.
            -1:    Row or column is out of worksheet bounds.
            other: Return value of called method.

        """
        return self._write(row, col, *args)

通过set_row方法设置表格行高

def set_row(self, row, height=None, cell_format=None, options=None):
        """
        Set the width, and other properties of a row.

        Args:
            row:         Row number (zero-indexed).
            height:      Row height. (optional).
            cell_format: Row cell_format. (optional).
            options:     Dict of options such as hidden, level and collapsed.

        Returns:
            0:  Success.
            -1: Row number is out of worksheet bounds.
		......
        """

通过set_column方法设置图片列宽度:

def set_column(self, first_col, last_col, width=None, cell_format=None,
                   options=None):
        """
        Set the width, and other properties of a single column or a
        range of columns.

        Args:
            first_col:   First column (zero-indexed).
            last_col:    Last column (zero-indexed). Can be same as first_col.
            width:       Column width. (optional).
            cell_format: Column cell_format. (optional).
            options:     Dict of options such as hidden and level.

        Returns:
            0:  Success.
            -1: Column number is out of worksheet bounds.
      ......

        """

通过insert_image插入网络图片:

def insert_image(self, row, col, filename, options=None):
        """
        Insert an image with its top-left corner in a worksheet cell.

        Args:
            row:      The cell row (zero indexed).
            col:      The cell column (zero indexed).
            filename: Path and filename for image in PNG, JPG or BMP format.
            options:  Position, scale, url and data stream of the image.

        Returns:
            0:  Success.
            -1: Row or column is out of worksheet bounds.

        """
        # Check insert (row, col) without storing.
        if self._check_dimensions(row, col, True, True):
            warn('Cannot insert image at (%d, %d).' % (row, col))
            return -1

        if options is None:
            options = {}

        x_offset = options.get('x_offset', 0)
        y_offset = options.get('y_offset', 0)
        x_scale = options.get('x_scale', 1)
        y_scale = options.get('y_scale', 1)
        url = options.get('url', None)
        tip = options.get('tip', None)
        anchor = options.get('object_position', 2)
        image_data = options.get('image_data', None)
        description = options.get('description', None)
        decorative = options.get('decorative', False)

        # For backward compatibility with older parameter name.
        anchor = options.get('positioning', anchor)

        if not image_data and not os.path.exists(filename):
            warn("Image file '%s' not found." % force_unicode(filename))
            return -1

        self.images.append([row, col, filename, x_offset, y_offset,
                            x_scale, y_scale, url, tip, anchor, image_data,
                            description, decorative])
        return 0

注意insert_image(row, colunmNum, ‘xx.png', {‘url': xxx})并不能插入网络图片,只是给本地图片一个url路径

7.关闭表格

workbook.close()

8.附引入的包

# -*- coding: UTF-8 -*-
# 批量获取XX数据
import io
import json 
import requests
import sys
import xlsxwriter
import time
import urllib

9.代码跑起来

Python中Cookies导出某站用户数据的方法

在看下Excel表格中导出的信息

Python中Cookies导出某站用户数据的方法

到此这篇关于Python中Cookies导出某站用户数据的方法的文章就介绍到这了,更多相关Python Cookies导出数据内容请搜索三水点靠木以前的文章或继续浏览下面的相关文章希望大家以后多多支持三水点靠木!

Python 相关文章推荐
用Python创建声明性迷你语言的教程
Apr 13 Python
使用基于Python的Tornado框架的HTTP客户端的教程
Apr 24 Python
python中根据字符串调用函数的实现方法
Jun 12 Python
Python语言描述机器学习之Logistic回归算法
Dec 21 Python
numpy.linspace 生成等差数组的方法
Jul 02 Python
使用Py2Exe for Python3创建自己的exe程序示例
Oct 31 Python
Python面向对象程序设计构造函数和析构函数用法分析
Apr 12 Python
Python 读取xml数据,cv2裁剪图片实例
Mar 10 Python
python脚本第一行如何写
Aug 30 Python
Python JSON常用编解码方法代码实例
Sep 05 Python
python基础之文件处理知识总结
May 23 Python
一篇文章带你了解Python和Java的正则表达式对比
Sep 15 Python
Python 高级库15 个让新手爱不释手(推荐)
Python带你从浅入深探究Tuple(基础篇)
May 15 #Python
Python中zipfile压缩包模块的使用
python 制作一个gui界面的翻译工具
pyqt5打包成exe可执行文件的方法
Python 机器学习工具包SKlearn的安装与使用
python process模块的使用简介
May 14 #Python
You might like
DC四月将推出百页特刊漫画 纪念小丑诞生80周年
2020/04/09 欧美动漫
php数据库抽象层 PDO
2011/05/07 PHP
PHP+Memcache实现wordpress访问总数统计(非插件)
2014/07/04 PHP
PHP二维索引数组的遍历实例分析【2种方式】
2019/06/24 PHP
Laravel 5.5 异常处理 & 错误日志的解决
2019/10/17 PHP
tagName的使用,留一笔
2006/06/26 Javascript
SharePoint 客户端对象模型 (一) ECMA Script
2011/05/22 Javascript
extjs tabpanel限制选项卡数量实现思路及代码
2013/04/02 Javascript
js对table的td进行相同内容合并示例详解
2013/12/27 Javascript
jQuery操作表单常用控件方法小结
2015/03/23 Javascript
基于jQuery仿淘宝产品图片放大镜特效
2020/10/19 Javascript
jQuery实现选项联动轮播效果【附实例】
2016/04/19 Javascript
JavaScript数组方法大全(推荐)
2016/07/05 Javascript
jquery操作ID带有变量的节点实例
2016/12/07 Javascript
详解vue-router 命名路由和命名视图
2018/06/01 Javascript
微信小程序左滑删除实现代码实例
2019/09/16 Javascript
如何搭建一个完整的Vue3.0+ts的项目步骤
2020/10/18 Javascript
[02:44]DOTA2英雄基础教程 钢背兽
2013/12/19 DOTA
[50:59]2018DOTA2亚洲邀请赛 4.7 总决赛 LGD vs Mineski第四场
2018/04/10 DOTA
python中执行shell命令的几个方法小结
2014/09/18 Python
python 内置函数filter
2017/06/01 Python
Python实现PS图像调整之对比度调整功能示例
2018/01/26 Python
Python TestCase中的断言方法介绍
2019/05/02 Python
海蓝之谜(LA MER)澳大利亚官方商城:全球高端奢华护肤品牌
2017/10/27 全球购物
英国最大的经认证的有机超市:Planet Organic
2018/02/02 全球购物
澳大利亚在线时尚精品店:Hello Molly
2018/02/26 全球购物
NYX Professional Makeup俄罗斯官网:世界知名的化妆品品牌
2019/12/26 全球购物
命名空间(namespace)和程序集(Assembly)有什么区别
2015/09/25 面试题
医学院护理专业应届生求职信
2013/11/12 职场文书
内刊编辑求职自荐书范文
2014/02/19 职场文书
春节联欢会策划方案
2014/05/16 职场文书
党员学习党的群众路线思想汇报(5篇)
2014/09/10 职场文书
乡镇防汛工作汇报
2014/10/28 职场文书
2014年全国法制宣传日宣传活动方案
2014/11/02 职场文书
党员活动总结
2015/02/04 职场文书
用Python爬取各大高校并可视化帮弟弟选大学,弟弟直呼牛X
2021/06/11 Python