
PyTorch state dictionaries: a detailed guide to using state_dict

Posted in Python on January 17, 2020

In PyTorch, a state_dict is a plain Python dictionary object that maps each layer to its parameter tensors (for example, the weights and biases of every layer in a model).

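(Note: only layers that hold learnable parameters, such as convolutional and linear layers, have entries in the model's state_dict. Registered buffers, such as the running statistics of a batch normalization layer, are stored there as well.)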
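As a small illustration of the note above, the sketch below builds a hypothetical four-layer network (not taken from the original article) and lists the keys of its state_dict. The activation and pooling layers own no tensors, so they should not appear.

```python
import torch.nn as nn

# Hypothetical toy network, used only for illustration.
net = nn.Sequential(
    nn.Conv2d(3, 6, 5),   # index 0: learnable weight and bias
    nn.BatchNorm2d(6),    # index 1: learnable weight/bias plus running-statistics buffers
    nn.ReLU(),            # index 2: no parameters
    nn.MaxPool2d(2, 2),   # index 3: no parameters
)

state = net.state_dict()
keys = list(state.keys())
for key in keys:
    # Keys are "<layer name>.<tensor name>"; Sequential names layers by index.
    print(key, tuple(state[key].shape))
```

Only keys that start with "0." and "1." are printed; indices 2 and 3 contribute nothing.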

An Optimizer object also has a state_dict. It contains the optimizer's internal state together with the hyperparameters in use (such as lr, momentum and weight_decay).
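To make the structure of the optimizer's state_dict concrete, here is a minimal sketch. The tiny nn.Linear model and the weight_decay value are assumptions chosen for illustration; the lr and momentum values mirror the ones used in the article's test code.

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical model with two parameter tensors (weight and bias).
model = nn.Linear(4, 2)
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9, weight_decay=1e-4)

opt_state = optimizer.state_dict()
print(list(opt_state.keys()))  # two top-level entries: 'state' and 'param_groups'

# Hyperparameters live in the parameter groups.
group = opt_state["param_groups"][0]
print(group["lr"], group["momentum"], group["weight_decay"])

# Parameters are referenced by integer index, not stored as tensors.
print(group["params"])

# Per-parameter state (e.g. momentum buffers) is empty until optimizer.step() has run.
print(opt_state["state"])
```

Because the parameter groups reference parameters by index, an optimizer state_dict is only meaningful together with a model whose parameters are in the same order.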

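Notes:

1) Once a model or optimizer has been defined, PyTorch provides its state_dict automatically, so state_dict() can be called directly. A state_dict is conventionally saved to a file with a ".pt" or ".pth" extension, i.e. PATH="./***.pt" in the command below: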

torch.save(model.state_dict(), PATH)

2) load_state_dict is likewise a method that every model or optimizer has as soon as it is defined, and it can be called directly:

model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()

Note on the importance of model.eval(): step 2) ends with model.eval() because dropout layers and batch normalization layers only switch to evaluation mode after that call. These two kinds of layers behave differently in training mode and in evaluation mode.
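The difference between the two modes can be seen on a single dropout layer. This is a minimal sketch, with the layer and the input tensor chosen only for illustration:

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()          # training mode: elements are randomly zeroed
train_out = drop(x)   # survivors are scaled by 1 / (1 - p) = 2.0

drop.eval()           # evaluation mode: dropout is a no-op
eval_out = drop(x)

print(train_out.unique())        # only 0.0 and 2.0 occur
print(torch.equal(eval_out, x))  # the input passes through unchanged
```

Calling model.eval() on a whole model applies the same switch to every submodule.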

Saving and loading the state_dict (model is an instance of a network class)

1.1) To save only the learned parameters, use:

torch.save(model.state_dict(), PATH)

1.2) To load a saved model.state_dict, use:

model = TheModelClass(*args, **kwargs)
model.load_state_dict(torch.load(PATH))
model.eval()

Note: model.load_state_dict takes an actual dictionary object, not a file name. The file must first be deserialized with torch.load.
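A complete save-and-load round trip might look like the sketch below. The nn.Linear model and the temporary directory are assumptions made to keep the example self-contained; the exact exception raised when a path is passed by mistake depends on the PyTorch version, so two types are caught.

```python
import os
import tempfile

import torch
import torch.nn as nn

model = nn.Linear(4, 2)        # hypothetical stand-in for TheModelClass
restored = nn.Linear(4, 2)     # a fresh instance with different random weights

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model_state_dict.pt")
    torch.save(model.state_dict(), path)

    # Wrong: passing the file name instead of the loaded dictionary.
    passing_path_failed = False
    try:
        restored.load_state_dict(path)
    except (TypeError, AttributeError):
        passing_path_failed = True

    # Right: deserialize the file first, then hand the dictionary to the model.
    state = torch.load(path)
    restored.load_state_dict(state)

restored.eval()
same = all(
    torch.equal(a, b)
    for a, b in zip(model.state_dict().values(), restored.state_dict().values())
)
print(passing_path_failed, same)
```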

2.1) To save the entire model (not just its state_dict), use:

torch.save(model,PATH)

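2.2) To load the entire model, use:

# The model class must be defined (or importable) in the loading script.
# Recent PyTorch releases default torch.load to weights_only=True; with them,
# loading a whole pickled model requires passing weights_only=False.
model = torch.load(PATH)
model.eval()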

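A state_dict is a Python dictionary: it is stored as a dictionary and loaded as a dictionary, and entries are matched by key. By default load_state_dict uses strict=True and raises an error when keys are missing or unexpected; with strict=False, only the entries whose keys match are loaded and the rest are ignored.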
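The key-matching behaviour can be sketched with two hypothetical networks (not from the original article) that share their first layers but differ in depth:

```python
import torch.nn as nn

# Hypothetical source and target networks; the target has one extra Linear layer.
source = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
target = nn.Sequential(
    nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2), nn.ReLU(), nn.Linear(2, 1)
)

state = source.state_dict()  # keys: 0.weight, 0.bias, 2.weight, 2.bias

# Default strict=True: the keys 4.weight and 4.bias are missing, so loading fails.
strict_failed = False
try:
    target.load_state_dict(state)
except RuntimeError:
    strict_failed = True

# strict=False: matching keys are loaded, mismatches are reported instead of raised.
result = target.load_state_dict(state, strict=False)
print(strict_failed)
print(result.missing_keys)     # keys the target expects but the dict lacks
print(result.unexpected_keys)  # keys in the dict that the target does not have
```

Note that strict=False only relaxes the check on key names. A key that matches but whose tensor has a different shape still raises an error.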

How to load only the trained parameters of a single layer (the state of one layer)

If you want to load parameters from one layer to another, but some keys do not match, simply change the name of the parameter keys in the state_dict that you are loading to match the keys in the model that you are loading into.

conv1_weight_state = torch.load('./model_state_dict.pt')['conv1.weight']
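The renaming approach described above could be sketched as follows. The Source and Target classes and the attribute names first_conv and head are hypothetical, chosen only to show a layer that has the same shape but a different name in the two models.

```python
import torch
import torch.nn as nn

class Source(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.fc = nn.Linear(6, 2)

class Target(nn.Module):
    def __init__(self):
        super().__init__()
        self.first_conv = nn.Conv2d(3, 6, 5)  # same shape as conv1, different name
        self.head = nn.Linear(6, 10)

source_state = Source().state_dict()

# Keep only the conv1 entries and rename their prefix to match the target model.
renamed = {
    key.replace("conv1.", "first_conv.", 1): value
    for key, value in source_state.items()
    if key.startswith("conv1.")
}

target = Target()
# strict=False because the dictionary deliberately omits the head.* entries.
result = target.load_state_dict(renamed, strict=False)

print(sorted(renamed))
print(torch.equal(target.first_conv.weight, source_state["conv1.weight"]))
```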

After loading the parameters, how to control whether a given parameter of a given layer is trained (param.requires_grad)

for param in list(model.pretrained.parameters()):
 param.requires_grad = False

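Note: requires_grad is an attribute of tensors.

Question: can requires_grad be set directly on a layer, for example model.conv1.requires_grad=False?

Answer: testing shows that this does not work. A module such as model.conv1 has no requires_grad attribute of its own, so the assignment merely creates a new Python attribute on the module and leaves the layer's parameters unchanged (model.conv1.bias.requires_grad is still True). Set the flag on each parameter as shown above, or call the module method model.conv1.requires_grad_(False).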
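The answer above can be checked with a short sketch. The two-convolution network is a hypothetical example, and passing only the trainable parameters to the optimizer is one common way to use the result, not something the original article prescribes.

```python
import torch.nn as nn
import torch.optim as optim

# Hypothetical network; index 0 plays the role of model.conv1.
model = nn.Sequential(nn.Conv2d(3, 6, 5), nn.ReLU(), nn.Conv2d(6, 16, 5))
conv1 = model[0]

# Plain assignment only adds an attribute to the module object.
conv1.requires_grad = False
after_assignment = [p.requires_grad for p in conv1.parameters()]

# The module method requires_grad_ sets the flag on every parameter of the layer.
conv1.requires_grad_(False)
after_method = [p.requires_grad for p in conv1.parameters()]

# Hand only the parameters that are still trainable to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = optim.SGD(trainable, lr=0.001, momentum=0.9)

print(after_assignment)  # the parameters are unaffected by the assignment
print(after_method)      # the parameters are frozen after requires_grad_
print(len(trainable))    # weight and bias of the second convolution
```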

Full test code:

# -*- coding: utf-8 -*-
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim


# define the model
class TheModelClass(nn.Module):
    def __init__(self):
        super(TheModelClass, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x


# initialize the model
model = TheModelClass()

# initialize the optimizer
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# print the model's state_dict
print("model's state_dict:")
for param_tensor in model.state_dict():
    print(param_tensor, '\t', model.state_dict()[param_tensor].size())

print("\noptimizer's state_dict")
for var_name in optimizer.state_dict():
    print(var_name, '\t', optimizer.state_dict()[var_name])

print("\nprint particular param")
print('\n', model.conv1.weight.size())
print('\n', model.conv1.weight)

print("------------------------------------")
torch.save(model.state_dict(), './model_state_dict.pt')
# model_2 = TheModelClass()
# model_2.load_state_dict(torch.load('./model_state_dict.pt'))
# model_2.eval()
# print('\n', model_2.conv1.weight)
# print((model_2.conv1.weight == model.conv1.weight).size())

# load the parameters of a single layer only
conv1_weight_state = torch.load('./model_state_dict.pt')['conv1.weight']
# torch.equal gives one True/False instead of an element-wise tensor
print(torch.equal(conv1_weight_state, model.conv1.weight))

model_2 = TheModelClass()
model_2.load_state_dict(torch.load('./model_state_dict.pt'))
# this only creates a plain attribute on the module; the parameters are unchanged
model_2.conv1.requires_grad = False
print(model_2.conv1.requires_grad)       # the attribute that was just created
print(model_2.conv1.bias.requires_grad)  # still True: the layer is not frozen


- Author -

wzg2016
