
Using the TFRecord Input Data Format in TensorFlow

Posted in Python on June 19, 2018

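TensorFlow provides a unified format for storing data: TFRecord. The approach described in the previous article struggles to record the input data effectively once the data comes from more complex sources and each example carries richer information, so TensorFlow offers TFRecord as a single storage format. This article shows how to use TFRecord to unify the format of your input data.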

1. Introduction to the TFRecord format

Data in a TFRecord file is stored as tf.train.Example Protocol Buffer messages. The definition of tf.train.Example is as follows:

message Example {
 Features features = 1;
};

message Features{
 map<string,Feature> feature = 1;
};

message Feature{
  oneof kind{
    BytesList bytes_list = 1;
    FloatList float_list = 2;
    Int64List int64_list = 3;
  }
};

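As the definition shows, the tf.train.Example data structure is fairly compact. A tf.train.Example holds a dictionary that maps feature names to values. Each name is a string, and each value is a list of byte strings (BytesList), a list of floats (FloatList), or a list of integers (Int64List). For example, an image that has not yet been decoded can be stored as a byte string, and the class label of that image as an integer list.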
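To make the "name to one-of-three list kinds" structure concrete, here is a minimal plain-Python model of it. This is only a sketch and not the TensorFlow API: the helper names make_feature and make_example are hypothetical, and nested dicts stand in for the Protocol Buffer messages.

```python
# Plain-Python model of Example -> Features -> Feature. NOT the TensorFlow API.
# make_feature and make_example are hypothetical helper names.

# Each Feature kind holds a list of values of exactly one Python type.
_PYTHON_TYPES = {"bytes_list": bytes, "float_list": float, "int64_list": int}


def make_feature(kind, values):
    """Build one Feature: a single kind is set, holding a list of values."""
    if kind not in _PYTHON_TYPES:
        raise ValueError("unknown kind: %r" % (kind,))
    expected = _PYTHON_TYPES[kind]
    values = list(values)
    for value in values:
        if not isinstance(value, expected):
            raise TypeError(
                "%s expects %s values, got %r" % (kind, expected.__name__, value)
            )
    return {kind: values}


def make_example(**features):
    """Build an Example: a map from feature name (string) to Feature."""
    return {"features": {"feature": dict(features)}}


example = make_example(
    label=make_feature("int64_list", [1]),
    img_raw=make_feature("bytes_list", [b"\x00\x01\x02"]),
)
```

Even a single label is wrapped in a list, which mirrors how the real Int64List carries a repeated value field.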

2. Converting your own data to the TFRecord format

Preparing the data

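In the previous article, as a tribute to the great MNIST, we used the image file-name prefix to decide each image's class. In most classification tasks, however, each class lives in its own folder, and the number of classes varies a lot from one task to the next. To cover that case, we now generate a TFRecord file from images that are sorted into one folder per class.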

The Iris&Contact folder contains two subfolders, iris and contact, and each subfolder holds the images of the corresponding class.
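When the number of classes changes from project to project, the label mapping can be derived from the folder names instead of being typed by hand. The sketch below uses only the standard library and assumes the layout described above (one subfolder per class under a root folder). The function names build_label_map and list_examples are hypothetical, and sorting the names is just one way to get a stable order; any fixed order works as long as writing and reading agree on it.

```python
import os


def build_label_map(root_dir):
    """Map each class subfolder name to an integer label, in sorted order."""
    class_names = sorted(
        name for name in os.listdir(root_dir)
        if os.path.isdir(os.path.join(root_dir, name))
    )
    return {name: index for index, name in enumerate(class_names)}


def list_examples(root_dir):
    """Return (image_path, label) pairs for every file in every class folder."""
    label_map = build_label_map(root_dir)
    pairs = []
    for name, label in sorted(label_map.items()):
        class_dir = os.path.join(root_dir, name)
        for file_name in sorted(os.listdir(class_dir)):
            pairs.append((os.path.join(class_dir, file_name), label))
    return pairs
```

With the two folders from this article, sorted order assigns contact the label 0 and iris the label 1.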

Converting the data

With the data in place, we can generate the TFRecord file. The code is as follows:

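import os
import tensorflow as tf
from PIL import Image
import matplotlib.pyplot as plt  # only needed for the optional preview below

cwd = '/home/ruyiwei/Documents/Iris&Contact/'
# Use a list rather than a set: a set has no guaranteed order,
# so the labels produced by enumerate() could differ between runs.
classes = ['iris', 'contact']  # iris -> 0, contact -> 1
writer = tf.python_io.TFRecordWriter("iris_contact.tfrecords")  # TensorFlow 1.x API

for index, name in enumerate(classes):
  class_path = os.path.join(cwd, name)
  for img_name in os.listdir(class_path):
    img_path = os.path.join(class_path, img_name)
    img = Image.open(img_path).convert('RGB')  # force 3 channels so every record has the same size
    img = img.resize((512, 80))  # PIL expects (width, height)
    img_raw = img.tobytes()  # raw pixels, row by row: 80 rows x 512 columns x 3 channels
    # plt.imshow(img)  # uncomment these two lines to preview each image
    # plt.show()
    example = tf.train.Example(features=tf.train.Features(feature={
      'label': tf.train.Feature(int64_list=tf.train.Int64List(value=[index])),
      'img_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))
    }))
    writer.write(example.SerializeToString())  # serialize the Example and append it to the file

writer.close()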

3. Reading data from a TFRecord file in TensorFlow

def read_and_decode(filename):  # e.g. "iris_contact.tfrecords"
  filename_queue = tf.train.string_input_producer([filename])  # create a queue of input file names

  reader = tf.TFRecordReader()
  _, serialized_example = reader.read(filename_queue)  # returns (key, serialized record)
  features = tf.parse_single_example(serialized_example,
                    features={
                      'label': tf.FixedLenFeature([], tf.int64),
                      'img_raw': tf.FixedLenFeature([], tf.string),
                    })  # parse the record into a dict of tensors

  img = tf.decode_raw(features['img_raw'], tf.uint8)
  # [height, width, channels]: the images were resized to width 512, height 80
  img = tf.reshape(img, [80, 512, 3])
  img = tf.cast(img, tf.float32) * (1. / 255) - 0.5  # scale pixel values to [-0.5, 0.5]
  label = tf.cast(features['label'], tf.int32)  # label tensor
  return img, label
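The order of the reshape dimensions matters. PIL's resize((512, 80)) takes (width, height), while tobytes() writes the pixels row by row, so the flat buffer is laid out as height x width x channels, that is 80 x 512 x 3. The plain-Python sketch below illustrates that row-major layout and the pixel scaling on a made-up 2 x 3 image. It needs neither TensorFlow nor PIL, and the helper names flat_index, pixel, and normalize are hypothetical.

```python
# Plain-Python illustration of a row-major (height, width, channels) buffer.
# The tiny 2 x 3 "image" is made up for the example.

HEIGHT, WIDTH, CHANNELS = 2, 3, 3


def flat_index(row, col, channel, width=WIDTH, channels=CHANNELS):
    """Offset of one channel value inside a row-major H x W x C buffer."""
    return (row * width + col) * channels + channel


# Every pixel stores (row, col, 0), so positions are easy to verify.
raw = bytes(
    value
    for row in range(HEIGHT)
    for col in range(WIDTH)
    for value in (row, col, 0)
)


def pixel(buffer, row, col):
    """Read one pixel back, treating the buffer as height x width x channels."""
    start = flat_index(row, col, 0)
    return tuple(buffer[start:start + CHANNELS])


def normalize(byte_value):
    """Same scaling as the reader function: map 0..255 to about -0.5..0.5."""
    return byte_value * (1.0 / 255) - 0.5
```

Reading the same buffer as width x height x channels would still yield the right number of values, but the pixels would land in the wrong positions, which is why an incorrect reshape produces a scrambled image rather than an error.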

4. Saving the data in a TFRecord file as images

import tensorflow as tf
from PIL import Image

cwd = '/home/ruyiwei/Documents/Iris&Contact/'  # output directory for the saved images

filename_queue = tf.train.string_input_producer(["iris_contact.tfrecords"])
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)  # returns (key, serialized record)
features = tf.parse_single_example(serialized_example,
                  features={
                    'label': tf.FixedLenFeature([], tf.int64),
                    'img_raw': tf.FixedLenFeature([], tf.string),
                  })
image = tf.decode_raw(features['img_raw'], tf.uint8)
image = tf.reshape(image, [80, 512, 3])  # [height, width, channels]
label = tf.cast(features['label'], tf.int32)
with tf.Session() as sess:
  init_op = tf.global_variables_initializer()  # replaces the deprecated tf.initialize_all_variables()
  sess.run(init_op)
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(coord=coord)
  for i in range(20):
    example, l = sess.run([image, label])  # fetch one image and its label
    img = Image.fromarray(example, 'RGB')
    img.save(cwd + str(i) + '_Label_' + str(l) + '.jpg')  # save the image
    print(example, l)
  coord.request_stop()
  coord.join(threads)

That is all for this article. I hope it helps with your studies, and I hope you will continue to support 三水点靠木.


- Author -

ruyiweicas

