Deep Learning for Semantic Segmentation: SegNet

2022-08-20

This core trainable segmentation engine consists of an encoder network, a corresponding decoder network followed by a pixel-wise classification layer.
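To make the final "pixel-wise classification layer" concrete, here is a minimal pure-Python sketch of a per-pixel softmax followed by an argmax. The function names `pixelwise_softmax` and `predict_labels` and the `scores[class][row][col]` layout are assumptions made for illustration; they do not come from the paper or from any particular library.

```python
import math

def pixelwise_softmax(scores):
    """scores[c][y][x] holds the score of class c at pixel (y, x).
    Returns probabilities of the same shape, normalised over the class axis."""
    num_classes = len(scores)
    height, width = len(scores[0]), len(scores[0][0])
    probs = [[[0.0] * width for _ in range(height)] for _ in range(num_classes)]
    for y in range(height):
        for x in range(width):
            logits = [scores[c][y][x] for c in range(num_classes)]
            peak = max(logits)  # subtract the max for numerical stability
            exps = [math.exp(v - peak) for v in logits]
            total = sum(exps)
            for c in range(num_classes):
                probs[c][y][x] = exps[c] / total
    return probs

def predict_labels(probs):
    """Pick the most probable class independently at every pixel."""
    num_classes = len(probs)
    height, width = len(probs[0]), len(probs[0][0])
    return [
        [max(range(num_classes), key=lambda c: probs[c][y][x]) for x in range(width)]
        for y in range(height)
    ]

# Toy input: 2 classes, a 1x2 image.
toy_scores = [[[2.0, 0.0]], [[0.0, 2.0]]]
toy_probs = pixelwise_softmax(toy_scores)
toy_labels = predict_labels(toy_probs)
```

Each pixel is classified independently of its neighbours; any spatial context has to come from the encoder-decoder features that produce the scores.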

Model

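Notes:
- The base model is VGG16.
- The fc layers are removed, which makes the encoder network smaller and easier to train: 134M -> 14.7M parameters.
- The decoder network maps the low-resolution feature maps of the encoder network back to the full image size.
- The decoder network is almost exactly symmetric to the encoder network.
- Finally, a multi-class soft-max classification is applied to every pixel.
- The decoder network upsamples using pool indices:
  - No extra parameters: the upsampled maps are sparse, and the upsampling step does not need to be trained.
  - Lower memory use: the decoder does not need to store the encoder's output feature maps.
  - Better boundary delineation.
  - The structure can be extended to any encoder-decoder architecture.

Results: (figure)

Points to note:
- Max-pooling improves translation invariance: Max-pooling is used to achieve translation invariance over small spatial shifts in the input image.
- Sub-sampling makes each point in a feature map correspond to a large region of the original image: Sub-sampling results in a large input image context (spatial window) for each pixel in the feature map.
- Repeated max-pooling and sub-sampling improve the model's classification ability but lose the spatial boundary information in the image: While several layers of max-pooling and sub-sampling can achieve more translation invariance for robust classification correspondingly there is loss of spatial resolution of the feature maps.
- Spatial position information in the feature maps is very important for segmentation: The increasingly lossy (boundary detail) image representation is not beneficial for segmentation where boundary delineation is vital.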
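The pool-indices upsampling described above can be sketched in a few lines of pure Python. This is only an illustration on a single 2D feature map with 2x2, stride-2 pooling; the function names `max_pool_2x2_with_indices` and `max_unpool_2x2` and the 0..3 row-major index encoding are assumptions of this sketch, not the paper's implementation.

```python
def max_pool_2x2_with_indices(fmap):
    """2x2, stride-2 max pooling on a 2D list.
    Returns (pooled, indices); indices[i][j] in {0, 1, 2, 3} is the
    row-major position of the maximum inside its 2x2 window."""
    h, w = len(fmap), len(fmap[0])
    pooled, indices = [], []
    for i in range(0, h - 1, 2):
        pooled_row, index_row = [], []
        for j in range(0, w - 1, 2):
            window = [fmap[i][j], fmap[i][j + 1], fmap[i + 1][j], fmap[i + 1][j + 1]]
            k = max(range(4), key=window.__getitem__)
            pooled_row.append(window[k])
            index_row.append(k)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices

def max_unpool_2x2(pooled, indices):
    """Place each pooled value back at its remembered position; all other
    positions stay zero, so the result is sparse and nothing is learned."""
    h, w = len(pooled), len(pooled[0])
    out = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            k = indices[i][j]
            out[2 * i + k // 2][2 * j + k % 2] = pooled[i][j]
    return out

# Toy 4x4 feature map.
fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [2, 6, 3, 4],
]
pooled, indices = max_pool_2x2_with_indices(fmap)
restored = max_unpool_2x2(pooled, indices)
```

The unpooled map is sparse. In SegNet the decoder's trainable convolutions that follow the upsampling are what densify it.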

Results Analysis

Experiment 1:

Comparison of different decoder variants

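Notes:
- Upsampling by bilinear interpolation with fixed parameters [not learned].
- Upsampling with max-pooling indices [not learned].
- Upsampling with learned parameters, initialized by bilinear interpolation.

SegNet-Basic
- Uses a decoder arrangement similar to FCN.
- 4 encoders and 4 decoders.
- Upsampling uses the indices recorded during downsampling.
- In both encoder and decoder, every conv is followed by a BN operation. In the decoder network, the convs use neither the ReLU non-linearity nor biases.
- With 7x7 kernels, the receptive field at VGG layer 4 is 106x106.
- The number of filters in each decoder conv equals the number of filters in the corresponding encoder conv.

SegNet-SingleChannelDecoder
- The decoder filters are single-channel.

FCN-Basic-NoDimReduction
- The final dimensionality matches that of the corresponding encoder.

Conclusion: FCN-Basic-NoDimReduction still gives the best results.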
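The 106x106 receptive field can be checked with the standard receptive-field recurrence. The sketch below assumes each of the 4 encoders consists of one 7x7 convolution (stride 1) followed by one 2x2 max-pooling (stride 2), and that the figure is measured after the fourth pooling; the helper name `receptive_field` is made up for this example.

```python
def receptive_field(layers):
    """layers is a list of (kernel_size, stride) pairs, input side first.
    Returns the side length, in input pixels, seen by one output unit."""
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

# Assumed SegNet-Basic encoder: 4 x (7x7 conv, then 2x2 stride-2 max-pool).
segnet_basic_encoder = [(7, 1), (2, 2)] * 4
rf_layer4 = receptive_field(segnet_basic_encoder)
print(rf_layer4)  # 106
```

Under these assumptions the recurrence gives 7, 8, 20, 22, 46, 50, 98, 106 after each successive layer, which matches the 106x106 figure quoted above.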

Summary

The paper proposes an encoder-decoder segmentation method. Compared with FCN, it upsamples using max-pooling indices, which effectively reduces memory use during upsampling.
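A back-of-the-envelope calculation shows where the memory saving comes from. The sketch compares storing a full float32 encoder feature map with storing only the pooling indices, using the 2 bits per 2x2 window mentioned in the SegNet paper. The feature map size (64 channels at 360x480) and the helper names are illustrative assumptions, not measurements.

```python
def feature_map_bytes(channels, height, width, bytes_per_value=4):
    """Memory needed to keep a full feature map (float32 by default)."""
    return channels * height * width * bytes_per_value

def pool_index_bytes(channels, height, width, bits_per_window=2):
    """Memory needed to keep one index per 2x2 pooling window."""
    windows = channels * (height // 2) * (width // 2)
    total_bits = windows * bits_per_window
    return (total_bits + 7) // 8  # round up to whole bytes

# Hypothetical encoder output: 64 channels at 360x480.
full = feature_map_bytes(64, 360, 480)
idx = pool_index_bytes(64, 360, 480)
ratio = full // idx
print(full, idx, ratio)
```

For this layer the indices take 64 times less memory than the float32 feature map. In practice the saving depends on how a framework actually stores the indices, which is often as full-width integers rather than 2-bit codes.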

Source: IT技术分享网
Link: http://www.5ityx.cn/cate112/118012.html
