Convolutional Highway Neural Network
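Besides the fully connected version, Highway networks also have a convolutional version.
Most implementations that can be found online are of the fully connected version, but the convolutional version is also very simple.
The code is as follows: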
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvHighWay(nn.Module):
    """
    One layer of nonlinear transformation y = f(x), defined as
        y = T(x, Wt) * x + (1 - T(x, Wt)) * H(x, Wh)
    Unlike the ordinary highway layer, convolutional layers replace the fully connected layers here.
    Accordingly, the input x should have shape (B, C, H, W).
    Reference: https://arxiv.org/abs/1505.00387
    """

    def __init__(self, in_channel, n_layers=1, activation_fn=F.relu):
        super(ConvHighWay, self).__init__()
        self.activation_fn = activation_fn
        self.n_layers = n_layers
        # kernel_size and padding must be chosen carefully; otherwise the
        # convolution output no longer has the same shape as the input
        self.Wh = nn.ModuleList([nn.Conv2d(in_channels=in_channel, out_channels=in_channel,
                                           kernel_size=3, padding=1)
                                 for _ in range(n_layers)])
        self.Wt = nn.ModuleList([nn.Conv2d(in_channels=in_channel, out_channels=in_channel,
                                           kernel_size=3, padding=1)
                                 for _ in range(n_layers)])
        # To bias the network toward y = x, set bt to a positive value so that
        # sigmoid(Wt * x + bt) is pushed toward 1
        for layer in self.Wt:
            layer.bias.data.fill_(1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer_i in range(self.n_layers):
            # H(x, Wh)
            nonlinear_part = self.activation_fn(self.Wh[layer_i](x))
            # T(x, Wt)
            gate = torch.sigmoid(self.Wt[layer_i](x))
            # T(x, Wt) * x + (1 - T(x, Wt)) * H(x, Wh)
            x = gate * x + (1 - gate) * nonlinear_part
        return x


if __name__ == "__main__":
    channel = 3
    highway = ConvHighWay(channel, n_layers=2)
    x = torch.rand((2, channel, 10, 10))
    print(x.size())
    y = highway(x)
    print(y.size())
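The comment about kernel_size and padding can be made concrete with a little arithmetic. The sketch below is not part of the original code; the helper names conv_output_size and same_padding are hypothetical and introduced only for illustration. It uses the output-size formula documented for torch.nn.Conv2d (applied to one spatial axis) to check which padding keeps the input and output sizes equal at stride 1, which the gate arithmetic gate * x + (1 - gate) * H(x) needs in order to combine tensors of the same shape.

```python
def conv_output_size(size, kernel_size, padding=0, stride=1, dilation=1):
    """Output length along one spatial axis, per the formula documented for torch.nn.Conv2d."""
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def same_padding(kernel_size, dilation=1):
    """Symmetric padding that preserves the size at stride 1 (hypothetical helper)."""
    span = dilation * (kernel_size - 1)
    if span % 2 != 0:
        # An odd span cannot be split evenly between the two sides.
        raise ValueError("symmetric padding cannot preserve the size for this kernel")
    return span // 2


if __name__ == "__main__":
    for k in (1, 3, 5, 7):
        p = same_padding(k)
        # Each line shows kernel size, padding, and the output size for an input of 10.
        print(k, p, conv_output_size(10, k, padding=p))
```

With kernel_size=3 this gives padding=1, the pair used in ConvHighWay above. A different kernel size would need its padding adjusted the same way.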