卷积自动编码器

The autoencoders we have seen so far will not work well on images unless the images are very small; convolutional neural networks are much better suited to images than dense networks. So if you want to build an autoencoder for images (e.g., for unsupervised pretraining or dimensionality reduction), you need to build a convolutional autoencoder. The encoder is a regular CNN composed of convolutional and pooling layers. It typically reduces the spatial dimensions of the input (i.e., height and width) while increasing the depth (i.e., the number of feature maps). The decoder must do the reverse (upscale the image and reduce its depth back to the original dimensions), and for this you can use transposed convolution layers (or combine upsampling layers with convolution layers).
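As a quick sanity check on this shape bookkeeping, the following sketch (plain Python with illustrative helper names, not part of the model) applies the standard output-size formulas for pooling and transposed convolution to show how the spatial size shrinks and then recovers:

```python
# Spatial-size arithmetic for 'same'-padded convs (stride 1, size unchanged)
# followed by 2x2 max pooling, then transposed convs that invert it.

def pool(size, pool_size=2):
    # Non-overlapping max pooling: floor division
    return size // pool_size

def conv_transpose(size, kernel_size=3, strides=2, padding='same'):
    # Standard transposed-convolution output-size formulas
    if padding == 'same':
        return size * strides
    return (size - 1) * strides + kernel_size  # 'valid'

size = 28
for _ in range(3):       # three conv ('same') + pool stages
    size = pool(size)    # 28 -> 14 -> 7 -> 3
print(size)              # 3

size = conv_transpose(size, padding='valid')  # (3 - 1) * 2 + 3 = 7
size = conv_transpose(size)                   # 7 -> 14
size = conv_transpose(size)                   # 14 -> 28
print(size)              # 28
```

Note that 7 is odd, so a 'same'-padded transposed convolution (which simply doubles the size) cannot produce it from 3; that is why the first decoder layer below uses 'valid' padding.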

The following builds a simple convolutional autoencoder for Fashion MNIST:

from tensorflow import keras

# Load Fashion MNIST and scale pixel values to the [0, 1] range
fashion_mnist = keras.datasets.fashion_mnist
(X_train_all, y_train_all), (X_test, y_test) = fashion_mnist.load_data()
X_valid, X_train = X_train_all[:5000] / 255., X_train_all[5000:] / 255.
y_valid, y_train = y_train_all[:5000], y_train_all[5000:]

# Encoder: 28x28x1 -> 14x14x16 -> 7x7x32 -> 3x3x64
conv_encoder = keras.models.Sequential([
    keras.layers.Reshape([28, 28, 1], input_shape=[28, 28]),
    keras.layers.Conv2D(16, kernel_size=3, padding='same', activation='gelu'),
    keras.layers.MaxPool2D(pool_size=2),
    keras.layers.Conv2D(32, kernel_size=3, padding='same', activation='gelu'),
    keras.layers.MaxPool2D(pool_size=2),
    keras.layers.Conv2D(64, kernel_size=3, padding='same', activation='gelu'),
    keras.layers.MaxPool2D(pool_size=2)
])
# Decoder: 3x3x64 -> 7x7x32 -> 14x14x16 -> 28x28x1
# The first layer uses padding='valid' so that (3 - 1) * 2 + 3 = 7
conv_decoder = keras.models.Sequential([
    keras.layers.Conv2DTranspose(32, kernel_size=3, strides=2, padding='valid', activation='gelu'),
    keras.layers.Conv2DTranspose(16, kernel_size=3, strides=2, padding='same', activation='gelu'),
    keras.layers.Conv2DTranspose(1, kernel_size=3, strides=2, padding='same', activation='sigmoid'),
    keras.layers.Reshape([28, 28])
])
conv_ae = keras.models.Sequential([conv_encoder, conv_decoder])
# Pixels lie in [0, 1], so binary cross-entropy works as a reconstruction loss
conv_ae.compile(loss='binary_crossentropy', optimizer=keras.optimizers.Adam())
history = conv_ae.fit(X_train, X_train, epochs=10,
                      validation_data=(X_valid, X_valid), batch_size=32)
Epoch 1/10
1719/1719 [==============================] - 12s 7ms/step - loss: 0.3013 - val_loss: 0.2745
Epoch 2/10
1719/1719 [==============================] - 11s 7ms/step - loss: 0.2734 - val_loss: 0.2672
Epoch 3/10
1719/1719 [==============================] - 11s 6ms/step - loss: 0.2684 - val_loss: 0.2637
Epoch 4/10
1719/1719 [==============================] - 11s 6ms/step - loss: 0.2655 - val_loss: 0.2614
Epoch 5/10
1719/1719 [==============================] - 11s 6ms/step - loss: 0.2636 - val_loss: 0.2597
Epoch 6/10
1719/1719 [==============================] - 11s 6ms/step - loss: 0.2623 - val_loss: 0.2588
Epoch 7/10
1719/1719 [==============================] - 11s 7ms/step - loss: 0.2613 - val_loss: 0.2577
Epoch 8/10
1719/1719 [==============================] - 12s 7ms/step - loss: 0.2605 - val_loss: 0.2572
Epoch 9/10
1719/1719 [==============================] - 12s 7ms/step - loss: 0.2599 - val_loss: 0.2567
Epoch 10/10
1719/1719 [==============================] - 11s 7ms/step - loss: 0.2593 - val_loss: 0.2563

Visualizing the reconstructions

import matplotlib.pyplot as plt


def plot_image(image):
    plt.imshow(image, cmap='binary')
    plt.axis('off')


def show_reconstructions(model, n_images=5):
    # Top row: original validation images; bottom row: their reconstructions
    reconstructions = model.predict(X_valid[:n_images])
    plt.figure(figsize=(n_images * 1.5, 3))
    for image_index in range(n_images):
        plt.subplot(2, n_images, 1 + image_index)
        plot_image(X_valid[image_index])
        plt.subplot(2, n_images, 1 + n_images + image_index)
        plot_image(reconstructions[image_index])


show_reconstructions(conv_ae)
plt.show()
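As mentioned at the start of this section, the encoder alone can also serve as a dimensionality-reduction front end: each image is compressed to a 3×3×64 coding (576 values instead of 784 pixels). A minimal sketch of this use, with random arrays standing in for Fashion MNIST so it runs standalone (the encoder here is freshly built and untrained, just to show the shapes):

```python
import numpy as np
from tensorflow import keras

# Same encoder architecture as in the autoencoder above
conv_encoder = keras.models.Sequential([
    keras.layers.Reshape([28, 28, 1], input_shape=[28, 28]),
    keras.layers.Conv2D(16, kernel_size=3, padding='same', activation='gelu'),
    keras.layers.MaxPool2D(pool_size=2),
    keras.layers.Conv2D(32, kernel_size=3, padding='same', activation='gelu'),
    keras.layers.MaxPool2D(pool_size=2),
    keras.layers.Conv2D(64, kernel_size=3, padding='same', activation='gelu'),
    keras.layers.MaxPool2D(pool_size=2)
])

X = np.random.rand(8, 28, 28).astype('float32')  # stand-in for images
codings = conv_encoder.predict(X)                # shape (8, 3, 3, 64)
features = codings.reshape(len(X), -1)           # flatten to (8, 576)
```

After training the full autoencoder, these flattened codings can be fed to any downstream estimator (a classifier, a clustering algorithm, t-SNE, etc.).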
