수달이네 기술 블로그 (Sudal's Tech Blog)

2. AutoEncoder

슬픈 수달이 2026. 3. 10. 17:05

Autoencoder

An unsupervised learning model, built on an artificial neural network, whose goal is to compress input data efficiently and then reconstruct it.

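Encoder

The encoder network maps high-dimensional data to a latent vector in a lower-dimensional latent space.

If the input image is 28x28 pixels, the encoder receives 784 numbers as input.
It then extracts features from those numbers and compresses them into a 32-dimensional vector.
(The space is called the latent space; the representation is called the latent vector.)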

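Decoder

The decoder uses the summary produced by the encoder to reconstruct data as close to the original as possible.

It takes the 32-dimensional vector from the encoder and expands it back into 784 numbers (28x28).
The output image is an approximation of the input rather than an exact copy, because it is produced from the compressed vector alone.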
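To make the 784 -> 32 -> 784 flow concrete, here is a minimal NumPy sketch of a single-layer encoder and decoder. It only illustrates the shapes: the weights are random and untrained, the images are random stand-ins, and the names `encode`, `decode`, `W_enc`, and `W_dec` are made up for this example rather than taken from any library.

```python
import numpy as np

rng = np.random.default_rng(0)

INPUT_DIM = 28 * 28   # 784 pixel values per image
LATENT_DIM = 32       # size of the latent vector

# Randomly initialized (untrained) weights, for shape illustration only.
W_enc = rng.normal(scale=0.01, size=(INPUT_DIM, LATENT_DIM))
b_enc = np.zeros(LATENT_DIM)
W_dec = rng.normal(scale=0.01, size=(LATENT_DIM, INPUT_DIM))
b_dec = np.zeros(INPUT_DIM)

def encode(x):
    """Map flattened images (N, 784) to latent vectors (N, 32) with ReLU."""
    return np.maximum(0.0, x @ W_enc + b_enc)

def decode(z):
    """Map latent vectors (N, 32) back to pixel values (N, 784) with sigmoid."""
    return 1.0 / (1.0 + np.exp(-(z @ W_dec + b_dec)))

images = rng.random((4, 28, 28))        # stand-in for 4 grayscale images
x = images.reshape(len(images), -1)     # (4, 784)
z = encode(x)                           # (4, 32)
x_hat = decode(z).reshape(-1, 28, 28)   # (4, 28, 28)

print(x.shape, z.shape, x_hat.shape)
print("compression ratio:", INPUT_DIM / LATENT_DIM)
```

Training would adjust the weights so that `x_hat` gets close to `images`; the sketch stops at the forward pass.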

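Applying an autoencoder to the MNIST dataset (compression and reconstruction)

As described above, we will build a model that takes MNIST images as input, compresses them, and has the decoder reconstruct them.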

from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, UpSampling2D
import matplotlib.pyplot as plt
import numpy as np

Downloading the dataset

# 1) Load and preprocess the data
(X_train, _), (X_test, _) = mnist.load_data()
X_train = X_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
X_test  = X_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0

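After downloading the MNIST dataset, reshape each image to 28x28x1 and scale the pixel values to the range [0, 1].

Building the autoencoder

# 2) Build the autoencoder (encoder: 2 downsampling steps ↔ decoder: 2 upsampling steps, symmetric)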
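A quick way to check the preprocessing step without downloading anything is to run the same reshape and scaling on a synthetic array. This is only a sanity-check sketch: `raw` is random data standing in for the MNIST arrays, and `preprocess` is a helper name introduced here.

```python
import numpy as np

# Synthetic stand-in for raw MNIST: uint8 images with values in [0, 255].
rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(10, 28, 28), dtype=np.uint8)

def preprocess(images):
    """Add a channel axis and scale uint8 pixels to float32 in [0, 1]."""
    return images.reshape(-1, 28, 28, 1).astype("float32") / 255.0

X = preprocess(raw)
print(X.shape, X.dtype, X.min(), X.max())
```

The trailing channel axis is what `Conv2D` expects for grayscale input, and the [0, 1] range matches the sigmoid output of the final layer.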
autoencoder = Sequential()

# Encoder: 28x28x1 -> 14x14x16 -> 7x7x8
autoencoder.add(Conv2D(16, kernel_size=3, padding='same', activation='relu', input_shape=(28, 28, 1)))
autoencoder.add(MaxPooling2D(pool_size=2, padding='same'))
autoencoder.add(Conv2D(8, kernel_size=3, padding='same', activation='relu'))
autoencoder.add(MaxPooling2D(pool_size=2, padding='same'))

# Decoder: 7x7x8 -> 14x14x8 -> 28x28x16 -> 28x28x1
autoencoder.add(Conv2D(8, kernel_size=3, padding='same', activation='relu'))
autoencoder.add(UpSampling2D())
autoencoder.add(Conv2D(16, kernel_size=3, padding='same', activation='relu'))  # padding='same' added
autoencoder.add(UpSampling2D())
autoencoder.add(Conv2D(1, kernel_size=3, padding='same', activation='sigmoid'))

# 3) Inspect the architecture
autoencoder.summary()

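This builds the autoencoder's model architecture. Note that the bottleneck in this convolutional version is a 7x7x8 feature map (392 values), not the 32-dimensional vector used in the earlier explanation.
Encoder
- Conv2D extracts features, and pooling compresses them.
- The Conv operations here use padding='same', so they do not change the spatial size; only the pooling layers shrink it.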
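The size bookkeeping for the encoder can be traced with a few lines of plain Python. The helpers `conv_out` and `pool_out` are illustrative names written for this sketch; they apply the usual output-size formulas for 'same' and 'valid' padding along one spatial axis.

```python
import math

def conv_out(size, kernel=3, stride=1, padding="same"):
    """Spatial output size of a convolution along one axis."""
    if padding == "same":
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1  # "valid": no padding

def pool_out(size, pool=2, padding="same"):
    """Spatial output size of max pooling with stride equal to the pool size."""
    if padding == "same":
        return math.ceil(size / pool)
    return (size - pool) // pool + 1

size = 28
trace = [size]
for _ in range(2):          # two Conv2D + MaxPooling2D stages
    size = conv_out(size)   # padding='same' leaves the size unchanged
    size = pool_out(size)   # pooling halves it
    trace.append(size)

print(trace)                          # [28, 14, 7]
print(conv_out(28, padding="valid"))  # 26
```

With padding='valid' a single 3x3 convolution would already shrink 28 to 26, which would make it harder to land back on exactly 28x28 in the decoder.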

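At first the difference between a CNN and an autoencoder did not click for me, so I looked into it. Both use convolution and pooling layers, but they are trained toward different goals.
- CNN (classifier): passes the extracted features through a fully connected layer to predict a label, and trains with CrossEntropyLoss to maximize classification accuracy.
- AE: trains so that the decoder can reconstruct the input from the feature map, minimizing the difference between input and output with a loss such as MSE or BCE.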
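The difference in training targets can be shown side by side. In the sketch below every array is a random stand-in (no real model is involved), and `label = 3` is an arbitrary example class: the autoencoder losses compare the output against the input itself, while the classifier loss compares predicted class probabilities against a label.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in input image and a slightly perturbed "reconstruction" of it.
x = rng.random((28, 28))
x_hat = np.clip(x + rng.normal(scale=0.05, size=x.shape), 1e-7, 1 - 1e-7)

# Autoencoder: the target is the input itself.
mse = np.mean((x - x_hat) ** 2)
bce = -np.mean(x * np.log(x_hat) + (1 - x) * np.log(1 - x_hat))

# Classifier CNN: the target is a class label.
logits = rng.normal(size=10)           # stand-in scores for digits 0-9
probs = np.exp(logits - logits.max())  # numerically stable softmax
probs /= probs.sum()
label = 3                              # hypothetical true class
cross_entropy = -np.log(probs[label])

print(f"AE  MSE={mse:.4f}  BCE={bce:.4f}")
print(f"CNN cross-entropy={cross_entropy:.4f}")
```

No labels appear anywhere in the autoencoder losses, which is why the model counts as unsupervised.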

Decoder
- UpSampling2D increases the spatial resolution, and Conv2D fills in the fine details.
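With its default settings (size 2, nearest-neighbor interpolation), `UpSampling2D` simply repeats each row and column. The NumPy function below, `upsample_nearest`, is a hand-written equivalent for a single (H, W, C) feature map, included only to show what the layer does to the values.

```python
import numpy as np

def upsample_nearest(feature_map, factor=2):
    """Repeat every row and column `factor` times: (H, W, C) -> (H*f, W*f, C)."""
    out = np.repeat(feature_map, factor, axis=0)
    return np.repeat(out, factor, axis=1)

fm = np.arange(4, dtype="float32").reshape(2, 2, 1)  # tiny 2x2 feature map
up = upsample_nearest(fm)

print(up[..., 0])
# [[0. 0. 1. 1.]
#  [0. 0. 1. 1.]
#  [2. 2. 3. 3.]
#  [2. 2. 3. 3.]]
print(upsample_nearest(np.zeros((7, 7, 8))).shape)   # (14, 14, 8)
```

The upsampled map is blocky because no new information is added; the Conv2D layer that follows is what learns to smooth and sharpen it.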

Training

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
history = autoencoder.fit(
    X_train, X_train,
    epochs=50, batch_size=128,
    validation_data=(X_test, X_test),
    verbose=1
)

# Epoch 49/50
# 469/469 ━━━━━━━━━━━━━━━━━━━━ 10s 20ms/step - loss: 0.0693 - val_loss: 0.0688
# Epoch 50/50
# 469/469 ━━━━━━━━━━━━━━━━━━━━ 9s 19ms/step - loss: 0.0693 - val_loss: 0.0687

Optimizer: Adam, loss function: BCE (binary cross-entropy)
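One thing to keep in mind when reading a BCE value on image data: the targets are real-valued pixels in [0, 1], not strict 0/1 labels, so BCE does not reach zero even for a perfect reconstruction. This may be part of why the loss in the log above levels off at a small positive value instead of approaching 0, although the post does not measure that. The helper `pixel_bce` below is a name introduced for this sketch.

```python
import math

def pixel_bce(target, pred, eps=1e-7):
    """Binary cross-entropy for one pixel, with clipping to avoid log(0)."""
    pred = min(max(pred, eps), 1 - eps)
    return -(target * math.log(pred) + (1 - target) * math.log(1 - pred))

# Loss of a *perfect* reconstruction (pred == target) for several pixel values.
for t in (0.0, 0.5, 0.9, 1.0):
    print(t, round(pixel_bce(t, t), 4))
```

For pure black or white pixels the perfect-reconstruction loss is essentially 0, while for a mid-gray pixel of 0.5 it is ln 2 (about 0.693), so the achievable minimum depends on how many gray pixels the images contain.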

Results

random_idx = np.random.randint(X_test.shape[0], size=5)
recons = autoencoder.predict(X_test)

plt.figure(figsize=(10, 4))  # 2x5
for i, idx in enumerate(random_idx):
    # Original
    ax = plt.subplot(2, 5, i + 1)
    plt.imshow(X_test[idx].squeeze(), cmap='gray')
    ax.axis('off')
    if i == 0:
        ax.set_title("Original")

    # Reconstruction
    ax = plt.subplot(2, 5, 5 + i + 1)
    plt.imshow(recons[idx].squeeze(), cmap='gray')
    ax.axis('off')
    if i == 0:
        ax.set_title("Reconstructed")
plt.tight_layout()
plt.show()

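Picking five random images from the test data and reconstructing them shows that the model does quite well.

An autoencoder is not a generative model, so decoding a point in the latent space that does not correspond to any encoded latent vector will not produce a meaningful result.
However, it can be used to restore a corrupted image by reconstructing it through the decoder.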
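To try the restoration idea, the first step is to produce corrupted inputs. The sketch below adds Gaussian noise to images in [0, 1]; `add_gaussian_noise` and the value `noise_factor=0.3` are assumptions made for illustration, and the white square is a stand-in for a digit. The model in this post was trained on clean images only, so how well it removes this noise is not something the post shows.

```python
import numpy as np

def add_gaussian_noise(images, noise_factor=0.3, seed=0):
    """Return a noisy copy of images in [0, 1], clipped back to [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = images + noise_factor * rng.normal(size=images.shape)
    return np.clip(noisy, 0.0, 1.0).astype("float32")

clean = np.zeros((5, 28, 28, 1), dtype="float32")
clean[:, 10:18, 10:18, :] = 1.0   # stand-in "digit": a white square
noisy = add_gaussian_noise(clean)

print(noisy.shape, noisy.min(), noisy.max())
# With the trained model from this post, the restoration step would be:
# restored = autoencoder.predict(noisy)
```

Comparing `restored` with `clean` using the same plotting code as above would show how much of the corruption the reconstruction removes.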
