How to change architecture of conditional GAN to generate 224x224x3 images?

Question

Alok on 4 August 2022


Answered: Ayush Aniket on 9 May 2025

I am following the MATLAB example on conditional GANs at https://www.mathworks.com/help/deeplearning/ug/train-conditional-generative-adversarial-network.html

This example is for an image size of 64x64x3. What changes should be made to layersGenerator and layersDiscriminator to generate 224x224x3 images?

This is my code:

inputSize = [224 224 3] or [256 256 3];

Note: if Factor = 2 (below), I get an image size of 128x128x3. If Factor = 4, the generated image size is 256x256x3. However, during the training loop, it throws an error saying that trainedVariance is negative.

inputSize = [64 64 3];
Factor = 4; % Factor = 2 generates 128x128x3 images; Factor = 4 generates 256x256x3
inputSize = Factor*inputSize(1:2); % inputSize is now [height width], without the channel dimension
numClasses = 2;
augimds = augmentedImageDatastore(inputSize(1:2),XTrain,YTrain);
augimdsValidation = augmentedImageDatastore(inputSize(1:2),XValidation,YValidation);
numLatentInputs = 100;
embeddingDimension = 50;
numFilters = Factor*64; % 256 filters when Factor = 4
filterSize = 5;
projectionSize = Factor*[4 4 1024]; % [16 16 4096] when Factor = 4 (the channel count is scaled too)
layersGenerator = [
    featureInputLayer(numLatentInputs)
    fullyConnectedLayer(prod(projectionSize))
    functionLayer(@(X) feature2image(X,projectionSize),Formattable=true)
    concatenationLayer(3,2,Name="cat");
    transposedConv2dLayer(filterSize,4*numFilters,Stride=2,Cropping="same")
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,2*numFilters,Stride=2,Cropping="same")
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,numFilters,Stride=2,Cropping="same")
    batchNormalizationLayer
    reluLayer
    transposedConv2dLayer(filterSize,3,Stride=2,Cropping="same")
    tanhLayer];
lgraphGenerator = layerGraph(layersGenerator);
layers = [
    featureInputLayer(1)
    embeddingLayer(embeddingDimension,numClasses)
    fullyConnectedLayer(prod(projectionSize(1:2)))
    functionLayer(@(X) feature2image(X,[projectionSize(1:2) 1]),Formattable=true,Name="emb_reshape")];
lgraphGenerator = addLayers(lgraphGenerator,layers);
lgraphGenerator = connectLayers(lgraphGenerator,"emb_reshape","cat/in2");
netG = dlnetwork(lgraphGenerator);
dropoutProb = 0.75;
%numFilters = 64;
scale = 0.2;
filterSize = 5;
layersDiscriminator = [
    imageInputLayer(inputSize,Normalization="none")
    dropoutLayer(dropoutProb)
    concatenationLayer(3,2,Name="cat")
    convolution2dLayer(filterSize,numFilters,Stride=2,Padding="same")
    leakyReluLayer(scale)
    convolution2dLayer(filterSize,2*numFilters,Stride=2,Padding="same")
    batchNormalizationLayer
    leakyReluLayer(scale)
    convolution2dLayer(filterSize,4*numFilters,Stride=2,Padding="same")
    batchNormalizationLayer
    leakyReluLayer(scale)
    convolution2dLayer(filterSize,8*numFilters,Stride=2,Padding="same")
    batchNormalizationLayer
    leakyReluLayer(scale)
    convolution2dLayer(Factor*4,1)];
lgraphDiscriminator = layerGraph(layersDiscriminator);
layers = [
    featureInputLayer(1)
    embeddingLayer(embeddingDimension,numClasses)
    fullyConnectedLayer(prod(inputSize(1:2)))
    functionLayer(@(X) feature2image(X,[inputSize(1:2) 1]),Formattable=true,Name="emb_reshape")];
lgraphDiscriminator = addLayers(lgraphDiscriminator,layers);
lgraphDiscriminator = connectLayers(lgraphDiscriminator,"emb_reshape","cat/in2");
netD = dlnetwork(lgraphDiscriminator);

However, the above code gives an error at

[~,~,gradientsG,gradientsD,stateG,scoreG,scoreD] = ...
            dlfeval(@modelLoss2,netG,netD,X,T,Z,flipFactor);

The size of the generated image at

[XGenerated,stageG] = forward(netG,Z,T);

is 256x256x3. However, an error is raised stating that trainedVariance is not positive.

Could you advise which transposedConv2dLayer to change so that the output size is 224x224x3 or 256x256x3?

Thanks for your help


Answer 1

Ayush Aniket on 9 May 2025


If you use a projection size that doesn't align with the upsampling path, the generator output won't match the expected image size, which can cause downstream errors (such as negative variance or shape mismatches).

Each transposedConv2dLayer with Stride=2 and Cropping="same" doubles the spatial resolution. The number of upsampling layers and the initial projection size must align so that, after all upsampling, you reach your desired output size. The general rule: if your initial projection size is [h, w, c] and you have n upsampling layers (each with Stride=2), your output size will be [h*2^n, w*2^n, outputChannels]. Therefore, for:

1. 256x256x3 Output -

Starting from a [4, 4, ...] projection, 4 upsampling layers only reach 64x64: 4 → 8 → 16 → 32 → 64 (4*2^4 = 64).
Reaching 256 from 4 would take 6 upsampling layers: 4 → 8 → 16 → 32 → 64 → 128 → 256 (4*2^6 = 256).
Your code uses 4 upsampling layers, so the projection size should be [16, 16, ...] for a 256x256 output: 16 → 32 → 64 → 128 → 256 (16*2^4 = 256).

2. 224x224x3 Output -

224 is not a power of 2, so you need to start with a projection size that, after upsampling, results in 224.
224 = 14 * 2^4
So, start with [14,14, ...] and 4 upsampling layers.
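
The arithmetic above can be checked with a few lines of plain MATLAB before building any layers. This is only a sketch: targetSize, numUpsampling, numDownsampling, projHW, sizesG and finalFilterSize are illustrative names that do not appear in the example, and the 1024 channel count is simply carried over from the question's projectionSize. The discriminator part assumes four Stride=2 convolutions with Padding="same", as in the question's code, each of which halves the spatial size.

```matlab
% Sketch only: plain arithmetic, no Deep Learning Toolbox calls.
% All variable names except projectionSize are illustrative assumptions.
targetSize    = [224 224];  % desired generator output height and width
numUpsampling = 4;          % transposedConv2dLayer blocks with Stride=2

% Each Stride=2 layer with Cropping="same" doubles height and width,
% so the projection must be targetSize / 2^numUpsampling.
projHW = targetSize ./ 2^numUpsampling;
assert(all(projHW == floor(projHW)), ...
    'Target size must be divisible by 2^numUpsampling.');

% Channel count (1024) taken from the question's projectionSize.
projectionSize = [projHW 1024];

% Spatial size after each upsampling step, starting from the projection.
sizesG = projHW(1) * 2.^(0:numUpsampling);

% Discriminator (assumed: four Stride=2, Padding="same" convolutions).
% Each one halves the size, rounding up, so the final convolution2dLayer
% needs a filter of this size to reduce the feature map to 1x1.
numDownsampling = 4;
finalFilterSize = ceil(targetSize(1) / 2^numDownsampling);

disp(projectionSize)   % values: 14 14 1024
disp(sizesG)           % values: 14 28 56 112 224
disp(finalFilterSize)  % value: 14
```

Setting targetSize = [256 256] gives a projection of [16 16 1024] and a final filter size of 16. Note that the question's projectionSize = Factor*[4 4 1024] also multiplies the channel count (4096 at Factor = 4), which this arithmetic does not require; only the first two entries need to change.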
