
Vqgan#49

Merged
arthurdrake1 merged 23 commits into main from vqgan
Oct 27, 2025

Conversation

@arthurdrake1 (Contributor):

Adds VQGAN training and evaluation with tuned hyperparameters, using an Online Clustered Codebook to promote 100% codebook utilization.

Also includes three new functions in transforms.py that could be useful for most/all EngiOpt models:

  • resize_to provides a standardized way of resizing tensors, e.g., to 128x128 prior to model training and back again afterwards
  • normalize normalizes the conditions to zero mean and unit std
  • drop_constant removes any condition columns that remain constant throughout the data (e.g., overhang_constraint = 0 for all of beams2d)

The latter two are optional and can be disabled in the args for VQGAN.
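The three helpers themselves aren't shown in this thread; a minimal sketch of what they might look like follows (signatures and defaults are assumptions for illustration, not the PR's actual code):

```python
import torch as th
import torch.nn.functional as F


def resize_to(x: th.Tensor, size: int = 128) -> th.Tensor:
    """Resize a (N, C, H, W) tensor to (N, C, size, size) via bilinear interpolation."""
    return F.interpolate(x, size=(size, size), mode="bilinear", align_corners=False)


def normalize(conds: th.Tensor) -> tuple[th.Tensor, th.Tensor, th.Tensor]:
    """Shift/scale condition columns to zero mean and unit std; return stats for inversion."""
    mean = conds.mean(dim=0)
    std = conds.std(dim=0).clamp_min(1e-8)  # guard against constant columns
    return (conds - mean) / std, mean, std


def drop_constant(conds: th.Tensor) -> th.Tensor:
    """Remove condition columns that are constant across the whole dataset."""
    keep = conds.std(dim=0) > 0
    return conds[:, keep]
```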

@ffelten (Collaborator) left a comment:

Hey @arthurdrake1, looks very solid, I am honestly impressed by the complexity of the model :).

My comments are essentially small improvements. However, I have a more "meta" comment to make:
I believe the single-file implementation is not going to work well in this case. The file is currently too big to navigate. I have two suggestions to make the code easier to understand:

  • put all the utils, network components, gpt (things you use in the algorithm but are not the core logic of the algorithm itself) in their own file. I'd keep the networks definitions (Encoder, Decoder, etc.) in the training script, though.
  • have the different stages in different files / scripts. Currently, the script does a lot of things and instantiates a lot of pieces. I believe it might be easier to follow if we had separate scripts with well-defined inputs and outputs.

What do you think? :)


# Restores the pytorch model from wandb
if args.wandb_entity is not None:
artifact_path_0 = f"{args.wandb_entity}/{args.wandb_project}/{args.problem_id}_vqgan_cvqgan:seed_{seed}"
@ffelten (Collaborator):

I believe having more explicit names than 0, 1, 2, e.g., artifact_transformer, might help keep track of these

@arthurdrake1 (Contributor, author):

OK, I just used 0, 1, 2 as an easier shorthand while coding, but I agree spelling them out will help understanding
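A hypothetical sketch of the suggested renaming (entity, project, problem, and seed values here are made up for illustration):

```python
entity, project, problem_id, seed = "my-entity", "engiopt", "beams2d", 1  # hypothetical values

# Explicit names make it obvious which stage each artifact belongs to:
artifact_path_vqgan = f"{entity}/{project}/{problem_id}_vqgan_cvqgan:seed_{seed}"
artifact_path_transformer = f"{entity}/{project}/{problem_id}_transformer:seed_{seed}"
```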

)

# Clean up conditions based on model training settings and convert back to tensor
sampled_conditions_new = sampled_conditions.select(range(len(sampled_conditions)))
@ffelten (Collaborator):

I don't get what this does

@arthurdrake1 (Contributor, author):

This copies over the original sampled_conditions into a new dataset that can then be normalized/cleaned prior to feeding into the model. The reason we need this copy is (to the best of my knowledge) the metrics.metrics() call later requires the original conditions for its calculations.

@ffelten (Collaborator):

Fair! The name is a bit cryptic then, though.
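The copy-before-normalizing pattern being discussed (`select(range(len(ds)))` is the Hugging Face datasets idiom for a full copy) can be sketched with plain tensors standing in for the dataset; all names here are illustrative:

```python
import torch as th

original_conditions = th.tensor([[10.0, 0.5], [20.0, 1.5]])

# Copy first: the original values must survive for the metrics computation later.
conditions_for_model = original_conditions.clone()

# Normalize only the copy before feeding it to the model.
mean = conditions_for_model.mean(dim=0)
std = conditions_for_model.std(dim=0).clamp_min(1e-8)
conditions_for_model = (conditions_for_model - mean) / std
```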

entry: pyright
language: node
pass_filenames: false
pass_filenames: true
@ffelten (Collaborator):

Why is this changed?

@arthurdrake1 (Contributor, author):

It was for debugging earlier; I will change it back

pyproject.toml Outdated
pythonVersion = "3.9"
pythonPlatform = "All"
typeshedPath = "typeshed"
# typeshedPath = "typeshed" -> commented out may lead to precommit out of memory error
@ffelten (Collaborator):

Same, why this change?

@arthurdrake1 (Contributor, author):

Also for pre-commit debugging; I will undo this


def __init__(self):
super().__init__()
self.register_buffer("shift", th.tensor([-0.030, -0.088, -0.188])[None, :, None, None])
@ffelten (Collaborator):

These constants are coming from LPIPS?

@arthurdrake1 (Contributor, author):

Yes, they are just constants to normalize the input for the pretrained VGG model
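For context, the buffer in the diff matches the scaling layer from the LPIPS reference implementation, which registers a per-channel shift and scale before the pretrained VGG backbone. A self-contained sketch (the scale constants are quoted from LPIPS and worth double-checking against the actual dependency):

```python
import torch as th
import torch.nn as nn


class ScalingLayer(nn.Module):
    """Shift/scale RGB inputs into the statistics the pretrained VGG expects
    (constants as used in the LPIPS reference implementation)."""

    def __init__(self):
        super().__init__()
        self.register_buffer("shift", th.tensor([-0.030, -0.088, -0.188])[None, :, None, None])
        self.register_buffer("scale", th.tensor([0.458, 0.448, 0.450])[None, :, None, None])

    def forward(self, x: th.Tensor) -> th.Tensor:
        return (x - self.shift) / self.scale
```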



###########################################
########## GPT-2 BASE CODE BELOW ##########
@ffelten (Collaborator):

I'd put this into its own file

@arthurdrake1 (Contributor, author):

Done

th.as_tensor(training_ds["optimal_upsampled"][:]).to(device),
*[th.as_tensor(training_ds[key][:]).to(device) for key in conditions],
)
dataloader_0 = th.utils.data.DataLoader(
@ffelten (Collaborator):

Why 3 times the same thing?

@arthurdrake1 (Contributor, author):

The different model stages could have different batch sizes
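That design (one DataLoader per stage, each with its own batch size, over the same underlying dataset) can be sketched as follows; the toy data and batch sizes are illustrative:

```python
import torch as th

# Toy data standing in for the design/condition tensors.
dataset = th.utils.data.TensorDataset(th.randn(64, 1, 8, 8))

# Each training stage gets its own DataLoader so batch sizes can differ.
dataloader_vqgan = th.utils.data.DataLoader(dataset, batch_size=16, shuffle=True)
dataloader_cvqgan = th.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)
dataloader_transformer = th.utils.data.DataLoader(dataset, batch_size=8, shuffle=True)
```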

name=run_name,
dir="./logs/wandb",
)
wandb.define_metric("0_step", summary="max")
@ffelten (Collaborator):

Do we need these lines?

@arthurdrake1 (Contributor, author):

Yes, in this special multi-stage training case, to have separate training step counts for each stage.
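The per-stage step setup can be sketched with wandb's `define_metric` API; the stage prefixes follow the `0_step` naming seen in the diff, and the logging call is illustrative (this assumes `wandb.init(...)` has already run):

```python
import wandb  # assumes an active wandb run

STAGES = ("0", "1", "2")  # hypothetical prefixes for the vqgan / cvqgan / transformer stages

for stage in STAGES:
    # Each stage gets its own monotonically increasing step counter...
    wandb.define_metric(f"{stage}_step", summary="max")
    # ...and all of that stage's metrics are plotted against that counter.
    wandb.define_metric(f"{stage}_*", step_metric=f"{stage}_step")

# Logging then pairs each metric with its stage counter, e.g.:
# wandb.log({"0_recon_loss": 0.12, "0_step": 100})
```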

@arthurdrake1 (Contributor, author):

@ffelten Going to make the minor edits now. I will also move all the "helper modules" like nanogpt into their own file. As for separating out the different training phases, this would be more complicated, as the current logic is built on the single-script implementation, i.e., it easily reuses the trained vqgan and cvqgan to then train the transformer.

@ffelten (Collaborator) left a comment:

Some minor stuff. I did not read all the details but it looks clean enough to me!

cond_lr: float = 2e-4 # Default: 2e-4
"""learning rate for CVQGAN"""
latent_size: int = 16
"""size of the latent feature map (automatically determined later)"""
@ffelten (Collaborator):

Why is this determined later? Also, is this "automated determination" reflected in the wandb config (hyperparams) ?

@arthurdrake1 (Contributor, author):

The latent feature map dimension is dictated by the number of encoder and decoder layers, so in practice the user has to increase or decrease those to change latent_size. The number of image channels is more straightforward: we just update it based on the number of channels in the data (always 1 in our case, for now).

I checked the wandb config for a test run I did just now with bogus values (--latent_size 123123 and --image_channels 123123). They remained at 16 and 1 respectively in the hyperparams list on wandb.

@arthurdrake1 (Contributor, author):

@ffelten Upon further review of the code, I don't think these args are needed. They're just calculated as normal variables and override any actual user inputs, as I determined above. Should be good to go now.
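The relationship described above (latent size fixed by the architecture, not by a user arg) reduces, under the usual halving-per-downsample convention, to a one-liner; the function name and the assumption of one halving per block are illustrative:

```python
def latent_size_for(image_size: int, num_downsample_layers: int) -> int:
    """Each downsampling block halves the spatial resolution, so the latent
    feature map size follows directly from the encoder depth (assumed scheme)."""
    return image_size // (2 ** num_downsample_layers)
```

For example, a 128x128 input with three downsampling blocks yields a 16x16 latent map, matching the default of 16 in the diff.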

"""number of epochs of training"""
batch_size_vqgan: int = 16
"""size of the batches for Stage 1"""
lr_vqgan: float = 5e-5 # Default: 2e-4
@ffelten (Collaborator):

I'm not sure you want to keep the default comment? There are some others for different hyperparams, too

@arthurdrake1 (Contributor, author):

Yes, thanks; this was for debugging, so I've removed it now

@arthurdrake1 arthurdrake1 merged commit fe036d1 into main Oct 27, 2025
3 checks passed
@arthurdrake1 arthurdrake1 deleted the vqgan branch October 27, 2025 14:59
@markfuge markfuge mentioned this pull request Jan 7, 2026