(Maybe) inconsistency between the VQ-VAE paper and its implementation. #252

@Apollo1840

First of all, this may be my misunderstanding of the paper, so I hope somebody can explain it to me. Thanks!

In the paper, the loss is defined as:
Screenshot from 2022-08-30 11-52-26

where e is the codebook defined at the beginning of the Section:
Screenshot from 2022-08-30 11-57-36

So, in the paper, the codebook loss and commitment loss are MSE between z_e(x) and e.

However, in the implementation, they are computed as the MSE between z_e(x) (inputs) and z_q(x) (quantized), where the variable quantized holds the quantized encoding of the image, namely z_q:
Screenshot from 2022-08-30 11-58-19
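
To make the comparison concrete, here is a minimal NumPy sketch, not the library's actual code. It assumes that e in the loss stands, at each spatial position, for the codebook vector nearest to z_e(x), and that quantized is built by looking up those vectors. The names `nearest_codes`, `z_e`, `codebook`, and the toy sizes are hypothetical, and both losses are normalized as a mean here, which may differ from the paper's squared norm by a constant factor.

```python
import numpy as np

def nearest_codes(z_e, codebook):
    """Return, for each row of z_e, the index of the closest codebook row."""
    # Pairwise squared Euclidean distances, shape (N, K).
    diffs = z_e[:, None, :] - codebook[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    return np.argmin(dists, axis=1)

rng = np.random.default_rng(0)
N, K, D = 6, 4, 3                      # N plays the role of H' * W' (flattened)
z_e = rng.normal(size=(N, D))          # encoder outputs, one row per position
codebook = rng.normal(size=(K, D))     # the K embedding vectors e_1 .. e_K

idx = nearest_codes(z_e, codebook)     # shape (N,), values in [0, K)
z_q = codebook[idx]                    # shape (N, D), same as z_e

# Form used in the implementation: MSE between inputs and quantized.
loss_impl = np.mean((z_e - z_q) ** 2)

# Form suggested by the paper under the assumption above: at each
# position, compare z_e with the single selected codebook vector e.
loss_paper = np.mean(
    [np.mean((z_e[i] - codebook[idx[i]]) ** 2) for i in range(N)]
)
```

Under that assumption the two quantities coincide, and z_q has shape (N, D) rather than (K, D), because each position selects exactly one codebook row.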

Are they actually the same thing? Why?

  • If the paper's statement is right, how do the dimensions match between z_e(x) (H' * W' * D) and e (K * D)?
  • If the implementation is right, how does z_q(x) (quantized) backpropagate, given that its computation contains an argmin?
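
On the second bullet, one possible explanation is sketched below in hand-written NumPy (an illustration under stated assumptions, not the library's code). The argmin output is treated as a constant index, so the codebook receives gradient only through the row lookup, which is an ordinary differentiable gather. The helpers `assign`, `codebook_loss`, and `codebook_grad` are hypothetical, and a finite-difference check stands in for an autodiff framework. For the encoder side, the paper describes copying the decoder's gradient from z_q(x) straight to z_e(x) (a straight-through estimator); that part is not reproduced here.

```python
import numpy as np

def assign(z_e, codebook):
    """Nearest-codebook index per row of z_e (the non-differentiable step)."""
    d = np.sum((z_e[:, None, :] - codebook[None, :, :]) ** 2, axis=-1)
    return np.argmin(d, axis=1)

def codebook_loss(z_e, codebook, idx):
    """Sum of squared distances to the selected codes; idx is held fixed."""
    return np.sum((z_e - codebook[idx]) ** 2)

def codebook_grad(z_e, codebook, idx):
    """Analytic gradient of codebook_loss with respect to the codebook."""
    grad = np.zeros_like(codebook)
    # Each position i contributes 2 * (e_k - z_e[i]) to row k = idx[i];
    # np.add.at accumulates correctly when several positions share a row.
    np.add.at(grad, idx, 2.0 * (codebook[idx] - z_e))
    return grad

rng = np.random.default_rng(0)
z_e = rng.normal(size=(5, 2))
codebook = rng.normal(size=(3, 2))
idx = assign(z_e, codebook)            # computed once, then treated as constant

analytic = codebook_grad(z_e, codebook, idx)

# Forward finite differences with the same fixed idx.
eps = 1e-6
base = codebook_loss(z_e, codebook, idx)
numeric = np.zeros_like(codebook)
for k in range(codebook.shape[0]):
    for d in range(codebook.shape[1]):
        bumped = codebook.copy()
        bumped[k, d] += eps
        numeric[k, d] = (codebook_loss(z_e, bumped, idx) - base) / eps
```

Rows of the codebook that no position selects get a zero gradient in this sketch, since they do not appear in the loss at all.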
