(Maybe) inconsistency between VQ-VAE paper and its implementation. #252

First of all, this may be my own misunderstanding of the paper, so I hope somebody can explain it to me. Thanks!

---

In the paper, the loss is defined as:
![Screenshot from 2022-08-30 11-52-26](https://user-images.githubusercontent.com/28777429/187407855-e36af2f8-58c1-497d-b425-6fb3842ebe91.png)
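
For readers who cannot load the image: this should be Equation 3 of the VQ-VAE paper, which (with `sg` denoting the stop-gradient operator) reads:

$$
L = \log p\big(x \mid z_q(x)\big) + \big\lVert \mathrm{sg}[z_e(x)] - e \big\rVert_2^2 + \beta \, \big\lVert z_e(x) - \mathrm{sg}[e] \big\rVert_2^2
$$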

where `e` is the codebook defined at the beginning of the section:
![Screenshot from 2022-08-30 11-57-36](https://user-images.githubusercontent.com/28777429/187408189-1dd44b68-d601-4b66-97d7-f726c5895cbe.png)

So, in the paper, the codebook loss and the commitment loss are both MSE **between `z_e(x)` and `e`**.

However, in the implementation, they are computed as MSE **between `z_e(x)` (`inputs`) and `z_q(x)` (`quantized`)**, where the variable `quantized` holds the quantized encoding of the image, namely `z_q(x)`:
![Screenshot from 2022-08-30 11-58-19](https://user-images.githubusercontent.com/28777429/187408392-b4a2fbfd-d0d6-42fb-8f21-45beb2de6937.png)

Are they actually the same thing? If so, why?

- If the paper is right, how do the dimensions match between `z_e(x)` (H' * W' * D) and `e` (K * D)?
- If the implementation is right, how does `z_q(x)` (`quantized`) backpropagate, given that its computation involves an argmin?
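
To make the two questions concrete, here is a minimal pure-Python sketch. It is written under my own assumptions (toy sizes, hypothetical helper names such as `nearest_code` and `codebook_grad`) and is not the repository's code. It treats `z_q(x)` as a row lookup into `e` using the argmin indices, so `z_q(x)` has the same shape as `z_e(x)`. With the indices held fixed, the codebook-term gradient reaches only the selected rows of `e`.

```python
# Toy sizes (hypothetical): N = H' * W' = 3 encoder positions, D = 2, K = 4 codes.
z_e = [[0.9, 0.1], [0.0, 1.2], [2.1, 1.9]]             # encoder output, shape (N, D)
e = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]   # codebook, shape (K, D)


def sq_dist(a, b):
    """Squared Euclidean distance between two D-dimensional vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest_code(v, codebook):
    """Argmin over the K codes. The result is an integer index, not a differentiable value."""
    return min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))


def mse(a, b):
    """Mean squared error over all N * D entries of two (N, D) arrays."""
    count = sum(len(row) for row in a)
    return sum(sq_dist(ra, rb) for ra, rb in zip(a, b)) / count


def codebook_grad(z_e, codebook, indices):
    """Gradient of mse(z_e, z_q) w.r.t. the codebook, with the indices held constant."""
    count = sum(len(row) for row in z_e)
    grad = [[0.0] * len(row) for row in codebook]
    for v, k in zip(z_e, indices):
        for d in range(len(v)):
            grad[k][d] += 2.0 * (codebook[k][d] - v[d]) / count
    return grad


indices = [nearest_code(v, e) for v in z_e]   # shape (N,)
z_q = [e[k] for k in indices]                 # lookup: shape (N, D), same as z_e

# MSE against z_q is the per-position MSE against the selected codebook row.
loss_vs_z_q = mse(z_e, z_q)
loss_vs_selected_e = sum(
    sq_dist(v, e[k]) for v, k in zip(z_e, indices)
) / (len(z_e) * len(z_e[0]))

grad = codebook_grad(z_e, e, indices)

print("indices:", indices)
print("loss vs z_q:", loss_vs_z_q, "| loss vs selected e:", loss_vs_selected_e)
print("codebook gradient:", grad)
```

In this sketch, the code that is never selected gets a zero gradient, and the argmin itself is never differentiated: it only decides which rows of `e` take part in the loss.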





