Hello, in the paper you mentioned that for task-specific fine-tuning of ESM-1B, an additional token was added to represent the entire enzyme, and this enzyme token was initially set to the element-wise mean of the amino acid representations. Does this imply that the new weight in nn.Embedding() was initialized using this mean value? However, in the code, I did not observe direct manipulation of the ESM-1B embeddings. Instead, used the token at position [0] representation. Does this mean the CLS token was adopted as the global enzyme representation instead?
Hello, in the paper you mentioned that for task-specific fine-tuning of ESM-1B, an additional token was added to represent the entire enzyme, and this enzyme token was initially set to the element-wise mean of the amino acid representations. Does this imply that the new weight in
nn.Embedding()was initialized using this mean value? However, in the code, I did not observe direct manipulation of the ESM-1B embeddings. Instead, used the token atposition [0]representation. Does this mean theCLStoken was adopted as the global enzyme representation instead?