training question #9

@Cs231ncv

Hello, I have a question about the training setup in this paper. The paper states that training runs for 300 epochs in total, split into an Adam optimization phase and an SGD phase. In actual training, however, it seems difficult to enter the SGD phase because the loss oscillates. Did you encounter this situation in your experiments? I would greatly appreciate an answer.
