Skip to content

Why clamp inputs to all tanh calls? #1

Description

@dhruvdcoder

return tf.tanh(tf.minimum(tf.maximum(x, -MAX_TANH_ARG), MAX_TANH_ARG))

We are trying to reimplement the layers proposed by the Hyperbolic Neural Networks paper. We use float64 instead of float32 for the entire model and inputs. Hence, we avoid numerical instability. However, if we do not clamp the inputs to the tanh functions between (-15, 15), the network does not seem to train at all. It would be great if you could provide a reason for doing this and for picking the value of 15.

PS: I really liked the paper and thank you for making the code available.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Fields

    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions