return tf.tanh(tf.minimum(tf.maximum(x, -MAX_TANH_ARG), MAX_TANH_ARG))
We are trying to reimplement the layers proposed in the Hyperbolic Neural Networks paper. We use float64 instead of float32 for the entire model and its inputs, so we expected to avoid numerical instability. However, unless we clamp the inputs of the tanh functions to the range [-15, 15], the network does not seem to train at all. Could you explain why this clamping is needed and why the value 15 was chosen?
PS: I really liked the paper. Thank you for making the code available.
hyperbolic_nn/util.py
Line 26 in 45be2f6
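For context, here is a minimal sketch of the clamp in question, written with Python's standard `math` module instead of TensorFlow so it runs on its own. The helper names `clamped_tanh` and `tanh_grad` are hypothetical, and `MAX_TANH_ARG = 15.0` is simply the value discussed in this issue. The sketch also shows one property that might be relevant, offered as an assumption and not as the authors' stated reason: even in float64, tanh rounds to exactly 1.0 for large enough arguments, at which point its derivative 1 - tanh(x)^2 is exactly zero.

```python
import math

MAX_TANH_ARG = 15.0  # the bound discussed in this issue


def clamped_tanh(x: float) -> float:
    """tanh with its argument clamped to [-MAX_TANH_ARG, MAX_TANH_ARG]."""
    return math.tanh(min(max(x, -MAX_TANH_ARG), MAX_TANH_ARG))


def tanh_grad(y: float) -> float:
    """Derivative of tanh, expressed through its output y = tanh(x)."""
    return 1.0 - y * y


if __name__ == "__main__":
    for x in (1.0, 15.0, 50.0):
        raw = math.tanh(x)
        clamped = clamped_tanh(x)
        # Compare saturation and gradient with and without the clamp.
        print(f"x={x}: raw saturated={raw == 1.0}, raw grad={tanh_grad(raw)}, "
              f"clamped saturated={clamped == 1.0}, clamped grad={tanh_grad(clamped)}")
```

With the clamp, the output stays strictly inside (-1, 1), so any later operation that is undefined or unbounded at 1 (for example an inverse hyperbolic tangent, if the layer applies one) would stay finite. Whether this is what prevents our network from training without the clamp is a guess on our side.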