Hi! First of all, thank you for this research, it is very fascinating.
I have a question about your implementation of the hyperbolic MLR function.
In your implementation, you define it as:
2./ np.sqrt(c) * |A_mlr| * arcsinh(np.sqrt(c) * pxdota * lambda_px)
First, why is A_mlr l2-normalized when computing pxdota?
Second, given that lambda_px = 2. / (1 - c * |minus_p_plus_x|^2), I find it difficult to see how your implementation of the hyperbolic MLR is equivalent to the definition of
P(y=k | x) in your paper:
lambda_p * |A_mlr| / np.sqrt(c) * arcsinh(2 * np.sqrt(c) * pxdota / ((1 - c * |minus_p_plus_x|^2) * |A_mlr|))
= 2. / (np.sqrt(c) * (1 - c * |p|^2)) * |A_mlr| * arcsinh(np.sqrt(c) * pxdota * lambda_px / |A_mlr|)
It seems to me that the 1/(1 - c * |p|^2) factor in front of the arcsinh term and the 1/|A_mlr| factor inside it are missing, but I can't figure out where they went!
Does it have to do with the fact that the variable A_mlr first needs to be scaled by (lambda_0 / lambda_p) so that it can be optimized as a Euclidean parameter?
I am currently writing a paper that makes extensive use of the definitions in your paper, and I would like to keep the implementation as close as possible to yours.
EDIT: I just realized that the l2_normalization is the implicit 1/|A_mlr|. This just leaves the 1/(1-c*|p|^2) that is missing.
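To make the remaining discrepancy concrete, here is a minimal NumPy sketch of the two expressions as I transcribed them above. The function names, the test vectors, and the mobius_add helper (the standard Möbius addition on the Poincaré ball) are my own assumptions and are not taken from your repository. If my reading is right, the two values should differ by exactly the factor 1/(1 - c * |p|^2).

```python
import numpy as np


def mobius_add(x, y, c):
    """Mobius addition x (+)_c y on the Poincare ball (assumed standard form)."""
    xy = np.dot(x, y)
    x2 = np.dot(x, x)
    y2 = np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c**2 * x2 * y2
    return num / den


def logit_as_implemented(x, p, a_mlr, c):
    """Expression from the implementation, as transcribed in this post."""
    minus_p_plus_x = mobius_add(-p, x, c)
    a_norm = np.linalg.norm(a_mlr)
    # pxdota uses the l2-normalized A_mlr, i.e. the implicit 1/|A_mlr|.
    pxdota = np.dot(minus_p_plus_x, a_mlr / a_norm)
    lambda_px = 2.0 / (1 - c * np.dot(minus_p_plus_x, minus_p_plus_x))
    return 2.0 / np.sqrt(c) * a_norm * np.arcsinh(np.sqrt(c) * pxdota * lambda_px)


def logit_as_in_paper(x, p, a_mlr, c):
    """Expression from the paper, as transcribed in this post."""
    minus_p_plus_x = mobius_add(-p, x, c)
    a_norm = np.linalg.norm(a_mlr)
    lambda_p = 2.0 / (1 - c * np.dot(p, p))
    arg = (2 * np.sqrt(c) * np.dot(minus_p_plus_x, a_mlr)
           / ((1 - c * np.dot(minus_p_plus_x, minus_p_plus_x)) * a_norm))
    return lambda_p * a_norm / np.sqrt(c) * np.arcsinh(arg)


# Hypothetical test values; x and p lie inside the unit ball for c = 1.
c = 1.0
x = np.array([0.1, -0.2, 0.3])
p = np.array([0.2, 0.1, -0.1])
a_mlr = np.array([1.5, -0.5, 2.0])

ratio = logit_as_in_paper(x, p, a_mlr, c) / logit_as_implemented(x, p, a_mlr, c)
expected = 1.0 / (1 - c * np.dot(p, p))
print(ratio, expected)
```

With p = 0 the two expressions coincide, so the difference only shows up for a non-zero offset p.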