Thanks for your remarkable work.
I have a question about Eq. 10 in your arXiv paper.
The equation computes the retained set using pivot tokens. I understand that retained tokens should have low similarity to the pivots, but the union operation at the end confuses me: it means that if a token is dissimilar to any one pivot, it will be retained even if it has high duplication scores with the other pivots. Is that reasonable? It seems to me that the correct way is to use an intersection, so that only tokens with a low duplication score against all pivots are retained.
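To make the concern concrete, here is a minimal sketch of how I read the two options. The scores, the threshold `EPSILON`, and the helper `low_dup_set` are made up for illustration and are not taken from the paper or its code.

```python
# Hypothetical duplication scores: dup[token][pivot], higher = more redundant.
dup = {
    "t0": [0.9, 0.1],  # duplicates pivot 0, differs from pivot 1
    "t1": [0.2, 0.1],  # differs from both pivots
    "t2": [0.8, 0.9],  # duplicates both pivots
}
EPSILON = 0.5  # assumed threshold, not the paper's value


def low_dup_set(pivot_idx):
    """Tokens whose duplication score with one pivot is below the threshold."""
    return {tok for tok, scores in dup.items() if scores[pivot_idx] < EPSILON}


per_pivot = [low_dup_set(j) for j in range(2)]

# Union: a token is kept if it is dissimilar to at least one pivot.
retained_union = set.union(*per_pivot)
# Intersection: a token is kept only if it is dissimilar to every pivot.
retained_intersection = set.intersection(*per_pivot)

print(sorted(retained_union))         # ['t0', 't1']
print(sorted(retained_intersection))  # ['t1']
```

With the union, `t0` is retained although it almost duplicates pivot 0; with the intersection, it is dropped.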
I hope to hear back from you. Thanks.