
fix qwen3_next#58

Merged
Jintao-Huang merged 1 commit into modelscope:main from Jintao-Huang:fix_qwen3_next
May 4, 2026

Conversation

@Jintao-Huang
Collaborator

No description provided.


@gemini-code-assist (Bot) left a comment


Code Review

This pull request updates the forward method in qwen3_next.py to utilize resolve_gdn_attention_mask instead of resolve_hf_attention_mask. A review comment identifies potential logic errors and runtime risks within the resolve_gdn_attention_mask implementation, specifically regarding type safety when applying bitwise operators to integer tensors and potential dimensionality mismatches that could cause runtime errors.

```diff
 else:
     hidden_states = hidden_states.transpose(0, 1)
-    attention_mask = resolve_hf_attention_mask(kwargs)
+    attention_mask = resolve_gdn_attention_mask(kwargs)
```

Severity: high

The function resolve_gdn_attention_mask (called here and defined at line 62) has two potential issues, a type-safety bug and a dimensionality assumption that risks a runtime error:

  1. Type Safety: At line 70, it uses the bitwise NOT operator ~ on attention_mask. If the mask is an integer tensor (common for HuggingFace masks), ~1 results in -2, which will cause the sum(...) > 0 check to behave unexpectedly. It should be explicitly cast to boolean: (~attention_mask.to(torch.bool)).
  2. Dimensionality: The function assumes a 4D input by using dim=(1, 2). If kwargs.get('attention_mask') returns a 2D tensor (standard HuggingFace format), this will raise an IndexError.

Since this function is now being used to resolve the attention mask, these issues should be addressed in its implementation to ensure robustness.
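The two concerns above can be demonstrated in isolation. The helper below is a hypothetical sketch of how `resolve_gdn_attention_mask` could be hardened, not the repository's actual implementation; the function name, argument, and mask shapes are assumptions based on the review comment.

```python
import torch


def safe_gdn_attention_mask(attention_mask):
    """Hypothetical hardened variant of the mask-resolution logic.

    Casts to bool before bitwise NOT and supports both 2D and 4D masks,
    addressing the two issues raised in the review.
    """
    if attention_mask is None:
        return None
    # (1) Type safety: cast to bool first. On an integer tensor, ~1 == -2,
    #     so an all-attended mask would look padded under a sum(...) > 0 check.
    mask = attention_mask.to(torch.bool)
    # (2) Dimensionality: branch on rank instead of hard-coding dim=(1, 2),
    #     so a standard 2D HuggingFace mask does not raise an IndexError.
    if mask.dim() == 4:
        has_padding = (~mask).sum(dim=(1, 2, 3)) > 0
    else:
        has_padding = (~mask).sum(dim=-1) > 0
    # Return None when no sequence has padding (mask can be skipped entirely).
    return mask if has_padding.any() else None


# The integer-mask pitfall in isolation:
int_mask = torch.ones(2, 4, dtype=torch.long)
assert (~int_mask).sum() != 0                   # ~1 == -2: looks padded
assert (~int_mask.to(torch.bool)).sum() == 0    # correct after bool cast
```

With this shape handling, an all-ones mask of either rank resolves to `None`, while a mask containing zeros is returned as a boolean tensor.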

@Jintao-Huang Jintao-Huang merged commit 5d0d323 into modelscope:main May 4, 2026
1 check passed
Jintao-Huang added a commit that referenced this pull request May 5, 2026
