Skip to content

Harden iFlow proxy requests to match CLI behavior#142

Open
redzrush101 wants to merge 6 commits intoMirrowel:devfrom
redzrush101:iflow-cli-spoof-port
Open

Harden iFlow proxy requests to match CLI behavior#142
redzrush101 wants to merge 6 commits intoMirrowel:devfrom
redzrush101:iflow-cli-spoof-port

Conversation

@redzrush101
Copy link

@redzrush101 redzrush101 commented Feb 26, 2026

Summary

  • Port iFlow CLI-style anti-block request behavior into the iFlow provider used by LLM-API-Key-Proxy.
  • Add signed-header handling with a one-shot 406 retry without signature plus upstream API base fallback support.
  • Fix env-based OAuth credential loading/refresh handling and include glm-5 in iFlow hardcoded models.

Validation

  • Ran compile checks for modified provider files.
  • Started proxy with OAuth creds provided via environment variables from oauth_creds.json mapping.
  • Verified streaming calls through proxy endpoint for iflow/glm-5 and iflow/kimi-k2.5.

Important

Enhance iFlow proxy requests to match CLI behavior with improved signed-header handling, OAuth credential management, and API base fallback support.

  • Behavior:
    • Port CLI-style anti-block request behavior into iFlow provider in iflow_provider.py.
    • Add signed-header handling with a one-shot 406 retry without signature and upstream API base fallback.
    • Fix OAuth credential loading/refresh from environment variables in iflow_auth_base.py.
  • Models:
    • Include glm-5 in hardcoded models in iflow_provider.py.
  • Functions:
    • Add get_api_base_candidates() and get_api_base() in iflow_auth_base.py for API base URL management.
    • Modify _refresh_token() in iflow_auth_base.py to handle env-loaded credentials without file IO.
    • Update get_api_details() in iflow_auth_base.py to support env-based credentials.
    • Enhance _build_iflow_headers() in iflow_provider.py to include optional signature headers.
    • Implement _should_fallback_base() in iflow_provider.py for handling specific HTTP status codes.
  • Misc:
    • Verify streaming calls for iflow/glm-5 and iflow/kimi-k2.5 through proxy endpoint.

This description was created by Ellipsis for 97e1449. You can customize this summary. It will automatically update as commits are pushed.

Add signed-header fallback and base-url failover while fixing env OAuth credential loading/refresh so proxy calls are less likely to get blocked.
@Mirrowel
Copy link
Owner

Gotta give it a couple of days to verify

MasuRii added a commit to MasuRii/LLM-API-Key-Proxy that referenced this pull request Feb 27, 2026
Resolve conflicts in iFlow provider by preserving existing retry/context-failure handling while integrating PR Mirrowel#142 signed-header fallback and base URL failover logic.
@redzrush101
Copy link
Author

Gotta give it a couple of days to verify

ye good

Consolidate follow-up fixes after initial hardening: add captured-header parity and metadata propagation, preserve usage details, introduce sticky session/conversation IDs, and fix reasoning regressions by removing forced disabled-thinking defaults while enabling thinking by default.
yassin and others added 4 commits February 27, 2026 16:34
…ication

Include httpx.RemoteProtocolError in the error type tuple for
api_connection classification. This error occurs when a peer closes the
connection without sending a complete message and should be treated as a
transient connection issue rather than an unhandled exception.
…handler

Wrap stream_handler in a retry loop with exponential backoff for
transient connection errors (RemoteProtocolError, ConnectError,
ReadTimeout, NetworkError). Retries up to 3 times with 1s/2s/4s
backoff before re-raising for higher-level error handling.

Changes:
- Add CONNECTION_ERROR_TYPES tuple and CONTEXT_WINDOW_ERROR_PATTERNS
  as module-level constants for reuse and clarity
- Add MAX_CONNECTION_RETRIES and RETRY_BACKOFF_BASE configuration
- Restructure stream_handler with while-loop retry around the stream
  context manager, re-creating the stream on each retry attempt
- Add context window error detection from HTTP error response bodies
  to surface token limit issues explicitly
- Import asyncio for async sleep during backoff
- Detect empty choices array in HTTP 200 responses
- Detect zero completion_tokens with non-zero prompt_tokens
- Detect empty assistant messages (no content/reasoning/tool_calls)
- Detect streams that complete without any data chunks
- Raise non-retryable context_window_exceeded error for these cases
- Prevents quota waste from repeated failed requests
@Mirrowel
Copy link
Owner

Mirrowel commented Mar 3, 2026

@redzrush101 Integrate #130 please, by adding minimax and anything else so this can be merged.
Don't need the garbage models like qwen 32B or tstars

Also, is #139 integrated? And #136?

Repository owner deleted a comment from mirrobot-agent bot Mar 3, 2026
Repository owner deleted a comment from mirrobot-agent bot Mar 3, 2026
@mirrobot-agent
Copy link
Contributor

mirrobot-agent bot commented Mar 3, 2026

Starting my review of the iFlow proxy hardening PR. Diving into the anti-block request behavior, OAuth handling, and sticky session implementation now.

Copy link
Contributor

@mirrobot-agent mirrobot-agent bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall Assessment

This is a solid PR that significantly hardens the iFlow provider with CLI-matching anti-block behavior, improved OAuth handling, and robust fallback mechanisms. The changes are well-structured and address real-world reliability concerns.

Architectural Feedback

The approach of matching CLI behavior is sound. A few areas worth addressing:

  1. Async compatibility: The sticky session cache uses threading.Lock() in async context
  2. Security: Token exposure in query params
  3. Reliability: Retry logic could benefit from jitter

Key Suggestions

  • Line 202 in iflow_provider.py: Replace threading.Lock() with asyncio.Lock() to prevent event loop blocking
  • Line 655 in iflow_auth_base.py: Remove accessToken from query params (headers-only is more secure)
  • Line 1763 in iflow_provider.py: Add jitter to exponential backoff
  • Line 1472 in iflow_provider.py: Add clarifying comment about the fake request/response pattern

Nitpicks and Minor Points

  • The in-place mutation in _inject_iflow_metadata_from_incoming_headers could use documentation
  • The __del__ cleanup pattern in transaction logger is unreliable; consider explicit cleanup or context managers

Questions for the Author

  1. Was the dual token placement (headers + query params) intentional for compatibility reasons, or an oversight?
  2. Any specific reason for choosing threading.Lock() over asyncio.Lock() for the sticky cache?

This review was generated by an AI assistant.

)


def _inject_iflow_metadata_from_incoming_headers(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function _inject_iflow_metadata_from_incoming_headers modifies request_data in-place. Consider documenting this side effect in the docstring to make the mutation explicit.

)
),
)
self._sticky_lock = threading.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sticky session cache uses threading.Lock() in an async context (called from make_request which is async). This could block the event loop during lock contention. Consider using asyncio.Lock() instead for proper async compatibility.

return {k: v for k, v in message.__dict__.items() if not k.startswith("_")}
return {}

def _raise_silent_context_failure(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _raise_silent_context_failure method creates a fake httpx request/response to raise HTTPStatusError. This is clever but could be confusing. Consider adding a comment explaining why this pattern is used instead of a custom exception.

cause=e,
)

backoff = RETRY_BACKOFF_BASE * (2 ** (connection_retry_count - 1))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The connection retry logic uses exponential backoff but doesn't implement jitter. Consider adding random jitter (e.g., backoff * (0.5 + random.random())) to prevent thundering herd when multiple requests retry simultaneously.

response = await client.get(
IFLOW_USER_INFO_ENDPOINT,
headers=headers,
params={"accessToken": access_token},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _fetch_user_info method sends accessToken in both headers and query params. This could be a security concern if the token is logged in server access logs. Consider using only headers (preferred) or document why both are needed.

for filename in list(self._append_handles.keys()):
self._close_append_handle(filename)

def __del__(self) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The __del__ method calls close_append_files(), but relying on __del__ for cleanup is unreliable in Python (may not be called promptly or at all). Consider using context managers or explicit cleanup calls in addition to __del__.

Repository owner deleted a comment from mirrobot-agent bot Mar 3, 2026
@Mirrowel
Copy link
Owner

Mirrowel commented Mar 3, 2026

@mirrobot-agent also answer this comment: #142 (comment). From before, about other PRs.

What is here vs those other PRs, what is not, etc. Do not do any changes - just review and answer.

And do not just skim descriptions - read the diffs and verify as well vs latest dev branch.

@mirrobot-agent
Copy link
Contributor

mirrobot-agent bot commented Mar 3, 2026

@Mirrowel, I've analyzed PR #142 against the latest dev branch and reviewed the three PRs you mentioned. Here's the breakdown:

PR Integration Status

PR #136 (fix/iflow-cookie-auth-new-accounts) - INTEGRATED

PR #130 (feat/iflow-add-new-models) - NOT INTEGRATED

PR #139 (fix/iflow-connection-error-handling) - PARTIALLY INTEGRATED

Summary

PR #142 currently contains:

Missing from PR #142:

Recommendation: PR #142 needs minimax-m2.5 added to be ready for merge per your requirements.

This analysis was generated by an AI assistant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants