Conversation
- Added `mcp-servers/litert-mcp/` directory. - Added `server.py` implementing an MCP server wrapping the `lit` CLI for LiteRT-LM. - Added `README.md` with installation and usage instructions. - Implemented robust cross-platform async handling for stdio. - Restricted implementation to verified text-only inference via CLI, with error handling for multimodal inputs. Co-authored-by: groupthinking <154503486+groupthinking@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Summary of ChangesHello @groupthinking, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request adds a new MCP server that integrates Google's LiteRT-LM, a runtime for Large Language Models on edge devices. The server currently supports text-based inference via the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
The pull request introduces a new MCP server for Google's LiteRT-LM, enabling text generation inference via a CLI wrapper. The implementation includes necessary error handling for unsupported multimodal inputs and cross-platform fixes for Windows asyncio loop policy. The README.md provides good instructions for setup and usage. However, there are a few areas for improvement regarding consistency between documentation and implementation, error handling specifics, and robustness of the stdio communication on Windows.
| * `image_path` (string, optional): Path to an image file for multimodal inference. | ||
| * `audio_path` (string, optional): Path to an audio file for multimodal inference. |
There was a problem hiding this comment.
The README.md lists image_path and audio_path as optional arguments. However, the server.py explicitly returns an error if these are provided, stating that multimodal input is not yet supported via the CLI wrapper. To avoid confusion, please update the README to clearly state this limitation or remove these arguments from the documentation until they are supported.
mcp-servers/litert-mcp/server.py
Outdated
| else: | ||
| # For unknown methods, we might want to return an error or ignore if it's a notification | ||
| if request_id is not None: | ||
| raise Exception(f"Unknown method: {method}") |
There was a problem hiding this comment.
mcp-servers/litert-mcp/server.py
Outdated
| # Check if binary exists (simple check) | ||
| try: | ||
| # We assume the binary handles --help or similar to check existence, | ||
| # but simpler to just try running it or check existence if it's a path. | ||
| # If it's just 'lit' in PATH, shutil.which would be needed, but let's just try-catch execution. | ||
| pass | ||
| except Exception: | ||
| pass |
There was a problem hiding this comment.
This try-except block is empty and serves no functional purpose, making the comment "Check if binary exists (simple check)" misleading. The FileNotFoundError is caught later, but this block adds unnecessary code. Please remove this block or implement a proper check for the binary's existence using shutil.which or os.path.exists if an early check is desired.
mcp-servers/litert-mcp/server.py
Outdated
| else: | ||
| # Windows fallback (simplified, might not work perfectly with async stdio without extra loop config) | ||
| # But matches common patterns. | ||
| pass |
There was a problem hiding this comment.
The comment "Windows fallback (simplified, might not work perfectly with async stdio without extra loop config)" indicates a potential reliability issue for Windows users. While asyncio.set_event_loop_policy is set, relying on print as a fallback might not guarantee consistent JSON-RPC communication. It would be beneficial to ensure a robust stdio communication mechanism for all supported platforms.
| try: | ||
| await writer.drain() | ||
| except (AttributeError, BrokenPipeError): | ||
| pass | ||
| else: |
There was a problem hiding this comment.
Silently passing AttributeError and BrokenPipeError can hide underlying issues. While BrokenPipeError might be expected on client disconnect, AttributeError could indicate a programming error. It's generally good practice to at least log these exceptions, even if they are handled, to aid in debugging and understanding unexpected behavior.
There was a problem hiding this comment.
Pull request overview
This pull request adds a new MCP server for Google's LiteRT-LM (Edge LLM Runtime), enabling local inference capabilities on edge devices. The server wraps the lit CLI to provide a run_inference tool and includes cross-platform compatibility fixes for Windows asyncio.
Changes:
- New LiteRT-LM MCP server implementation with JSON-RPC 2.0 protocol compliance
- Windows-specific asyncio event loop policy configuration for compatibility
- Explicit handling of unsupported multimodal inputs with informative error messages
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 13 comments.
| File | Description |
|---|---|
| mcp-servers/litert-mcp/server.py | Complete MCP server implementation with tool registration, subprocess-based CLI wrapping, and Windows compatibility handling |
| mcp-servers/litert-mcp/README.md | Documentation covering prerequisites, configuration, usage examples, and development setup |
mcp-servers/litert-mcp/server.py
Outdated
| # Check if binary exists (simple check) | ||
| try: | ||
| # We assume the binary handles --help or similar to check existence, | ||
| # but simpler to just try running it or check existence if it's a path. | ||
| # If it's just 'lit' in PATH, shutil.which would be needed, but let's just try-catch execution. | ||
| pass | ||
| except Exception: | ||
| pass |
There was a problem hiding this comment.
This try-except block contains only a pass statement and serves no purpose. The comment suggests checking if the binary exists, but no actual validation is performed. Consider removing this dead code or implementing actual binary validation using shutil.which.
mcp-servers/litert-mcp/server.py
Outdated
| return { | ||
| "status": "error", | ||
| "message": "Multimodal input (image/audio) is not yet supported via the 'lit' CLI wrapper. Please use the LiteRT-LM C++ or Python API directly, or update this server implementation once CLI flags are verified." | ||
| } |
There was a problem hiding this comment.
Inconsistent indentation: this return statement and the error message use extra leading space. Python code should use consistent 4-space indentation. The same issue appears on lines 217-218.
| } | ||
|
|
||
| async def _run_inference(self, args: Dict[str, Any]) -> Dict[str, Any]: | ||
| prompt = args.get("prompt") |
There was a problem hiding this comment.
Missing input validation for the required prompt parameter. The prompt is retrieved but never validated to ensure it's not None or empty before being passed to the command line. This could lead to unclear error messages from the lit binary.
| ```json | ||
| { | ||
| "name": "run_inference", | ||
| "arguments": { | ||
| "prompt": "Describe this image.", | ||
| "image_path": "/path/to/image.jpg", | ||
| "backend": "gpu" | ||
| } | ||
| } | ||
| ``` |
There was a problem hiding this comment.
The example demonstrates multimodal inference with an image_path, but the documentation and implementation both state that multimodal inputs are not currently supported. This creates confusion. Either remove the image_path from the example or provide a text-only example that matches the current capabilities.
mcp-servers/litert-mcp/server.py
Outdated
| import logging | ||
| import sys | ||
| import os | ||
| import subprocess |
There was a problem hiding this comment.
Unused import: The subprocess module is imported but never used directly in the code. The subprocess functionality is accessed through asyncio.subprocess instead. Consider removing this unused import.
| import subprocess |
mcp-servers/litert-mcp/server.py
Outdated
| cmd = [self.lit_binary] | ||
| cmd.extend(["--backend", backend]) | ||
| cmd.extend(["--model_path", model_path]) | ||
|
|
||
| # Multimodal Handling | ||
| # The current 'lit' CLI wrapper does not support verified multimodal input flags. | ||
| # We restrict to text-only to avoid speculative errors. | ||
| if image_path or audio_path: | ||
| return { | ||
| "status": "error", | ||
| "message": "Multimodal input (image/audio) is not yet supported via the 'lit' CLI wrapper. Please use the LiteRT-LM C++ or Python API directly, or update this server implementation once CLI flags are verified." | ||
| } | ||
|
|
||
| cmd.extend(["--input_prompt", prompt]) |
There was a problem hiding this comment.
Potential command injection vulnerability: user-provided inputs (prompt, model_path, backend) are passed directly to subprocess without validation or sanitization. While asyncio.create_subprocess_exec with separate arguments is safer than shell=True, the prompt could still contain special characters that might cause issues. Consider validating the backend parameter against the allowed enum values and sanitizing the prompt string.
mcp-servers/litert-mcp/server.py
Outdated
| else: | ||
| # Windows fallback (simplified, might not work perfectly with async stdio without extra loop config) | ||
| # But matches common patterns. | ||
| pass |
There was a problem hiding this comment.
The Windows fallback has an empty pass statement and leaves writer as None. While this is handled later in the code with a print() fallback, it's inconsistent with other MCP servers in the codebase. The cloudflare_server.py (line 195), code_analysis_server.py (line 215), and video_agent_server.py (line 232) all have explicit print() fallback logic without the extra try-except. Consider following the same pattern for consistency.
| model_path = args.get("model_path") or self.default_model_path | ||
| image_path = args.get("image_path") | ||
| audio_path = args.get("audio_path") | ||
| backend = args.get("backend", "cpu") |
There was a problem hiding this comment.
The backend parameter validation only happens at the JSON schema level but not in the code. Since the inputSchema defines an enum with valid values ["cpu", "gpu", "npu"], you should validate that the backend parameter actually matches one of these values before passing it to the command line. This prevents potential issues if invalid backend values are passed.
| backend = args.get("backend", "cpu") | |
| backend = args.get("backend", "cpu") | |
| valid_backends = {"cpu", "gpu", "npu"} | |
| if backend not in valid_backends: | |
| return { | |
| "status": "error", | |
| "message": f"Invalid backend '{backend}'. Must be one of {sorted(valid_backends)}." | |
| } |
mcp-servers/litert-mcp/server.py
Outdated
| except (AttributeError, BrokenPipeError): | ||
| pass |
There was a problem hiding this comment.
'except' clause does nothing but pass and there is no explanatory comment.
| except (AttributeError, BrokenPipeError): | |
| pass | |
| except (AttributeError, BrokenPipeError) as drain_error: | |
| # Non-fatal issues when flushing output (e.g., client closed pipe or writer lacks drain). | |
| # We log at debug level and continue to preserve existing behavior. | |
| LOGGER.debug("Non-fatal error while draining writer: %s", drain_error) |
mcp-servers/litert-mcp/server.py
Outdated
| # Check if binary exists (simple check) | ||
| try: | ||
| # We assume the binary handles --help or similar to check existence, | ||
| # but simpler to just try running it or check existence if it's a path. | ||
| # If it's just 'lit' in PATH, shutil.which would be needed, but let's just try-catch execution. | ||
| pass | ||
| except Exception: | ||
| pass | ||
|
|
There was a problem hiding this comment.
This statement is unreachable.
| # Check if binary exists (simple check) | |
| try: | |
| # We assume the binary handles --help or similar to check existence, | |
| # but simpler to just try running it or check existence if it's a path. | |
| # If it's just 'lit' in PATH, shutil.which would be needed, but let's just try-catch execution. | |
| pass | |
| except Exception: | |
| pass |
- Added `AGENTS.md` to define mission, scope, and protocols for autonomous agents. - Added `.github/workflows/verify-litert-mcp.yml` to automatically verify the LiteRT-LM MCP server on PRs. - This establishes "actionable room" for agents to contribute safely and effectively. Co-authored-by: groupthinking <154503486+groupthinking@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
- Added `mcp-servers/litert-mcp/` containing an MCP server implementation for LiteRT-LM inference. - Added `AGENTS.md` to define mission, scope, and permissions for autonomous agents. - Added `.github/workflows/verify-litert-mcp.yml` to ensure server integrity on PRs. - This commit establishes the framework for agents to safely automate workflows and leverage the LiteRT toolset. Co-authored-by: groupthinking <154503486+groupthinking@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
|
@copilot open a new pull request to apply changes based on the comments in this thread |
|
@groupthinking I've opened a new pull request, #6, to work on those changes. Once the pull request is ready, I'll request review from you. |
|
|
||
| # Multimodal Handling | ||
| # The current 'lit' CLI wrapper does not support verified multimodal input flags. | ||
| # We restrict to text-only to avoid speculative errors. | ||
| if image_path or audio_path: | ||
| return { | ||
| "status": "error", | ||
| "message": "Multimodal input (image/audio) is not yet supported via the 'lit' CLI wrapper. Please use the LiteRT-LM C++ or Python API directly, or update this server implementation once CLI flags are verified." | ||
| } | ||
|
|
||
| cmd.extend(["--input_prompt", prompt]) |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
- Removed empty try-except block and unused `subprocess` import. - Added input validation for `prompt` (required) and `backend` (enum). - Implemented robust binary existence check using `shutil.which`. - Fixed indentation in multimodal error block. - Updated README example to be text-only to align with current capabilities. - Added explicit logging for Windows fallback and writer drain errors. Co-authored-by: groupthinking <154503486+groupthinking@users.noreply.github.com>
| except (AttributeError, BrokenPipeError) as e: | ||
| LOGGER.warning(f"Error draining writer: {e}. Switching to print fallback.") | ||
| writer = None | ||
| print(response_str, flush=True) |
There was a problem hiding this comment.
Bug: The print() fallback for sending responses adds an extra newline because response_str already contains one, resulting in a double newline that violates the MCP specification.
Severity: MEDIUM
Suggested Fix
Modify the print calls on lines 282 and 284 to print(response_str, end='', flush=True). This will prevent print() from appending its default newline, ensuring that only the single newline already present in response_str is sent.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: mcp-servers/litert-mcp/server.py#L282
Potential issue: The code constructs `response_str` with a trailing newline. When this
string is passed to `print()` on lines 282 and 284, `print()` adds its own default
newline, resulting in a double newline (`\n\n`) being sent. This occurs on Windows
(where the `print()` path is always used) and when the `writer.drain()` operation fails.
This behavior violates the Model Context Protocol specification, which mandates exactly
one newline per message. While some clients might strip whitespace and tolerate this, it
is a protocol violation that can cause issues with stricter parsers.
Did we get this right? 👍 / 👎 to inform future reviews.
I have addressed the review comments in a new branch and submitted a new Pull Request with the fixes. |
Added a new MCP server for Google's LiteRT-LM (Edge LLM Runtime). The server wraps the
litCLI to provide arun_inferencetool. It supports text generation and includes error handling for currently unsupported multimodal inputs via the CLI wrapper. Includes cross-platform fixes for Windowsasyncioloop policy.PR created automatically by Jules for task 5067712973603937642 started by @groupthinking