A Windows-focused device automation assistant powered by Google FunctionGemma. It accepts natural-language commands and turns them into system actions such as opening apps, changing settings, checking system status, managing folders, and launching maintenance tasks.
The project includes:
- a modern Tkinter desktop UI in app.py
- a command-processing backend and CLI runner in main.py
- a lightweight model smoke test in test_functiongemma.py
- an experimental notebook in notebook/functiongemma_270m_it.ipynb
The assistant is built around a two-stage command system:
- A fast rule-based parser handles common tasks like toggles, app launching, folder access, system information, and power actions.
- If no rule matches the request, the FunctionGemma model acts as a fallback and selects the most suitable function.
That design keeps common actions quick while still allowing natural-language requests when the input is less structured.
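The two-stage idea can be sketched roughly as follows. This is a minimal illustration, not the parser in main.py: the `RULES` table, the action names, and the `model_fallback` callable are all hypothetical stand-ins.

```python
from typing import Callable, Optional

# Hypothetical keyword table: every keyword in a tuple must appear in the request.
RULES: list[tuple[tuple[str, ...], str]] = [
    (("open", "calculator"), "open_calculator"),
    (("bluetooth", "on"), "bluetooth_on"),
    (("cpu",), "show_cpu_usage"),
]


def match_rule(text: str) -> Optional[str]:
    """Return the first action whose keywords all appear as words in the text."""
    words = set(text.lower().split())
    for keywords, action in RULES:
        if all(keyword in words for keyword in keywords):
            return action
    return None


def choose_action(text: str, model_fallback: Callable[[str], str]) -> tuple[str, str]:
    """Try the rule table first; call the model stand-in only when no rule matches."""
    action = match_rule(text)
    if action is not None:
        return action, "rule"
    return model_fallback(text), "model"
```

With a stub such as `lambda text: "clear_temp_files"` in place of the model, `choose_action("open calculator", stub)` is resolved by the rule table, while a looser request like `I want to clean up my laptop` falls through to the stub.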
The assistant can recognize requests such as:
- open an app like Notepad, Calculator, Chrome, Edge, VS Code, or File Explorer
- open Windows settings pages like Bluetooth, Wi-Fi, Display, Sound, Privacy, Windows Update, and more
- control basic system actions such as lock, sleep, restart, shutdown, and screenshot capture
- change volume, brightness, dark mode, night light, and taskbar behavior
- open common folders like Downloads, Documents, Desktop, and Temp
- show system details such as CPU usage, RAM usage, disk usage, uptime, Windows version, installed apps, and startup apps
- run maintenance actions such as Defender scans, update checks, temp cleanup, and recycle bin emptying
- open websites or perform web searches from natural language prompts
The project has three layers that work together:
app.py starts a Tkinter desktop window with:
- a chat-style conversation area
- a quick-actions sidebar for common commands
- a text input field for natural-language requests
- a background model-loading thread so the interface appears immediately
When you click a quick action, it sends the same kind of command text that you would type manually.
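One common way to load a model in the background without blocking a Tkinter window is a worker thread that reports through a queue, polled from the main loop with `after()`. The sketch below assumes that pattern; the helper names `start_background_load` and `poll_results` are hypothetical, and app.py may be organized differently.

```python
import queue
import threading
from typing import Any, Callable


def start_background_load(loader: Callable[[], Any]) -> queue.Queue:
    """Run `loader` in a daemon thread and report the outcome through a queue."""
    results: queue.Queue = queue.Queue()

    def worker() -> None:
        try:
            results.put(("ready", loader()))
        except Exception as exc:
            # Pass load failures to the UI instead of losing them in the thread.
            results.put(("error", exc))

    threading.Thread(target=worker, daemon=True).start()
    return results


def poll_results(root: Any, results: queue.Queue,
                 on_done: Callable[[str, Any], None], interval_ms: int = 100) -> None:
    """Check the queue from the Tk main loop; reschedule until a result arrives."""
    try:
        status, payload = results.get_nowait()
    except queue.Empty:
        # `root` is expected to be a Tk widget; after() re-runs this function later.
        root.after(interval_ms, poll_results, root, results, on_done, interval_ms)
        return
    on_done(status, payload)
```

Widgets are only touched from `on_done`, which runs on the main thread, so the worker never calls into Tkinter directly.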
main.py contains the automation engine. The main flow is:
- `process_command()` clears the log buffer and receives the user request
- `smart_execute()` tries keyword and pattern matching first
- if nothing matches, `ai_fallback()` asks FunctionGemma to choose a function
- `execute_ai_function()` dispatches the selected action
- `log()` stores messages for both the terminal and the UI
This means the assistant prefers direct, deterministic handling first and only uses the model when it needs help interpreting the request.
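The flow above can be sketched as follows. The function names come from this README, but every body is a stub written for illustration, and returning the collected messages from `process_command()` is an assumption about how the UI receives them.

```python
from typing import Optional

_log_buffer: list[str] = []


def log(message: str) -> None:
    """Print to the terminal and keep a copy for the UI."""
    print(message)
    _log_buffer.append(message)


def get_and_clear_log() -> list[str]:
    """Hand the buffered messages to the caller and reset the buffer."""
    messages = list(_log_buffer)
    _log_buffer.clear()
    return messages


def smart_execute(text: str) -> bool:
    """Stub: handle one hard-coded phrase; return False when nothing matches."""
    if "calculator" in text.lower():
        log("Opening Calculator (stub)")
        return True
    return False


def ai_fallback(text: str) -> Optional[str]:
    """Stub standing in for the model call; returns a function name or None."""
    return "show_cpu_usage" if "cpu" in text.lower() else None


def execute_ai_function(name: str) -> None:
    """Stub dispatcher: only records which function would run."""
    log(f"Running {name} (stub)")


def process_command(text: str) -> list[str]:
    """Deterministic path first, model path second, then return the messages."""
    _log_buffer.clear()
    if not smart_execute(text):
        name = ai_fallback(text)
        if name is None:
            log("Sorry, I could not map that request to an action.")
        else:
            execute_ai_function(name)
    return get_and_clear_log()
```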
The backend calls Windows features through PowerShell, system settings URIs, shell commands, and native helpers. Examples include:
- `ms-settings:` links for Settings pages
- `shutdown`, `taskkill`, and `rundll32` for power and process control
- PowerShell commands for battery, network, security, and system info
- browser launch commands for search and website actions
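As an illustration of how such calls might be assembled from Python, the sketch below builds the command lines for a Settings page and for a PowerShell script. The helper names and the `SETTINGS_URIS` table are hypothetical; the URIs listed are standard `ms-settings:` addresses, but the project's own mapping may differ.

```python
import subprocess
import sys

# A few standard ms-settings URIs; the project's own table may differ.
SETTINGS_URIS = {
    "bluetooth": "ms-settings:bluetooth",
    "wifi": "ms-settings:network-wifi",
    "display": "ms-settings:display",
    "sound": "ms-settings:sound",
    "windows update": "ms-settings:windowsupdate",
}


def settings_command(page: str) -> list[str]:
    """Build the command that opens a Settings page through the shell's `start`."""
    uri = SETTINGS_URIS[page.lower()]
    # The empty string fills the window-title slot that `start` reads first.
    return ["cmd", "/c", "start", "", uri]


def powershell_command(script: str) -> list[str]:
    """Build a PowerShell invocation that runs one script string."""
    return ["powershell", "-NoProfile", "-Command", script]


def run(command: list[str], dry_run: bool = False) -> str:
    """Run the command on Windows; elsewhere, or in dry-run mode, only describe it."""
    if dry_run or sys.platform != "win32":
        return "would run: " + " ".join(command)
    completed = subprocess.run(command, capture_output=True, text=True, check=False)
    return completed.stdout.strip()
```

For example, `run(settings_command("bluetooth"), dry_run=True)` returns a description of the command without touching the system, which is convenient when checking the mapping on a non-Windows machine.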
The typical runtime workflow looks like this:
- You launch the desktop app with `python app.py` or start the CLI with `python main.py`.
- The app loads the FunctionGemma model from Hugging Face.
- You type a request such as `open calculator` or `show cpu usage`.
- `smart_execute()` checks for a direct match and runs the action immediately if possible.
- If the request is ambiguous, `ai_fallback()` asks the model to choose one of the available functions.
- The chosen helper function performs the Windows action and writes status messages to the log.
- The UI renders those messages in the chat window, or the CLI prints them in the terminal.
The result is a hybrid workflow: fast for predictable tasks, flexible for natural-language instructions.
This project is designed for Windows.
You will need:
- Python 3.10 or newer
- Windows 10 or Windows 11
- Internet access the first time you run the app so Hugging Face can download the model
- PowerShell available on the system
- Tkinter, which is usually bundled with standard Windows Python installers
Recommended environment:
- a GPU or a capable CPU if you want faster model startup and inference
- enough disk space for the downloaded model cache
- admin permissions for actions that touch security, power, or system settings
Python packages used by the project:
- transformers
- torch
- accelerate
Depending on your local Python installation, you may also want jupyter for the notebook workflow.
- Clone the repository.
- Create and activate a virtual environment.
- Install the Python dependencies:

    pip install --upgrade pip
    pip install transformers torch accelerate

If you plan to use the notebook, install Jupyter as well:

    pip install jupyter

Launch the Tkinter interface:

    python app.py

The UI provides a chat box, a quick-actions sidebar, and a status indicator while the model loads.
The sidebar is organized by workflow areas such as connectivity, audio, display, system tools, security, folders, and power. Each button sends a predefined command into the same backend pipeline used by free-form chat input.
Run the backend directly from the terminal:

    python main.py

Then type natural-language commands such as:
- `open calculator`
- `turn on bluetooth`
- `open windows update`
- `what is my ip address`
- `show cpu usage`
- `clear temp files`
- `restart my laptop`
Type exit, quit, or bye to close the CLI session.
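A CLI loop with these exit words could look roughly like the sketch below. `run_cli` and its parameters are hypothetical; `handle` stands in for whatever function turns a request into a list of status messages.

```python
from typing import Callable, Iterable

EXIT_WORDS = {"exit", "quit", "bye"}


def run_cli(lines: Iterable[str], handle: Callable[[str], list[str]],
            write: Callable[[str], None] = print) -> int:
    """Feed each input line to `handle` until an exit word; return commands handled."""
    handled = 0
    for raw in lines:
        text = raw.strip()
        if not text:
            continue  # ignore blank lines
        if text.lower() in EXIT_WORDS:
            write("Goodbye.")
            break
        for message in handle(text):
            write(message)
        handled += 1
    return handled
```

Passing `sys.stdin` as `lines` gives an interactive session, while passing a plain list of strings makes the loop easy to exercise in a script.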
You can verify that the FunctionGemma model loads with:

    python test_functiongemma.py

This script is useful when you only want to confirm that the model can be fetched and loaded without opening the full app.
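A load check of this kind might look like the sketch below. It is not the contents of test_functiongemma.py: the model id is an assumption inferred from the notebook filename, and `load_model` and `smoke_test` are hypothetical helpers built on the standard Transformers `from_pretrained` calls.

```python
# Assumed Hugging Face model id, inferred from the notebook name; adjust if it differs.
MODEL_ID = "google/functiongemma-270m-it"


def load_model(model_id: str = MODEL_ID):
    """Download (on first run) and load the tokenizer and model."""
    # Imported lazily so that importing this module stays cheap.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return tokenizer, model


def smoke_test(loader=load_model) -> bool:
    """Return True if the loader succeeds, False (with a message) if it raises."""
    try:
        tokenizer, model = loader()
    except Exception as exc:
        print(f"Model failed to load: {exc}")
        return False
    print(f"Loaded {type(model).__name__} with {type(tokenizer).__name__}")
    return True
```

Accepting the loader as a parameter lets the check itself be exercised with a stub, without a network connection or the model weights.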
The assistant responds well to short, direct phrases:
- `turn on bluetooth`
- `turn off wifi`
- `open display settings`
- `open calculator`
- `show battery level`
- `show installed apps`
- `clear temp files`
- `run quick virus scan`
- `open downloads`
- `what is my ip address`
- `restart my laptop`
You can also use less formal prompts such as `I want to clean up my laptop` or `check my memory and cpu usage`, and the fallback model will try to infer the right action.
flowchart TD
U[User input] --> V{UI or CLI}
V --> W["process_command()"]
W --> X["smart_execute()"]
X -->|direct match| Y[Windows action]
X -->|no match| Z["ai_fallback()"]
Z --> A[FunctionGemma model]
A --> B["execute_ai_function()"]
B --> Y
Y --> C["log() / get_and_clear_log()"]
C --> D[Chat UI or terminal output]
- The project is intentionally Windows-specific, because many actions depend on Windows-only commands and settings URIs.
- Some commands open a settings page instead of flipping a system toggle automatically. That keeps the behavior predictable when Windows blocks direct automation.
- The GUI loads the model in a background thread so the interface stays responsive while startup work happens.
- `test_functiongemma.py` is a quick sanity check, while the notebook is better suited for experimentation or model exploration.
- app.py - Tkinter desktop UI for the assistant
- main.py - command parser, Windows automation helpers, and CLI entrypoint
- test_functiongemma.py - minimal model loading check
- notebook/functiongemma_270m_it.ipynb - notebook for experimenting with the model
- LICENSE - MIT license
- The project is Windows-specific. Most actions rely on Windows settings URIs, PowerShell, `taskkill`, `shutdown`, and other Windows tools.
- Some toggles open the relevant Windows settings page instead of fully automating the switch.
- Several actions may require administrator privileges or permission prompts.
- The first run may take a while because the model must be downloaded and initialized.
- The assistant is intended for local device automation only; it does not replace system security boundaries.
This project is released under the MIT License. See LICENSE for details.