Merged
20 commits
a648f11
feat(mcp): setup mcp app for vision agent
adi-wan-askui May 15, 2025
67c4f47
refactor(chat): extract threads and messages api
adi-wan-askui May 16, 2025
2907198
feat(chat): add thread deletion
adi-wan-askui May 16, 2025
15c948c
feat(tools)!: add `PynputAgentOs`
adi-wan-askui May 19, 2025
185f920
feat(tools,chat): add click event listening/recording
adi-wan-askui May 20, 2025
b100ebe
chore: add cursor rules
adi-wan-askui Jun 3, 2025
1fc7e83
feat(chat): wrap api into fastapi app
adi-wan-askui Jun 3, 2025
73ee5c2
refactor(models): extract methods etc. in computer agents
adi-wan-askui Jun 3, 2025
d180edc
feat(agent)!: allow passing messages and callbacks to `VisionAgent.ac…
adi-wan-askui Jun 3, 2025
7326a98
style(chat): fix linting issues
adi-wan-askui Jun 3, 2025
f589135
feat: add runs endpoints to chat api
adi-wan-askui Jun 3, 2025
c842dbf
fix: call on_message on tool result message
adi-wan-askui Jun 4, 2025
d6fcf84
feat!: make agents api work
adi-wan-askui Jun 5, 2025
00cbbd3
feat(chat): add streaming support to creating runs
adi-wan-askui Jun 5, 2025
752e612
chore: structure deps better, e.g., making pynput optional
adi-wan-askui Jun 5, 2025
7ed5878
feat!(chat): remove unused image API
adi-wan-askui Jun 5, 2025
4b4e017
fix(chat): fix linting, status codes, return types etc.
adi-wan-askui Jun 5, 2025
09993c0
fix(models): add logging for tool result messages removed by accident
adi-wan-askui Jun 5, 2025
87893c2
docs(models,tools): improve docstrings for new features
adi-wan-askui Jun 5, 2025
c6fbeee
feat(agent): fix logging and reporting of act()
adi-wan-askui Jun 5, 2025
74 changes: 74 additions & 0 deletions .cursorrules
@@ -0,0 +1,74 @@
# Code Style
- Use `_` prefix for all private variables, constants, functions, methods, properties, etc. that don't need to be accessible from outside the module
- Mark everything as private that does not absolutely need to be accessible from outside the module
- Use `@override` (imported from `typing_extensions`) decorator for all methods that override a parent class method
- Use type hints for all variables, functions, methods, properties, return values, parameters etc.
- Avoid `Any` in type hints unless absolutely necessary
- Use built-in types (e.g., `list`, `dict`, `tuple`, `set`, `str | None`) instead of types from `typing` module (e.g. `List`, `Dict`, `Tuple`, `Set`, `Optional`, `Union`) wherever possible
- Instead of `Optional` use `| None`
- Create an `__init__.py` file in each folder
- Never pass string literals or f-strings directly to exceptions; assign the message to a variable (e.g., `error_msg`) first and pass that variable to the exception, e.g., `raise FileNotFoundError(error_msg)` instead of `raise FileNotFoundError(f"Thread {thread_id} not found")`

## FastAPI
- Instead of defining `response_model` in the route decorator, use the model as the return type in the function signature
- Do not assign `None` as a default value to dependency parameters; instead, place them before the parameters that have default values

# Testing
- Use `pytest-mock` for mocking in tests wherever you need to mock something and pytest-mock can do the job.

# Documentation

## Docstrings
- All public functions, constants, classes, types etc. should have docstrings
- Document the constructor (`__init__`) args as part of the class docstring
- Omit the `__init__` docstring
- All function parameters should be documented with their type (followed by `, optional` if there is a default value) in parentheses, followed by a description
- In descriptions, use backticks for all code references (variables, types, etc.), e.g., `str`
- When referencing a function, use the function name in backticks plus parentheses, e.g., `click()`
- When referencing a class, use the class name in backticks, e.g., `VisionAgent`
- When referencing a method, use the class name plus the method name with parentheses, all in backticks, e.g., `VisionAgent.click()`
- When referencing a class attribute, use the class name plus the attribute name, all in backticks, e.g., `VisionAgent.display`
- Use `Example` section for code examples
- Use `Returns` section for return values
- Use `Raises` section for exceptions listing all possible exceptions that can be raised by the function
- Use `Notes` section for additional notes
- Use `See Also` section for related functions
- Use `References` section for references
- Example of a good docstring:
```python
def locate(
    self,
    locator: str | Locator,
    screenshot: Img | None = None,
    model: ModelComposition | str | None = None,
) -> Point:
    """
    Find the position of the UI element identified by the `locator` using the `model`.

    Args:
        locator (str | Locator): The identifier or description of the element to locate.
        screenshot (Img | None, optional): The screenshot to use for locating the
            element. Can be a path to an image file, a PIL Image object or a data URL.
            If `None`, takes a screenshot of the currently selected screen.
        model (ModelComposition | str | None, optional): The composition or name of
            the model(s) to be used for locating the element using the `locator`.

    Returns:
        Point: The coordinates of a point on the element, usually the center of the element, as a tuple (x, y).

    Raises:
        ValueError: If the arguments are not of the correct type.
        ElementNotFoundError: If no element can be found.

    Example:
        ```python
        from askui import VisionAgent

        with VisionAgent() as agent:
            point = agent.locate("Submit button")
            print(f"Element found at coordinates: {point}")
        ```
    """
    ...
```
1 change: 1 addition & 0 deletions .nvmrc
@@ -0,0 +1 @@
22
1 change: 0 additions & 1 deletion .vscode/extensions.json
@@ -15,7 +15,6 @@
"tamasfe.even-better-toml",
"visualstudioexptteam.vscodeintellicode",
"dongli.python-preview",
"mintlify.document",
"kaih2o.python-resource-monitor",
"littlefoxteam.vscode-python-test-adapter",
"almenon.arepl"
33 changes: 27 additions & 6 deletions README.md
@@ -284,32 +284,48 @@ You can create and use your own models by subclassing the `ActModel` (used for `
Here's how to create and use custom models:

```python
import functools
from askui import (
    ActModel,
    GetModel,
    LocateModel,
    Locator,
    ImageSource,
    MessageParam,
    ModelComposition,
    ModelRegistry,
    OnMessageCb,
    Point,
    ResponseSchema,
    VisionAgent,
)
from typing import Type
from typing_extensions import override

# Define custom models
class MyActModel(ActModel):
    # Signature before this change: def act(self, goal: str, model_choice: str) -> None
    @override
    def act(
        self,
        messages: list[MessageParam],
        model_choice: str,
        on_message: OnMessageCb | None = None,
    ) -> None:
        # Implement custom act logic, e.g.:
        # - Use a different AI model
        # - Implement custom business logic
        # - Call external services
        if len(messages) > 0:
            goal = messages[0].content
            print(f"Custom act model executing goal: {goal}")
        else:
            error_msg = "No messages provided"
            raise ValueError(error_msg)

# Because Python supports multiple inheritance, we can subclass both `GetModel` and `LocateModel` (and even `ActModel`)
# to create a model that can both get and locate elements.
class MyGetAndLocateModel(GetModel, LocateModel):
    @override
    def get(
        self,
        query: str,
@@ -324,6 +340,7 @@ class MyGetAndLocateModel(GetModel, LocateModel):
        return f"Custom response to query: {query}"


    @override
    def locate(
        self,
        locator: str | Locator,
@@ -366,11 +383,15 @@ You can also use model factories if you need to create models dynamically:

```python
class DynamicActModel(ActModel):
    # Signature before this change: def act(self, goal: str, model_choice: str) -> None
    @override
    def act(
        self,
        messages: list[MessageParam],
        model_choice: str,
        on_message: OnMessageCb | None = None,
    ) -> None:
        # Use api_key in implementation
        pass


# going to be called each time model is chosen using `model` parameter
def create_custom_model(api_key: str) -> ActModel:
    return DynamicActModel()
@@ -410,7 +431,7 @@ The controller for the operating system.

```python
agent.tools.os.click("left", 2) # clicking
agent.tools.os.mouse_move(100, 100) # mouse movement (renamed from `mouse` in this change)
agent.tools.os.keyboard_tap("v", modifier_keys=["control"]) # Paste
# and many more
```
7 changes: 0 additions & 7 deletions act.py

This file was deleted.
