Slow response times #28

@blevels

Which model?

Gemma 4 31B

Mac model + RAM

M1 Max 64GB

What happened?

Thanks for creating this repository with a capability I have wanted for some time. I have one issue that I have not been able to resolve (yet). Any assistance you can provide would be greatly appreciated.

When prompting via the claude CLI, responses are very slow (78.6s / 3m 23s), and the response doesn't return to the claude CLI prompt immediately even though I can see it in the server logs. BTW, to get things to run properly without being prompted to log in to Anthropic, I added the following to my ~/.claude.json:

"primaryApiKey": "sk-local"

Any ideas why the responses are very slow? I tried adding the following setting to my ~/.claude/settings.json:

"CLAUDE_CODE_ATTRIBUTION_HEADER" : "0"

This did not make the responses any faster. In the posted screenshot you can see that the overall response time was 3m 23s. In the logs, you can see that the response was generated at 06:55:23; however, the last log entry shows that it was not returned to the CLI until 06:57:03. I also notice the fan on my M1 Mac going crazy. I wonder if this machine has enough power to run the model.
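To put a number on the gap between those two log timestamps, here is a minimal sketch. It assumes both timestamps fall on the same day and that the 3m 23s total refers to the same request; the helper name `seconds_between` is only illustrative.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%H:%M:%S"  # assumed format of the server log timestamps


def seconds_between(start: str, end: str) -> float:
    """Return the number of seconds from start to end (same day assumed)."""
    t0 = datetime.strptime(start, LOG_TIME_FORMAT)
    t1 = datetime.strptime(end, LOG_TIME_FORMAT)
    return (t1 - t0).total_seconds()


# Timestamps taken from the report above.
delay = seconds_between("06:55:23", "06:57:03")

# Total wall time reported by the CLI: 3m 23s.
total = 3 * 60 + 23

print(f"Delay after generation: {delay:.0f}s of {total}s total "
      f"({delay / total:.0%} of the wall time)")
```

If those assumptions hold, a large share of the wall time falls after the server logged the generated response, which may help narrow down where the slowdown occurs.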

Image

Relevant logs or error output

Labels: bug