Skip to content

AI providers, follow-up context, auto-update & release automation#2

Merged
opensourcebharat merged 18 commits into
mainfrom
feat/ai-integration
May 29, 2026
Merged

AI providers, follow-up context, auto-update & release automation#2
opensourcebharat merged 18 commits into
mainfrom
feat/ai-integration

Conversation

@opensourcebharat
Copy link
Copy Markdown
Owner

@opensourcebharat opensourcebharat commented May 29, 2026

Summary

  • AI fixes: prompts now solve (not just describe) image snips; fixed first "Ask AI" opening an empty chat; follow-up questions now retain full
    conversation context (image/OCR carried across turns until a new snip).
    • New providers: added DeepSeek (text reasoning) and Sarvam AI (Indic-strong Vision OCR → LLM solve); confirmed local→cloud fallback routing.
    • Auto-update: electron-updater from GitHub Releases — background download, install on quit, with a Settings → Updates panel (toggle, check,
      restart-to-update). Unsigned macOS falls back to a manual link.
    • Release automation: tag-driven npm run release[:minor|:major] (bumps version, rolls CHANGELOG, commits, tags, pushes) → release.yml builds
      macOS/Windows/Linux and publishes to GitHub Releases (draft → public). macOS .zip target added; signing wired to optional CI secrets.
    • Docs: README (AI + Updates), new docs/AI.md, rewritten RELEASING.md, CHANGELOG.

Type of change

  • feat — new user-visible feature
  • fix — bug fix
  • docs — documentation only
  • refactor — code change with no behaviour change
  • perf — performance improvement
  • chore — build / tooling / dependency bump
  • style — formatting / no behaviour change

How was this tested?

Screenshots / recording

Changelog

  • Added an entry under [Unreleased] in CHANGELOG.md
  • N/A — not user-visible

Checklist

Three coordinated improvements driven by user feedback that small
handwriting strokes were blobby, the arrow looked unfinished, and the
chrome felt dated.

Stroke engine — fine handwriting now works:
- perfect-freehand pencil thinning 0.08 → 0.04 and streamline 0.26
  → 0.18 so sub-pixel widths stop being pinched out by outline
  smoothing.
- Pencil width presets shifted to [0.5, 1, 2, 3, 6] and default
  pencil width dropped from 3 to 2 (DEFAULT_SETTINGS + persistence
  defaults).
- Canvas DPR floored at 2× on both LiveLayer and CommittedLayer so
  strokes stay crisp on classroom IFPs / external 96-DPI monitors
  that report devicePixelRatio = 1.

Arrow tool — original parametric geometry:
- Head length scales with both arrow length and stroke width, capped
  at 45% of total length, floored at 12px.
- Slender 0.72 aspect ratio + swept-back chevron notch read as a
  designed arrowhead instead of a flat-based triangle.
- Shaft ends at the notch, eliminating the round-cap blob that small
  arrows used to show at the join.

Toolbar UI — Phosphor-inspired refresh:
- All toolbar icons redrawn in the Phosphor Icons regular-weight
  visual language: 1.4 stroke (was 1.8), simpler geometry, no
  decorative ferrules / ribs. Fresh hand-drawn SVGs, not verbatim
  copies of any third-party set.
- Tool / action buttons tightened from 30×30 to 26×26.
- Color swatches: 18px circles → 22px rounded squares with 2px
  border; hover scale 1.15 → 1.08 for a calmer feel.
When macOS Screen Recording was previously denied, the screenshot
button silently did nothing — desktopCapturer.getSources() returned
an empty array, capture exited at the first null branch, the
renderer's 8-second IPC handler timed out, and the user had no
feedback. This change closes that loop end-to-end across all three
desktop OSes.

Permission flow:
- Main process never preflight-bails. On macOS 'granted' or
  'not-determined' the capture proceeds, letting the OS surface its
  native first-run prompt where applicable. On Wayland Linux the
  xdg-desktop-portal natively shows 'Allow once / Allow always / Deny'
  at call time — exactly the granular UX the OS gives us.
- On macOS 'denied', a new PermissionModal opens in the toolbar with
  'Open System Settings' (deep-links to Privacy & Security → Screen
  Recording) and 'Recheck' buttons. The modal auto-rechecks when the
  toolbar window regains focus, so granting in Settings + alt-tab
  back closes the modal and re-runs the pending capture automatically.
- New permissions:needed / permissions:status IPC channels broadcast
  state to every renderer.

Save destination:
- New persisted fields saveDir + alwaysAskSavePath flow through the
  hub like other settings.
- First save shows the OS save dialog (default ~/Pictures/Lekhini/).
  The chosen folder is remembered; subsequent saves auto-write to
  <folder>/lekhini-YYYY-MM-DD-HHMMSS.png with no dialog.
- Settings panel gains a 'File save' section: an 'Always ask where to
  save' toggle and a path button that opens a folder picker.
- fs.writeFileSync wrapped in try/catch; failures emit capture:error.

Renderer feedback:
- New Toast component (success / error / info) renders a stack
  bottom-right with a Reveal action that calls shell.showItemInFolder.
- Success path: 'Saved to ~/Pictures/Lekhini/lekhini-…png [Reveal]'.
- Error path: surfaces the underlying message; auto-closes at 6s.

Honest limits:
- macOS cannot programmatically re-prompt once denied — only the
  Settings-deep-link + focus-recheck pattern is available, which is
  what reputable Mac capture apps already do.
- Windows + X11 have no permission gate to surface; capture just
  works there. The Wayland portal owns its own UI.
PermissionModal and Toast render via solid-js/web's <Portal>, which
mounts them into document.body — outside the .bar element where all
the theme CSS variables (--bg, --text, --border, …) are defined. The
portaled content fell through to browser defaults (black text on
transparent), making the modal title and toast message effectively
invisible against the dark toolbar window background.

Two-line fix:
- Duplicate the existing 'dark' and 'light' theme token blocks to also
  apply on the body selector, so portal content inherits the same
  vars as .bar.
- Mirror hub().theme onto document.body.dataset.theme via createEffect
  so the body[data-theme=light] override fires when the user toggles
  light mode.

No JS API surface change, no behavioural change for non-portal UI.
…itlebar hint

The floating PermissionModal and Toast looked out of place inside the
tiny toolbar window — the modal cramped the small viewport and the
bottom-right toast was visually disconnected from the rest of the
chrome. Both replaced with UI that feels native to the toolbar:

- New status side panel reuses the .settings-panel dock + chrome.
  Slots into the same layout slot as Settings, with the same flex
  re-direction in vertical mode and the same auto-resize behaviour.
  Shows one of two views:
    * permission — explains screen-recording is off, with
      Open System Settings + Recheck buttons. Auto-rechecks when
      the toolbar window regains focus (typical path after the user
      toggles Lekhini on in Settings + alt-tabs back).
    * error — surfaces save errors with Pick new folder + Dismiss.
  Mutually exclusive with the Settings panel (settings takes priority;
  opening the status panel closes settings).

- Success path no longer pops a toast. Instead the existing titlebar
  hint area briefly shows 'Saved · ~/.../lekhini-…png' in gold with
  an underline; clicking it reveals the file in Finder/Explorer.
  Auto-clears after 4 seconds. Same mechanism the toolbar already
  uses for hover-hints, so it doesn't introduce a new visual concept.

- Deleted: PermissionModal.tsx, Toast.tsx, all .perm-* / .toast-*
  CSS, the body-theme-mirroring effect (only needed because of
  Portals, which are gone), and the body theme-token duplicates.
Two real bugs from the previous refactor:

1. Side panel rendered inside the 88px-wide v-mode toolbar window,
   so the user saw nothing happen when they clicked screenshot with
   permission denied — the panel was off-screen behind itself. The
   bar window in vertical mode only grows when main.ts's onChange
   sees settingsOpen flip; status-panel state was renderer-local so
   main never knew to call resizeToolbar.

   Fix: add transient `statusPanelOpen` to HubState. Renderer mirrors
   `panelKind() !== null` into it via a createEffect. main.ts treats
   either settingsOpen or statusPanelOpen as 'side panel open' for
   resize purposes. reportContentSize in the renderer (height
   measurement) now also accounts for the status panel. The two
   panels are mutually exclusive in the same dock slot — hub.patch
   force-closes settings when status opens.

2. permissions.ts onFocusRecheck called app.once('browser-window-focus',
   handler) on every invocation without clearing the prior listener.
   Every denied screenshot attempt stacked another handler; after 11
   clicks Node emitted MaxListenersExceededWarning. Fix: keep one
   activeFocusHandler reference; subsequent calls just swap the
   pending callback in-place rather than registering a new listener.
systemPreferences.getMediaAccessStatus('screen') returns a value the
OS caches per-process and lazily refreshes only on the next actual
capture-API call. After the user toggled Lekhini on under Privacy →
Screen Recording, the panel kept reporting 'Still off' because the
plain Recheck only re-read the same stale cache value.

New deepRecheck() in permissions.ts actually probes
desktopCapturer.getSources({ thumbnailSize: 1×1 }) — that API call
is what forces macOS to re-evaluate TCC and update the cache. If
the probe returns any sources, permission is real and we report
'granted'.

Wired up:
- New 'permissions:deep-recheck' IPC; renderer's Recheck button now
  uses it instead of the cached check().
- onFocusRecheck (the auto-recheck-on-focus-return path) also goes
  through deepRecheck, so alt-tabbing back from System Settings
  works without a manual click.
- notifyStatus() accepts an explicit override so renderers receive
  deepRecheck's fresh view, not the cache that may still say denied.
- New 'app:relaunch' IPC (app.relaunch() + app.exit(0)) and a
  'Relaunch' button in the permission panel as the escape hatch for
  the rare macOS cases where even the probe can't see the change.
- Panel hint updated: 'macOS sometimes only sees the change after a
  restart — click Relaunch'.
User reported that even after granting Screen Recording in macOS
System Settings, the panel kept showing 'Still off' on focus-return
auto-recheck. The console showed '[pen] deepRecheck capture probe
failed Failed to get sources.' — desktopCapturer.getSources() was
throwing outright, not just returning an empty array.

This is a Chromium-in-Electron quirk on macOS: when the process
starts with Screen Recording denied, Chromium initialises its capture
pipeline in 'can't capture' mode and there's no way to refresh it
from inside the running process. Only a restart recovers. Dev-mode
amplifies the issue because the running process bundle id is
'com.github.Electron', not the production 'com.opensourcebharat.lekhini',
so the TCC entry the user toggles may not even map to the process
that's checking.

Changes:
- deepRecheck() now returns { screen, probeError } — probeError is
  true when getSources() threw (vs returned no sources). That's the
  signal that the process is permanently stuck this session.
- notifyStatus + onStatus carry probeError end-to-end so the renderer
  knows the recovery story.
- Permission panel reshuffles when probeError = true:
    * 'Relaunch Lekhini' becomes the gold primary button.
    * 'Open System Settings' demotes to secondary.
    * Hint turns red-tinted and reads 'macOS can't refresh the
      permission for a running process — Click Relaunch'.
- New 'packaged' field on app:info; renderer shows a small italic
  footnote in dev mode explaining the quirk is dev-specific.
Three user-reported issues addressed in one pass since they're all in
the screenshot / snip surface:

Toolbar in saved screenshots:
- setContentProtection(true) on the toolbar window. macOS uses
  NSWindowSharingNone and Windows uses WDA_EXCLUDEFROMCAPTURE, so
  desktopCapturer (ours or any third-party recorder) skips it. The
  PNG now contains the underlying app + the user's annotations and
  nothing of Lekhini's chrome.

Snip drag border invisible:
- The snip tool was passing a generic 'region' item to setDraft, and
  drawRegion painted a 1px white dashed line — invisible against most
  desktops. Added an opt-in 'marchingAnts' flag on RegionShape and a
  two-pass black-underneath / white-on-top dashed render path keyed
  on the flag, mirroring the committed snip-selection style. The
  user-facing rectangle tool still gets its own coloured outline.

No action UX after snip completes:
- New SnipActions component renders a floating menu pinned to the
  bottom-right of the completed selection (auto-flips above the rect
  if it would clip the bottom edge of the display).
    [ Copy ]  [ Save ]  [ ✕ ]
- Copy → existing pen.snip.copy() (clipboard), then clears the
  selection. User pastes wherever.
- Save → existing pen.relay.screenshot(), which already respects the
  active selection and goes through the remember-folder save flow.
- ✕ → just clears.
- Menu is only mounted while drawMode is on AND the snip tool is the
  active tool — otherwise the overlay window is click-through and the
  menu would render but not respond. Hiding it keeps the screen clean.
After a snip action completes the user wants the overlay to step out
of the way — clicking Copy is almost always followed by ⌘V into
another app, and Save just means "I'm done with that selection."
Staying in drawMode meant the overlay kept intercepting the next
click and the user had to manually toggle pen mode off first.

After Copy or Save we now patch hub.drawMode=false. The snip tool
itself stays selected, so re-enabling drawMode (⌘⇧D or the status
dot) jumps straight into another selection without a tool change.
Cancel still just clears the rect without changing drawMode — the
user explicitly said 'never mind' but probably wants to keep drawing.
User asked for snip Copy/Save to feel quick. Roughly halves the
end-to-end latency from selection-release to clipboard-ready by
cutting the base64 round-trips and the heavier image-decode path.

End-to-end IPC is now Uint8Array all the way:
- main: desktopCapturer.thumbnail.toPNG() returns a Buffer; sent
  directly to the overlay via overlay:snip / overlay:screenshot as
  { png: Buffer }. No more toDataURL → base64 string → 33% larger
  payload → renderer base64 decode → HTMLImageElement.src round-trip.
- renderer: createImageBitmap(blob) decodes off the main thread and
  is materially faster than 'new Image; img.src = dataURL' for big
  full-screen PNGs. Composite uses the bitmap, then canvas.toBlob
  (async) → arrayBuffer for the result PNG.
- main: receives Uint8Array from capture:snip:result /
  capture:screenshot:result, Buffer.from(uint8) — skips the base64
  decode entirely. Clipboard write and fs.writeFile both work
  straight from the Buffer.

Snip Copy UX:
- Button shows 'Copying…' and disables itself + the others while the
  await is in flight, so the user can't double-click and knows
  something is happening. Save remains fire-and-forget (it goes
  through the save dialog or remembered folder and confirms via the
  toolbar's gold reveal hint).

PNG format unchanged — still lossless, still universally
clipboard-compatible. The savings come from skipping the base64
detour, not from quality loss.
Single source-of-truth path. The user drops their PNG at build/icon.png
and it flows to:

- App icon (macOS .icns, Windows .ico, Linux PNG) — electron-builder
  auto-generates platform formats from the PNG. Made explicit in
  electron-builder.yml under the top-level `icon` field.
- In-app Logo() component (toolbar titlebar, vertical brand strip,
  about card, minimized pill) — replaces the existing inline SVG
  with an <img> tag pointing at the PNG.
- README header — centered logo above the title via an HTML <p>
  block; renders on GitHub once the file is committed.

Build doesn't break before the file exists. icons.tsx uses
import.meta.glob (added vite/client to tsconfig types so TS knows
about it), which returns an empty object when the file is missing.
In that state the existing gold-pencil SVG renders as fallback so
the toolbar still has a logo while the user prepares the real one.

CSS widened to target both `svg` and `img` inside the .v-brand,
.about-logo, and .mini-logo wrappers so the larger logo variants
size correctly regardless of which renders.

Recommended file format: 1024×1024 PNG, transparent background,
sRGB.
The titlebar was crowded — logo, status-dot, hint text, settings,
plus window controls. The hint and settings are also things users
glance at rarely, so they don't deserve the prime top-center slot.

Layout now:
- Titlebar (top): logo (enlarged) + window controls only.
- Tools + actions (middle): unchanged.
- Color swatches (v-mode pinned strip): unchanged.
- NEW bar-footer at the bottom:
    [ hover-hint text · · · · · ] [ status-dot ] [ settings ]
  Hover-hint flexes left, dot + settings pinned right. In v-mode the
  footer stacks (hint above, controls below) since the 88px-wide bar
  can't fit both in a row.

Logo treatment:
- h-mode logo: 22 → 32px with a subtle scale-on-hover transition and
  7px border-radius (looks cleaner with the user's PNG logo).
- v-mode logo: 24 → 40px now that the v-brand strip carries only the
  logo (settings + dot moved out).
- Same hover transition in both modes.

Reveal-on-click (gold underlined hint after a saved screenshot) and
the live tool-name fallback both now live in the footer's hint slot.
Helpers consolidated — brandLine() and vertHintLine() merged into a
single footerHintLine().
Two issues with the minimized state:

1. Restore was unreliable. .mini had -webkit-app-region: drag covering
   the whole 56px pill, with only a 30px no-drag .mini-logo span
   inside. Click events on drag regions are eaten by Electron, so
   only clicks landing exactly on the small logo restored — clicking
   anywhere else on the pill did nothing.

2. Logo rendered awkwardly — 30px image inside the pill plus a 22×3
   pseudo-element bar at top:5 (an old drag-indicator hint that
   doesn't actually drag). The user described it as 'half logo'.

Fix:
- .mini-logo is now a button covering the whole pill except a 5px
  drag border around the edge — click anywhere except the border
  restores, while the border still drags the window.
- The misleading ::after horizontal bar is gone.
- Logo bumped 30→36px, object-fit: contain, pointer-events: none on
  the image so the parent button owns clicks (no aspect-ratio
  distortion if the PNG isn't perfectly square).
- Subtle hover background + scale on the inner button so the click
  target is obvious.
- MIN_SIZE 56→64 so the 36px logo + drag border has breathing room.
Two issues:

1. The Phosphor-refresh pass made pen and pencil look almost identical
   — same diagonal-pencil silhouette with a small ferrule line. Added
   distinguishing details that read at 22 px:
     - Pencil gets a filled rectangular eraser cap at the wide end.
     - Pen gets a filled triangular nib protruding at the writing tip
       and a slightly bulkier barrel.

2. After collapsing and restoring, the bottom footer was clipped
   below the window edge until the next state change (tool selection,
   theme toggle, etc.) triggered reportContentSize to remeasure.
   Root cause: TOOLBAR_SIZES (the static post-restore window size in
   shared/constants.ts) was set before the footer existed and was
   shorter than the actual content. The renderer eventually catches
   up via setContentSize, but until then the footer sits outside the
   visible window. Fix two ways:
     - Bump TOOLBAR_SIZES (h.h 102→140, v.h 480→560) to cover the
       worst-case content height including the footer, so the window
       is generous from the moment it restores.
     - Add a second RAF after the first inside the resize effect, so
       transitions that unmount+remount bar-main (notably restore
       from minimised) get a follow-up measurement on the frame
       after layout fully settles. The window can shrink past these
       generous defaults via setContentSize — they're a floor, not
       a target.
Foundation layer for the upcoming Ask-AI feature. Renderer UI lands
in follow-up commits.

What's in:

Provider abstraction (src/main/ai/):
- types.ts — ProviderAdapter interface (async-iterable text deltas),
  AbortSignal honoured for cancellation.
- registry.ts — adapter lookup + vision-capable model dropdown
  options + console URLs for the user to grab a key.
- anthropic.ts — Claude via @anthropic-ai/sdk, messages.stream with
  base64 image on first user turn; PNG mime coerced into the SDK's
  literal union.
- openai.ts — ChatGPT via openai SDK, chat.completions.create stream
  with image_url data URL.
- gemini.ts — Gemini via @google/generative-ai, generateContentStream
  with inlineData part.

Credentials (src/main/ai/credentials.ts):
- safeStorage-encrypted API keys, ciphertext in
  <userData>/ai-credentials.json with 0600 perms.
- Falls back to in-memory store with a console warning if
  safeStorage.isEncryptionAvailable() is false (rare).
- Keys NEVER appear in PersistedState — renderer only sees a
  configured: true boolean.

IPC layer (src/main/ai/ipc.ts):
- ai:set-key, ai:delete-key, ai:get-status,
  ai:test-connection (1-char probe), ai:ask (returns requestId,
  streams ai:chunk events), ai:cancel (AbortController per request).

Shared/persistence/hub plumbing:
- New ProviderId, AiStatus, ChatTurn, AskInput, StreamChunk,
  ConnectionTestResult types.
- HubStateUpdate gains chatOpen, aiActiveProvider, aiActiveModel,
  aiProfilePrompts. PersistedState mirrors the three persistent
  fields (chatOpen is transient).
- Hub patch handler enforces mutual exclusion between settings,
  status panel, AND the new chat panel (all share the dock slot).
- main.ts onChange resizes the toolbar window when chatOpen flips,
  same path as settingsOpen/statusPanelOpen.

Profile prompts (src/shared/profiles.ts):
- Profile interface gains aiPrompt: string.
- Trader → chart-analysis prompt; Teacher → student-explainer;
  General → concise observation. resolveAiPrompt() helper picks the
  user override from hub.aiProfilePrompts when present, else falls
  back to the profile default.

Preload + penApi.d.ts:
- pen.ai.{setKey, deleteKey, getStatus, testConnection, ask,
  cancel, onChunk} exposed to the renderer.

Renderer UI (Settings AI section, ChatPanel, SnipActions button,
CSS) follows in subsequent commits.
Renderer UI half of the AI integration. Wires the scaffolding from
the previous commit into the user-facing surface.

Settings → AI section (toolbar Settings panel):
- Provider dropdown (Anthropic / OpenAI / Gemini, marked "configured"
  when a key is saved).
- Model dropdown filtered by provider.
- Masked API key input + Save / Test / Delete buttons. Test runs a
  1-char probe with cancel-on-first-chunk to confirm round-trip
  without burning tokens.
- ● Configured badge + last-test result (ok / fail with latency).
- "Get a key →" link to the provider's console.
- Per-profile prompt textareas (Trader / Teacher / General) with
  "Reset" links that restore the built-in default.
- Disclosure paragraph: images go directly to the chosen provider;
  Lekhini doesn't log or proxy.

ChatPanel (new component, src/renderer/toolbar/ChatPanel.tsx):
- Reuses the .settings-panel chrome — docks in the same slot as
  Settings and the Status panel (mutex enforced in hub.patch).
- Header with provider · model badge and close button.
- Image thumbnail of the snip (Blob → object URL).
- Streaming message list with user / assistant bubbles. Assistant
  responses render through `marked` for proper markdown (code blocks,
  lists, bold/italic, links).
- Composer textarea with Send (⌘↩) and Esc-to-close. Send disables
  while a stream is in flight; replaces itself with a red Cancel
  button that calls pen.ai.cancel(requestId).

SnipActions Ask AI button (overlay):
- New 'Ask AI' button between Save and ✕ — only rendered when
  hub.aiActiveProvider is non-null (i.e. user has configured a key
  and we know which provider to use).
- Menu widens from 168 to 232 px to fit the extra button without
  wrapping.
- Click → pen.snip.askAi(profile). Main captures + composites the
  snip (same path as Save/Copy), then broadcasts chat:session to all
  renderers. Toolbar's ChatPanel picks it up and auto-fires the
  first AI turn.
- Drawmode stays on (unlike Save/Copy) — the chat is in the toolbar
  window so the overlay can stay interactive for more snipping.

Cross-window flow:
- New 'snip:ask-ai' IPC (renderer → main) and 'chat:start' IPC
  (alternative renderer → main with raw bytes; not used by
  SnipActions but available for future entry points).
- 'chat:session' broadcast (main → all renderers) carries
  { sessionId, png, mime, profile }.
- startChatSession() helper exported from ai/ipc.ts; used by both
  the chat:start IPC handler and capture.ts's askAiAboutFocusedSnip.

CSS:
- .chat-panel, .chat-thumb-wrap, .chat-bubble-user/.assistant,
  .chat-typing animation, .chat-markdown tight typography for the
  narrow panel, .chat-composer with auto-grow textarea.
- .ai-section settings styles: .ai-select (native dropdown styled),
  .ai-key-input (monospace masked), .ai-prompt-textarea,
  .ai-badge-configured (green pill), .ai-test-result.ok/.fail,
  .ai-disclosure italic footnote.
- Overlay's snip-action-ai button: muted purple accent so it reads
  as a different class than Copy/Save.
Conflicts resolved in favor of this branch (ours) across the AI
integration, follow-up context, auto-update, and release-automation
work. Non-conflicting changes from main are kept.
@opensourcebharat opensourcebharat merged commit 3a80b1f into main May 29, 2026
3 checks passed
@opensourcebharat opensourcebharat deleted the feat/ai-integration branch May 29, 2026 21:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant