Add gemini-local bridge topos and MacBook-Pro-2.local configs#6

Open
lyonsno wants to merge 24 commits into main from cc/slime-0331


@lyonsno lyonsno commented Apr 2, 2026

Captured current session topos and box-specific configurations for the local Gemini model bridge.

lyonsno and others added 24 commits March 31, 2026 10:02
Handle tool_call events in streaming loops and accumulate assistant
content across all rounds. Add a 'tool…' visual progress indicator
to the overlay and ensure final completion handles the concatenated
response. Add the TestToolState suite for verification.

Co-Authored-By: Gemini CLI 1.0 (abbdb98d-5a95-4e39-bd74-e22fd063398e) <noreply@google.com>
mlx_audio already supports Kokoro, Qwen3-TTS, VibeVoice, and Irodori
through the same load()/generate() interface.  The only thing preventing
Spoke from using them was that speak() hardcoded Voxtral-specific kwargs
(temperature, top_k, top_p) in the generate() call.

Add _generate_kwargs() which introspects the model's generate() signature
and forwards only the params it declares.  Models with **kwargs still get
everything; models with strict signatures (Kokoro, Irodori) only get what
they accept.
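
The signature-filtering idea above can be sketched in a few lines. This is a minimal illustration, not the actual _generate_kwargs() from this PR: the helper name filter_generate_kwargs and both stand-in generate functions are hypothetical, and only the standard-library inspect module is used.

```python
import inspect
from typing import Any, Callable


def filter_generate_kwargs(
    generate: Callable[..., Any], candidates: dict[str, Any]
) -> dict[str, Any]:
    """Return only the candidate kwargs that `generate` can accept."""
    params = inspect.signature(generate).parameters
    # A **kwargs parameter means the callable accepts any keyword argument.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(candidates)
    accepted = {
        name
        for name, p in params.items()
        if p.kind
        in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    return {k: v for k, v in candidates.items() if k in accepted}


# Hypothetical stand-ins for a strict and a permissive generate() signature.
def strict_generate(text, voice=None, temperature=0.7):
    return text


def permissive_generate(text, **kwargs):
    return text


sampling = {"temperature": 0.8, "top_k": 50, "top_p": 0.95}
strict_kwargs = filter_generate_kwargs(strict_generate, sampling)
permissive_kwargs = filter_generate_kwargs(permissive_generate, sampling)
```

With these stand-ins, the strict signature receives only temperature, while the permissive one receives all three sampling parameters.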

To switch models, set SPOKE_TTS_MODEL to any mlx-community model ID:
  - mlx-community/Kokoro-82M-bf16 (82M, runs anywhere)
  - mlx-community/Qwen3-TTS-12Hz-1.7B-CustomVoice-8bit (voice cloning)
  - mlx-community/Qwen3-TTS-12Hz-0.6B-CustomVoice-8bit (lighter)
  - mlx-community/VibeVoice-Realtime-0.5B-fp16 (real-time, sidecar)
  - mlx-community/Voxtral-4B-TTS-2603-mlx-4bit (current default)

Also removes the Voxtral-specific import guard in tts_load() — mlx_audio
handles model dispatch internally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a "TTS" submenu to the menubar model picker with five models:
- Voxtral 4B (current default)
- Qwen3-TTS 1.7B CustomVoice
- Qwen3-TTS 0.6B CustomVoice
- VibeVoice 0.5B Realtime
- Kokoro 82M

Selection is persisted to model_preferences.json and triggers a
relaunch (same pattern as Whisper/assistant model switching).
The saved preference overrides SPOKE_TTS_MODEL env var at startup.

The TTS submenu only appears when TTS is enabled (SPOKE_TTS_VOICE set).
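
A rough sketch of the precedence described above (saved preference first, then SPOKE_TTS_MODEL, then the default). The function names and the "tts_model" JSON key are assumptions for illustration; the real layout of model_preferences.json is not shown in this PR.

```python
import json
import tempfile
from pathlib import Path

DEFAULT_TTS_MODEL = "mlx-community/Voxtral-4B-TTS-2603-mlx-4bit"


def resolve_tts_model(prefs_path: Path, env: dict) -> str:
    """Pick the TTS model: saved preference, then env var, then default."""
    try:
        prefs = json.loads(prefs_path.read_text())
    except (OSError, ValueError):
        # Missing or unreadable preferences fall through to the env var.
        prefs = {}
    if not isinstance(prefs, dict):
        prefs = {}
    # "tts_model" is an assumed key name, not confirmed by the source.
    return prefs.get("tts_model") or env.get("SPOKE_TTS_MODEL") or DEFAULT_TTS_MODEL


def tts_menu_enabled(env: dict) -> bool:
    """The TTS submenu is shown only when SPOKE_TTS_VOICE is set."""
    return bool(env.get("SPOKE_TTS_VOICE"))


with tempfile.TemporaryDirectory() as tmp:
    prefs_file = Path(tmp) / "model_preferences.json"
    from_default = resolve_tts_model(prefs_file, {})
    from_env = resolve_tts_model(
        prefs_file, {"SPOKE_TTS_MODEL": "mlx-community/Kokoro-82M-bf16"}
    )
    prefs_file.write_text(
        json.dumps({"tts_model": "mlx-community/VibeVoice-Realtime-0.5B-fp16"})
    )
    from_prefs = resolve_tts_model(
        prefs_file, {"SPOKE_TTS_MODEL": "mlx-community/Kokoro-82M-bf16"}
    )
```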

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Enter was passing through to the underlying app when the assistant
overlay was showing, causing accidental keystrokes. The input tap
only suppressed Enter during tray mode or LATCHED state, but the
command overlay sits in IDLE state.

Add a command_overlay_active flag on the detector, set when the
overlay shows (command response, recall, error), cleared when it's
dismissed (cancel_dismiss, new hold start). Enter keyDown events
are suppressed while the flag is set.
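
The suppression rule can be pictured as a single predicate. This is a simplified sketch: the Detector class, HoldState enum, and tray_active field here are hypothetical stand-ins, and only command_overlay_active is a name taken from this commit.

```python
from dataclasses import dataclass
from enum import Enum, auto


class HoldState(Enum):
    IDLE = auto()
    LATCHED = auto()


@dataclass
class Detector:
    state: HoldState = HoldState.IDLE
    tray_active: bool = False  # hypothetical name for "tray mode"
    command_overlay_active: bool = False

    def should_suppress_enter(self) -> bool:
        """Swallow Enter keyDown in tray mode, LATCHED state, or while the overlay is up."""
        return (
            self.tray_active
            or self.state is HoldState.LATCHED
            or self.command_overlay_active
        )


detector = Detector()
before_overlay = detector.should_suppress_enter()  # IDLE, no overlay
detector.command_overlay_active = True             # overlay shown
during_overlay = detector.should_suppress_enter()
detector.command_overlay_active = False            # overlay dismissed
after_dismiss = detector.should_suppress_enter()
```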

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Instead of requiring a full spacebar hold to dismiss the assistant
overlay, trigger cancel_dismiss() on the first spacebar keyDown
when the overlay is visible. The event tap fires the dismiss callback
which marshals to the main thread for the AppKit animation. Also
cancels any in-flight TTS playback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The empty-recording path was re-dismissing the overlay even after
the instant-press handler already triggered cancel_dismiss(), because
_visible stays True during the dismiss animation. Now the empty-
recording dismiss is gated on command_overlay_active still being True,
which the instant-press handler already cleared.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Symmetric with instant dismiss: when the command overlay is not
visible and the assistant has history, an empty spacebar tap (hold
too short to produce audio) recalls the last response. No Enter
key required — the old Enter+spacebar timing was always finicky.

Flow is now: tap to dismiss, tap again to recall, hold to dictate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Every command_overlay_active set/clear now logs with the reason
(hold start, dismiss, recall, command started, etc.) so we can
trace the bad state where the overlay stops appearing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Enter+empty and plain-empty paths were checking _visible
(animation state) to decide dismiss vs recall. During the 750ms
dismiss animation, _visible stays True, so a quick second tap
would re-dismiss instead of recalling.

Now both paths use command_overlay_active (our flag) which is
cleared instantly on dismiss and set on recall/show. This makes
the toggle reliable regardless of animation timing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
A spacebar tap was causing dismiss (on keyDown) then immediate
recall (on empty-recording path) within the same gesture. Add a
_command_overlay_just_dismissed flag set by the instant dismiss
handler and checked by the empty-recording recall path. The flag
is cleared at the start of the next hold, so the next tap will
recall normally.
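
The tap-to-dismiss, tap-to-recall cycle with the stutter guard can be modeled as a small state machine. This is a simplified sketch under the assumption that a tap is a keyDown followed immediately by the empty-recording path; the OverlayToggle class and its method names are hypothetical, and only the two flag names come from the commits.

```python
class OverlayToggle:
    def __init__(self, has_history: bool = True):
        self.command_overlay_active = False
        self._command_overlay_just_dismissed = False
        self.has_history = has_history
        self.log: list[str] = []

    def on_space_key_down(self) -> None:
        # A new gesture clears the guard left over from the previous tap.
        self._command_overlay_just_dismissed = False
        if self.command_overlay_active:
            # Instant dismiss: clear the flag now, not when the animation ends.
            self.command_overlay_active = False
            self._command_overlay_just_dismissed = True
            self.log.append("dismiss")

    def on_empty_recording(self) -> None:
        # The same tap that dismissed must not immediately recall.
        if self._command_overlay_just_dismissed:
            return
        if not self.command_overlay_active and self.has_history:
            self.command_overlay_active = True
            self.log.append("recall")

    def tap(self) -> None:
        self.on_space_key_down()
        self.on_empty_recording()


toggle = OverlayToggle()
toggle.command_overlay_active = True  # overlay is showing a response
toggle.tap()
toggle.tap()
toggle.tap()
```

Starting from a visible overlay, three taps in this model produce dismiss, recall, dismiss, with no same-tap stutter.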

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The delegate test fixture used MagicMock() for the detector, which
meant command_overlay_active was always truthy (a MagicMock object).
This masked bugs where the flag wasn't being checked or set correctly.

Fix _make_delegate to initialize real boolean values for
command_overlay_active and _command_overlay_just_dismissed.
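
The masking effect is easy to reproduce with unittest.mock alone. In this sketch, overlay_is_active is a hypothetical stand-in for any code that branches on the flag; the point is that an auto-created MagicMock attribute is truthy unless a real boolean is assigned.

```python
from unittest.mock import MagicMock


def overlay_is_active(detector) -> bool:
    """Stand-in for production code that branches on the flag."""
    return bool(detector.command_overlay_active)


# Unconfigured mock: the attribute is itself a MagicMock, which is truthy.
unconfigured = MagicMock()
masked_result = overlay_is_active(unconfigured)

# Fixture-style fix: assign real booleans so tests see the real state.
configured = MagicMock()
configured.command_overlay_active = False
configured._command_overlay_just_dismissed = False
fixed_result = overlay_is_active(configured)
```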

Add TestCommandOverlayDismissRecallCycle with 7 tests covering:
- Enter+empty recall when overlay not active
- Enter+empty dismiss when overlay active
- Full dismiss→recall cycle (the bug that kept recurring)
- Stutter prevention (_just_dismissed blocks same-tap recall)
- Fresh tap recall works
- Hold start clears flags
- commandUtteranceReady_ sets overlay active

Fix existing test_short_shift_enter_hold_dismisses_visible_command_overlay
to set command_overlay_active=True (the authoritative flag) instead of
only _visible=True (the animation state).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When Enter is held during a spacebar tap that's too quick for
RECORDING state, the WAITING release was forwarding a space
character instead of calling _on_hold_end. This meant Enter+tap
only worked for recall if the spacebar was held 400ms+ (past the
hold threshold).

Now Enter+quick-tap routes through _on_hold_end(enter_held=True)
like the shift+quick-tap already does, making recall responsive
regardless of hold duration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
set_response_text was calling set_utterance() then append_token(), which
triggered _update_layout twice: once with only the utterance text (shrinking
the window back to minimum height) then again with the full response (growing
it back). On every commandComplete_ the window visibly flickered between sizes.

Rebuild the utterance + separator + response attributed string in one shot
so _update_layout is called exactly once. Also adds NSMutableAttributedString,
NSShadow, and the string attribute name constants to the fake AppKit in
conftest.py so the single-layout contract can be tested without GUI runtime.
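
The single-layout contract can be illustrated without AppKit by counting layout passes. This sketch uses plain strings in place of attributed strings; the OverlayText class, the separator value, and the two-pass method name are hypothetical.

```python
class OverlayText:
    SEPARATOR = "\n\n"  # hypothetical separator between utterance and response

    def __init__(self):
        self.text = ""
        self.layout_calls = 0

    def _update_layout(self) -> None:
        # Stand-in for the real resize; here it only counts invocations.
        self.layout_calls += 1

    def set_utterance(self, utterance: str) -> None:
        self.text = utterance
        self._update_layout()

    def append_token(self, token: str) -> None:
        self.text += token
        self._update_layout()

    def set_response_text_two_pass(self, utterance: str, response: str) -> None:
        # Old shape: two layout passes, so the window shrinks and then grows.
        self.set_utterance(utterance)
        self.append_token(self.SEPARATOR + response)

    def set_response_text(self, utterance: str, response: str) -> None:
        # New shape: build the full text once, then lay out exactly once.
        self.text = utterance + self.SEPARATOR + response
        self._update_layout()


old_style = OverlayText()
old_style.set_response_text_two_pass("what time is it", "It is noon.")
new_style = OverlayText()
new_style.set_response_text("what time is it", "It is noon.")
```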

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ce. Preserved ridged multifractal and warped tendrils.
…rved ridged multifractal and rational falloff. Added Toxic Green/Hot Pink debug noise layers and 80% peaked mid-range dimming.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bfea4348e4

Comment on lines +1622 to +1623
"commandToolStart:", {"token": token}, False
)

P0: Remove malformed indented line in command stream handler

The tool-call branch in _send_text_as_command contains a stray indented string/call fragment, which causes spoke/__main__.py to fail parsing with an IndentationError before the app can start. This is a universal startup failure (not input-dependent): importing the module fails, so none of the runtime pathways can execute.
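
A quick way to see why this fails at import time rather than at runtime: Python rejects an unexpectedly indented line while compiling the module, before any code runs. The snippet below is a reduced stand-in for the flagged fragment, not the actual contents of spoke/__main__.py; handler and notify are hypothetical names.

```python
GOOD = (
    "def handler(token):\n"
    '    notify("commandToolStart:", {"token": token}, False)\n'
)

# Same idea as the flagged lines: a stray fragment indented deeper than
# the statement before it.
BAD = (
    "def handler(token):\n"
    "    pass\n"
    '        "commandToolStart:", {"token": token}, False\n'
)


def parse_error(source: str):
    """Return the name of the parse error, or None if the source compiles."""
    try:
        compile(source, "<sketch>", "exec")  # compiles only, never executes
    except SyntaxError as exc:  # IndentationError is a SyntaxError subclass
        return type(exc).__name__
    return None


good_result = parse_error(GOOD)
bad_result = parse_error(BAD)
```

Running python -m py_compile on the affected file before pushing would surface the same class of error.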

  continue
  seen.add(model_id)
- options.append((model_id, model_id, True))
+ options.append((model_id, model_id, model_id == selected_model))

P1: Keep assistant menu items enabled when building options

This tuple’s third field is consumed as an enabled flag by MenuBarIcon._build_choice_submenu_item, but here it is set to model_id == selected_model. That disables every non-selected assistant model in the menu, so users cannot switch models from the UI after refresh (and the same pattern is used in seeding).
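
One possible shape for the fix, sketched under the assumption that the tuple is (model_id, label, enabled) as the review describes. The helper names build_model_options and is_checked are hypothetical, and tracking the selected state separately from the enabled flag is an assumption about how the menu could mark the current model.

```python
def build_model_options(model_ids):
    """Return de-duplicated (model_id, label, enabled) tuples, all enabled."""
    seen = set()
    options = []
    for model_id in model_ids:
        if model_id in seen:
            continue
        seen.add(model_id)
        # Third field is the enabled flag: every model stays selectable.
        options.append((model_id, model_id, True))
    return options


def is_checked(model_id, selected_model) -> bool:
    """Selection state is derived separately from the enabled flag."""
    return model_id == selected_model


options = build_model_options(["model-a", "model-b", "model-a"])
checked = [is_checked(model_id, "model-b") for model_id, _, _ in options]
```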

@lyonsno force-pushed the main branch 2 times, most recently from 0b03228 to c5d587d, on April 4, 2026 23:56