
Demo mantis on mantis #8

Merged
LucaVor merged 22 commits into main from demo-mantis-on-mantis
Jun 20, 2026

@LucaVor LucaVor commented Jun 20, 2026

This pull request introduces "Repo Radar," a comprehensive, headless, and scriptable intelligence tool for the Mantis project. It automates the creation of a weekly briefing by aggregating data from GitHub repositories and meeting notes, performing notebook-based delta analyses, synthesizing insights with an agent, and assembling the results into a markdown report. The workflow is designed to degrade gracefully if any stack is unavailable, and its automation covers use cases that the Mantis UI cannot handle.

Key additions and improvements include:

New End-to-End Intelligence Workflow:

  • Added repo_radar.py, a script that orchestrates four phases: (1) building a portfolio of maps from project data, (2) running notebook delta analyses to track changes week-over-week, (3) synthesizing a briefing using an agent, and (4) assembling a markdown report with results and charts. The script is headless, scriptable, and supports scheduled runs.

Documentation:

  • Introduced a detailed README.md for Repo Radar, explaining its purpose, the four-phase workflow, why it cannot be replicated in the Mantis UI, requirements, and instructions for running and configuring the tool via environment variables.

LucaVor and others added 21 commits June 16, 2026 15:09
…m live testing

A flagship example that exercises most of the SDK end-to-end against the project's own data,
and two real agent-runtime fixes surfaced by running it live.

examples/repo_radar/:
- sources.py: pull PRs/issues (GitHub REST, both repos), a git contributor rollup, and the
  meeting-notes Google Doc into DataFrames (verified live: 1000 PRs, 306 issues, 27 authors,
  48 notes). plain ingestion, runnable standalone.
- repo_radar.py: 4 phases — build a portfolio of maps (spaces.create), per-map notebook delta
  analysis with checkpoints (week-over-week), provider-scoped agent synthesis, and a markdown
  briefing. phases degrade gracefully; defaults to direct-backend (base_url="").
- README documenting why it's impossible in the single-space UI.

agent fixes (found while smoke-testing against the live stack):
- websockets connect: support both extra_headers (<=13) and additional_headers (>=14).
- normalize the agent runtime's untyped assistant frames ({sender:"ai", message, partial});
  only the final (partial=false) frame becomes text. handle `heartbeat` like typing.
- ask() now uses an IDLE timeout (resets on each event/heartbeat) instead of a hard total —
  long claude_code/opencode runs are heartbeat-punctuated.
- agents.session(all_spaces=…, mode=…): send agent_initialization on connect; the ack is
  best-effort (delivered via channel-layer group broadcast that may not reach a headless
  socket), so we proceed after a grace window rather than failing.
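
The idle-timeout behavior described for ask() can be sketched roughly as below. This is a minimal illustration, not the SDK's code: `collect_until_complete`, the queue-based event source, and the event shapes are all hypothetical stand-ins for the real WebSocket loop.

```python
import asyncio


async def collect_until_complete(events: asyncio.Queue, idle_timeout: float) -> list:
    """Read events until 'complete'; fail only if no event arrives for idle_timeout seconds."""
    seen = []
    while True:
        # The timeout applies to each receive, so every event (heartbeats included)
        # effectively resets the clock. There is no cap on total run time.
        event = await asyncio.wait_for(events.get(), timeout=idle_timeout)
        seen.append(event["type"])
        if event["type"] == "complete":
            return seen


async def _demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()

    async def producer() -> None:
        # 8 heartbeats 0.05s apart: the whole run (~0.4s) outlasts idle_timeout (0.3s),
        # but no single gap between events does.
        for _ in range(8):
            await asyncio.sleep(0.05)
            await queue.put({"type": "heartbeat"})
        await queue.put({"type": "complete"})

    task = asyncio.create_task(producer())
    result = await collect_until_complete(queue, idle_timeout=0.3)
    await task
    return result


result = asyncio.run(_demo())
```

A hard total timeout of 0.3s would have aborted this run, whereas the per-receive timeout lets it finish as long as heartbeats keep arriving.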

Verified live: 3/4 maps created from real data; notebook cells ran in-kernel; a claude_code
agent streamed init→heartbeats→complete and the SDK surfaced its real error text
("Could not load credentials") into the brief. Remaining gaps are stack model credentials
(Bedrock/OpenAI), not SDK code. 55 unit tests pass, ruff clean.
… space

The agent's MCP tools (inspect/search/bags/points) require an X-Space-State-ID header to
know which space/map to act on. The composer only sets that header when ws/chat is given a
space_state_id — which the browser mints when you open a space, but a headless SDK session
never had. Result: the agent could talk but not inspect ("mantis://current ... failed").

Fix:
- new client.space_states resource: create()/list() over POST/GET /api/space-state/ — the
  same cookie-auth endpoint the frontend uses (verified live; the /api/v1/me/ API-key variant
  needs a separate key the SDK doesn't have, so cookie endpoint is the right choice).
- agents.session(space_id=...) now auto-mints a space-state (auto_space_state=True, default)
  and threads &space_state_id= onto the ws/chat connect; pass space_state_id= to reuse one, or
  auto_space_state=False to skip. all_spaces sessions don't mint (no single space to scope).

Verified live: opencode agent scoped to a space now inspects it and reports real content
("The 'Mantis Radar — issues' space centers on software repository normalization, ...") where
before it failed the tool call. 59 unit tests pass, ruff clean.

SDK:
- client.aliases: resolve()/get()/set() over /api/{getSpaceFromAlias,getAliasFromSpaceId,
  setSpaceAlias}/, plus resolve_or_create_space(alias) — the idempotency guardrail: reuse the
  space if the alias resolves, else mint a DETERMINISTIC uuid5 space id so concurrent first
  runs converge instead of racing.
- spaces.create() now accepts explicit space_id + map_id, so a map can be created INTO an
  existing space and refreshed in place on re-runs (backend get_or_creates the space and
  updates the map of that id — verified the serializer honors both).
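
The deterministic-id guardrail relies on uuid5 being a pure function of a namespace and a name. A minimal sketch, assuming a made-up namespace and key format (`RADAR_NAMESPACE` and `deterministic_id` are illustrative names, not the SDK's):

```python
import uuid

# Hypothetical namespace; any fixed UUID works as long as every run uses the same one.
RADAR_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/repo-radar")


def deterministic_id(alias: str, kind: str = "space") -> str:
    """Derive a stable id from the alias, so independent runs compute the same value."""
    return str(uuid.uuid5(RADAR_NAMESPACE, f"{kind}:{alias}"))


first = deterministic_id("m4m-radar")
second = deterministic_id("m4m-radar")               # e.g. a concurrent first run
map_id = deterministic_id("m4m-radar/prs", kind="map")
```

Because both calls yield the same space id, two first runs that race would target the same space rather than creating two, provided the backend create is itself a get-or-create on that id.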

Repo Radar reworked to the single-space pattern: maintain ONE aliased space (/space/m4m)
holding the 4 radar maps, each with a stable uuid5 map_id, aliased once on first run.

Verified live end-to-end:
- run 1: created space + alias m4m-radar, 3 maps upserted (prs 400'd on a data edge — handled).
- run 2: REUSED the same space (not new), identical map ids, no alias error.
- space holds 3 maps after 2 runs (not 6) — no duplication, no orphans.
Depends on the MantisAPI `alias-idempotency` branch for safe alias re-set (the guardrail
also avoids needing it). 65 unit tests pass, ruff clean.

A wrong MANTISAPI_PATH made github_authors raise CalledProcessError, which the build_maps
loop didn't catch (it only caught MantisError) → the whole run crashed instead of skipping
that one map. Now github_authors raises a clear ValueError when the path isn't a git repo,
and build_maps catches any per-source exception so one bad source never aborts the run.
…eate

Two issues surfaced running the demo:

1. PRs map 400'd ("Invalid request data"): the only semantic column was the PR body, and
   ~7/12 PRs have an empty body, so the backend had nothing to embed. Fall back to the PR
   title (always present, and the richest signal) when the body is empty; same defensive
   fallback for issues. Verified: 0/12 empty summaries now.

2. Agent synthesis 500'd on POST /api/space-state/: space-state is unique on
   (space, name, created_by), so re-running for the same space/user blindly re-POSTing the
   same name hit an integrity error. space_states.create() is now get-or-create — it lists
   and reuses an existing state of that name before POSTing, making it idempotent across runs.
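
The get-or-create shape from item 2 could look roughly like this. It is a sketch against an in-memory stand-in: `FakeSpaceStateStore` and `get_or_create_state` are hypothetical and only mimic the uniqueness constraint described above.

```python
class FakeSpaceStateStore:
    """In-memory stand-in for the backend; rejects duplicate (space, name) pairs."""

    def __init__(self):
        self.rows = []
        self.posts = 0

    def list(self, space_id):
        return [r for r in self.rows if r["space"] == space_id]

    def post(self, space_id, name):
        self.posts += 1
        if any(r["space"] == space_id and r["name"] == name for r in self.rows):
            raise RuntimeError("integrity error: duplicate space-state")
        row = {"id": len(self.rows) + 1, "space": space_id, "name": name}
        self.rows.append(row)
        return row


def get_or_create_state(store, space_id, name):
    """Reuse an existing state of that name if one exists; otherwise create it."""
    for row in store.list(space_id):
        if row["name"] == name:
            return row
    return store.post(space_id, name)


store = FakeSpaceStateStore()
a = get_or_create_state(store, "space-1", "radar")
b = get_or_create_state(store, "space-1", "radar")  # re-run: no second POST
```

Note that list-then-post is idempotent across sequential runs but is not race-free: two concurrent callers could both miss in the list and both POST.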

66 unit tests pass, ruff clean.
…opped

The agent run produced its full answer but then dangled for the entire idle timeout: the
backend sends the committed final text and *then* a chat_complete envelope over a kafka→socket
bridge that sometimes drops the envelope, leaving ask() blocked on recv(). Now, once we see the
committed final ai frame (sender=ai, partial=false), we wait only a short grace (8s) for a
terminal event before finishing cleanly — so the run ends right after the answer instead of
hanging ~90s. final_grace is instance-overridable for tests.
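
One way to express the final-grace logic is sketched below. The frame shapes follow the description above ({sender, message, partial} and a chat_complete envelope), but the function, its queue-based input, and the `type` key on the envelope are assumptions for illustration only.

```python
import asyncio


async def ask(frames: asyncio.Queue, idle_timeout: float, final_grace: float) -> str:
    """Return the committed answer; after it arrives, wait at most final_grace for a terminal event."""
    loop = asyncio.get_running_loop()
    final_text = None
    grace_deadline = None
    while True:
        if grace_deadline is None:
            timeout = idle_timeout
        else:
            timeout = max(0.0, grace_deadline - loop.time())
        try:
            frame = await asyncio.wait_for(frames.get(), timeout=timeout)
        except asyncio.TimeoutError:
            if final_text is not None:
                return final_text  # answer committed, terminal envelope never arrived
            raise                  # genuine idle timeout before any final answer
        if frame.get("type") == "chat_complete":
            return final_text or ""
        if frame.get("sender") == "ai" and frame.get("partial") is False:
            final_text = frame["message"]
            grace_deadline = loop.time() + final_grace


async def _demo() -> str:
    queue: asyncio.Queue = asyncio.Queue()
    queue.put_nowait({"sender": "ai", "message": "draft", "partial": True})
    queue.put_nowait({"sender": "ai", "message": "final answer", "partial": False})
    # No chat_complete is ever queued, simulating the dropped envelope.
    return await ask(queue, idle_timeout=5.0, final_grace=0.05)


answer = asyncio.run(_demo())
```

Here the call returns after roughly the grace window instead of blocking for the full idle timeout.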
…hors, per-speaker notes

spaces.create now sends map_name (defaults to space_name) — without it the backend names
every map "Untitled Map". repo_radar passes a friendly title per map.

radar sources now match the demo's intent:
  - github_prs / github_issues: ALL active (open) items across both repos, each carrying
    created_at + updated_at date facets (was "all" states, single date).
  - github_authors: everyone who has EVER committed to either repo, via the GitHub API
    (/contributors for the complete roster + a bounded /commits pass for subjects and latest
    date) — drops the local-clone dependency and the MANTISAPI_PATH tilde footgun entirely.
  - meeting_notes: one point PER SPEAKER SEGMENT (829 across 35 meetings) with speakers,
    timestamp, meeting, and date facets, instead of one blob per meeting.
…content)

Walks both repos via the Git Trees API, fetches the first 1500 chars of
each source file, and creates a new "code" map in the space so the
codebase is navigable alongside PRs/issues/authors/notes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…atch all errors

- Code source: focus on .py/.ts/.tsx/.js only, skip tests, sort by size,
  cap at 800 files to keep embedding under 10min
- Move code map last so smaller maps don't queue behind it
- Increase stall_timeout to 1800s and cell execution to 300s
- Phase 3: catch any exception (not just MantisError) so a WebSocket
  failure doesn't crash the whole script

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

In a multi-map space, the kernel's `maps` list contains all maps. The
delta code was always reading maps[0] regardless of which map the
notebook was created for, so every map reported the same point count.

Now templates the target map_id into the cell code and looks it up by id.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Python's UUID.__eq__ returns False when compared to a plain string.
The kernel stores map_id as a UUID object, so str(m.map_id) is needed.
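
The comparison pitfall is easy to reproduce with the standard library alone. In this sketch the `maps` list of dicts is a hypothetical stand-in for the kernel's map objects:

```python
import uuid

target = "12345678-1234-5678-1234-567812345678"   # map id templated into the cell as text
maps = [
    {"map_id": uuid.UUID("aaaaaaaa-aaaa-4aaa-8aaa-aaaaaaaaaaaa"), "points": 10},
    {"map_id": uuid.UUID(target), "points": 42},
]

# A UUID never compares equal to a str, so this lookup silently finds nothing.
direct = [m for m in maps if m["map_id"] == target]

# Converting to str first makes the lookup match the intended map.
by_text = [m for m in maps if str(m["map_id"]) == target]
```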

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- New client.featured_chat resource: .set(), .get(), .clear(), .clone()
- repo_radar phase 3 now pins the synthesis conversation so visitors
  see the briefing by default when they open the space
…pinning

The SDK was using its local chat_id (sdk-{uuid}) which is just the WS
routing key. Agno generates its own session_id server-side and sends
it back in message frames as 'chat_id'. Now captured as server_chat_id.
…session

The backend only persists to Agno when it generates the session_id
itself (chat_id='new' triggers uuid generation). Passing 'sdk-{uuid}'
meant the chat was never stored, so featured chat pinning/cloning
couldn't find it later.

Agno persists sessions asynchronously at the end of arun(). If the SDK
closes the WebSocket immediately after getting the final text, the
disconnect cancels the persistence. A 2s delay gives Agno time to
flush.

Agno doesn't reliably persist WS sessions. Now the SDK sends the
conversation messages along with the pin request so the backend can
create the chat row directly if needed.
…rove prompt

- Increased commit scan from 50 to 500 (5 pages) so most active files
  get author/date metadata instead of 'unknown'
- Skip code map in notebook delta (rebuilds last, kernel sees 0 points)
- Restructured synthesis prompt for a scannable team digest format

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces the 'Repo Radar' example tool, which automates the creation of a portfolio of maps, performs notebook delta analysis, and uses an agent to synthesize a weekly briefing. To support this, the Mantis SDK is extended with new resources for space states, aliases, and featured chats, alongside enhancements to the agent session handling. Feedback from the code review highlights several critical improvements, including handling missing environment variables gracefully to prevent crashes, optimizing sequential HTTP requests to avoid rate limits, removing hardcoded sleep delays, robustly checking websocket signatures instead of catching generic TypeErrors, and ensuring the portability of generated markdown briefings and charts by using relative paths.

    the agent is scoped to the one m4m space (it auto-mints a space-state so its MCP tools can
    inspect the maps). all_spaces mode would reason across every accessible space instead."""
    print(f"\n=== PHASE 3: agent synthesis (provider={provider}, all_spaces={USE_ALL_SPACES}) ===")
    email = os.environ["MANTIS_USER_EMAIL"]

high

Using os.environ["MANTIS_USER_EMAIL"] directly outside of the try block will raise a KeyError and crash the entire script if the environment variable is not set. Since this workflow is designed to degrade gracefully when components or credentials are missing, it is safer to use os.getenv and handle the missing email gracefully.

    email = os.getenv("MANTIS_USER_EMAIL")
    if not email:
        return "_(agent synthesis unavailable: MANTIS_USER_EMAIL is not set)_", None, None

Comment on lines +236 to +248
    for i, commit in enumerate(commits):
        sha = commit.get("sha")
        login = (commit.get("author") or {}).get("login") or "unknown"
        date = (commit.get("commit", {}).get("author", {}).get("date") or "")[:10]
        if not sha:
            continue
        try:
            detail = requests.get(
                f"{_API}/repos/{repo}/commits/{sha}",
                headers=_gh_headers(), timeout=15,
            ).json()
        except Exception:
            continue

high

This loop performs a synchronous HTTP request for every single commit returned by _paginate (up to 500 commits per repository, across multiple repositories). This can result in up to 1,500 sequential HTTP requests, which will be extremely slow (taking several minutes) and will likely trigger GitHub's rate limits or abuse detection. Consider parallelizing these requests using a ThreadPoolExecutor, or reducing the default max_pages to a smaller number (e.g., 1 or 2), or using the GitHub GraphQL API to fetch commit file details in bulk.
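
As a rough illustration of the ThreadPoolExecutor option, the sketch below fans out per-commit fetches over a bounded pool. `fetch_all` and `fake_fetch` are hypothetical; in the real script the callable would wrap the per-commit `requests.get` call.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(shas, fetch_one, max_workers=8):
    """Fetch details for many commits concurrently, skipping any that fail."""

    def safe_fetch(sha):
        try:
            return fetch_one(sha)
        except Exception:
            return None  # mirrors the original loop's `continue` on errors

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with shas
        results = list(pool.map(safe_fetch, shas))
    return {sha: r for sha, r in zip(shas, results) if r is not None}


def fake_fetch(sha):
    # Stand-in for the network call; "bad" simulates a failed request.
    if sha == "bad":
        raise RuntimeError("simulated network failure")
    return {"sha": sha, "files": [f"{sha}.py"]}


details = fetch_all(["a1", "bad", "c3"], fake_fetch)
```

Concurrency shortens wall-clock time but does not reduce the number of requests, so rate limits still apply; a small `max_workers` or a lower page count may be needed alongside it.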

Comment thread mantis_sdk/agents.py
Comment on lines +201 to +203
    import asyncio
    # allow Agno time to persist the session before the WS disconnects
    await asyncio.sleep(2)

medium

Hardcoding a 2-second sleep in the close() method of AgentSession introduces a significant and blocking delay every time a session is closed. This is particularly problematic in automated workflows, tests, or loops where sessions are frequently opened and closed. If the backend requires time to persist the session, this synchronization should ideally be handled via a proper protocol message/acknowledgment from the server, or the sleep duration should be configurable rather than hardcoded.

Comment thread mantis_sdk/agents.py
Comment on lines +151 to +160
    # the header kwarg was renamed extra_headers → additional_headers in websockets 14.
    # try the new name, fall back to the old so we work across the pinned range (>=10.4).
    try:
        self._ws = await websockets.connect(
            self._ws_url(), additional_headers=headers, max_size=None, open_timeout=self.timeout,
        )
    except TypeError:
        self._ws = await websockets.connect(
            self._ws_url(), extra_headers=headers, max_size=None, open_timeout=self.timeout,
        )

medium

Catching a generic TypeError on the websockets.connect call to handle library version differences can be risky. If websockets.connect raises a TypeError due to any other argument mismatch or internal bug, it will silently fall back to the except block and attempt a second connection, which might fail with a confusing error or mask the real issue. A more robust approach is to inspect the signature of websockets.connect or check the websockets version beforehand to determine the correct keyword argument.

Suggested change
-    # the header kwarg was renamed extra_headers → additional_headers in websockets 14.
-    # try the new name, fall back to the old so we work across the pinned range (>=10.4).
-    try:
-        self._ws = await websockets.connect(
-            self._ws_url(), additional_headers=headers, max_size=None, open_timeout=self.timeout,
-        )
-    except TypeError:
-        self._ws = await websockets.connect(
-            self._ws_url(), extra_headers=headers, max_size=None, open_timeout=self.timeout,
-        )
+    import inspect
+    import websockets
+    connect_kwargs = {
+        "max_size": None,
+        "open_timeout": self.timeout,
+    }
+    sig = inspect.signature(websockets.connect)
+    if "additional_headers" in sig.parameters:
+        connect_kwargs["additional_headers"] = headers
+    else:
+        connect_kwargs["extra_headers"] = headers
+    self._ws = await websockets.connect(self._ws_url(), **connect_kwargs)

    'N. Title - Speakers (timestamp): discussion…'. we emit one row per segment so the map
    captures who said what, when — the body is semantic; speakers/meeting/date are facets."""
    url = f"https://docs.google.com/document/d/{gdoc_id}/export?format=txt"
    text = requests.get(url, timeout=30).text

medium

Accessing .text directly on the response of requests.get without calling raise_for_status() can lead to silent failures. If the Google Doc is private, the ID is invalid, or the request fails, the server might return an HTML error page. The regex parsing will then silently fail to find any meetings, returning an empty DataFrame instead of raising an error. Calling raise_for_status() ensures that any HTTP errors are caught immediately.

Suggested change
-    text = requests.get(url, timeout=30).text
+    resp = requests.get(url, timeout=30)
+    resp.raise_for_status()
+    text = resp.text

Comment on lines +199 to +200
    chart_png = "/tmp/repo_radar_contributors.png"
    Path(chart_png).write_bytes(png)

medium

Writing the chart image to a hardcoded /tmp/repo_radar_contributors.png path makes the generated markdown briefing less portable. If REPO_RADAR_BRIEF is configured to write to a different directory, the markdown will still reference the absolute /tmp path, which won't render if the files are moved or viewed on another machine. Writing the chart to the same directory as the briefing file and using a relative path in the markdown makes the report fully self-contained and portable.

Suggested change
-    chart_png = "/tmp/repo_radar_contributors.png"
-    Path(chart_png).write_bytes(png)
+    brief_dir = Path(os.getenv("REPO_RADAR_BRIEF", "/tmp/repo_radar_brief.md")).parent
+    chart_png = str(brief_dir / "repo_radar_contributors.png")
+    Path(chart_png).write_bytes(png)

    lines += ["", "## Synthesis", synthesis or "_n/a_"]
    chart = metrics.get("_chart")
    if chart:
        lines += ["", f"![contributors]({chart})"]

medium

To ensure the markdown briefing is portable and can render the chart image correctly when moved or shared, use the relative filename of the chart instead of its absolute path.

Suggested change
-    lines += ["", f"![contributors]({chart})"]
+    lines += ["", f"![contributors]({Path(chart).name})"]

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7cf9927f77

    for name, map_id in notebook_maps.items():
        try:
            nb = client.notebooks.from_map(map_id, name=f"radar-{name}",
                                           user_id=os.getenv("MANTIS_USER_EMAIL"))

P2: Do not pass email as notebook user_id

When running the documented cookie + MANTIS_USER_EMAIL setup, this passes the email address into NotebooksResource.create as user_id; that API sends user_id to /api/notebook/create and documents/defaults it as config.internal_user_id (mantis_sdk/notebook.py:205-214). On deployments expecting the backend user UUID, each notebook create/session call fails validation, so phase 2 produces no deltas or chart even though map creation succeeded. Use the configured internal user id (or require MANTIS_INTERNAL_USER_ID) instead of the agent email.

Comment thread mantis_sdk/resources.py
Comment on lines +443 to +444
    except MantisError:
        return None  # backend returns 400 when not found; treat as "no such alias"

P2: Do not treat all alias errors as misses

If /api/getSpaceFromAlias returns auth, permission, server, or connection errors (for example an expired cookie or a transient backend failure), resolve() converts them to None, so resolve_or_create_space() treats the alias as absent and proceeds with a deterministic new space id. That hides the real failure and can create or update the wrong radar space once the caller continues. Only swallow the specific not-found/400 response and propagate other MantisErrors.
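
A narrower handler along these lines might look like the sketch below. The `MantisError` class here is a local stand-in, and its `status_code` attribute is an assumption; the real SDK exception may expose the HTTP status differently.

```python
class MantisError(Exception):
    """Stand-in for the SDK error type; the status_code attribute is assumed."""

    def __init__(self, message, status_code=None):
        super().__init__(message)
        self.status_code = status_code


def resolve(alias, lookup):
    """Return the space for an alias, or None only when the backend reports not-found."""
    try:
        return lookup(alias)
    except MantisError as exc:
        if exc.status_code == 400:   # the backend's "no such alias" response
            return None
        raise                        # auth, server, and connection errors propagate


def not_found(alias):
    raise MantisError("alias not found", status_code=400)


def expired_cookie(alias):
    raise MantisError("unauthorized", status_code=401)


missing = resolve("m4m-radar", not_found)
try:
    resolve("m4m-radar", expired_cookie)
    propagated = False
except MantisError:
    propagated = True
```

With this shape, an expired cookie stops the run instead of letting resolve_or_create_space() proceed as if the alias were absent.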

@LucaVor LucaVor merged commit 411dc57 into main Jun 20, 2026
4 checks passed