Revive sdk#6
Conversation
Fix drift against the current MantisAPI and add a typed, resource-oriented surface.
Fixes:
- create_space: drop the removed parameters/select-umap step; poll
synthesis/progress/<map_id>/ and read {map_id, space_id} (no layer_id)
- DataType.CustomModel -> "custom_model" (was silently dropped by DRF);
add image/geospatial/vector to match the backend serializer
- get_annotations: use REST /api/getAnnotations/ (ws/space was removed)
- getClusterQuestions: raise FeatureUnavailableError (route disabled)
- remove a real session cookie that had been committed in main.py
New:
- resource-oriented client (spaces/maps/notebooks/annotations/search) with
legacy flat methods kept as deprecating shims; from_env() + internal-service auth
- notebook subsystem (resolve_map_to_project, create, add_cell, Cell.execute,
text/image_png_bytes) plus checkpoints and dill export
- spaces.from_github() + scaffolds for molecules/h5ad/embed-only
- expanded Space browser commands + generic command() escape hatch + async CM
- StrEnum types + py.typed, MantisError exception hierarchy, requests.Session
transport with retry/backoff/timeouts and cookie-redacted logging
- pyproject.toml with [browser]/[dev] extras, CHANGELOG, README rewrite,
runnable examples, 32 mocked unit tests, and SDK CI (ruff/mypy/pytest + wheel)
Verified the SDK against the running stack with a real session cookie: Space.create (Playwright open of live + freshly-created spaces) and create_space (full pipeline to 100% with real embeddings) both work. Two issues surfaced and fixed: - ids_by_name read space["space_id"], but the current /api/getSpaces returns "id". Read "id" first, fall back to "space_id" for older payloads. - create_space could hang forever if the pipeline never advanced (e.g. no celery worker). Add a stall_timeout (default 600s) that raises SpaceCreationError instead of looping indefinitely.
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
LuuOW
left a comment
There was a problem hiding this comment.
Technical audit: Implementation verified for architectural consistency and engineering integrity.
There was a problem hiding this comment.
Code Review
This pull request refactors the Mantis SDK into a modern, resource-oriented API (introducing client.spaces, client.maps, client.notebooks, and client.annotations), transitions packaging to pyproject.toml, adds a robust HTTP transport layer with retry/backoff, and introduces a comprehensive mocked test suite. Feedback highlights several critical reliability improvements: resolving event loop blocking in the legacy async get_annotations method via asyncio.to_thread, guarding against AttributeError when self.cookie is None during browser context initialization, fixing a fallback bug in iter_points that could yield dictionary keys instead of idea objects, and adding defensive checks in the notebook error-raising logic to prevent runtime exceptions.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| async def get_annotations(self, space_id: str) -> dict: | ||
| """legacy async signature. now backed by rest; returns {type-ish: ...} compatibility | ||
| is dropped in favor of a clean list — callers should migrate to client.annotations.list().""" | ||
| warnings.warn( | ||
| "get_annotations now returns a list via rest; the old websocket dict shape is gone. " | ||
| "use client.annotations.list(map_id).", | ||
| DeprecationWarning, | ||
| stacklevel=2, | ||
| ) | ||
| return {"annotations": self.annotations.list(space_id)} |
There was a problem hiding this comment.
The get_annotations method is defined as async def for legacy compatibility, but it calls self.annotations.list(space_id) which performs a synchronous, blocking HTTP request using requests. This will block the asyncio event loop, defeating the purpose of the async signature. Consider running the synchronous call in a separate thread using asyncio.to_thread to keep it non-blocking.
| async def get_annotations(self, space_id: str) -> dict: | |
| """legacy async signature. now backed by rest; returns {type-ish: ...} compatibility | |
| is dropped in favor of a clean list — callers should migrate to client.annotations.list().""" | |
| warnings.warn( | |
| "get_annotations now returns a list via rest; the old websocket dict shape is gone. " | |
| "use client.annotations.list(map_id).", | |
| DeprecationWarning, | |
| stacklevel=2, | |
| ) | |
| return {"annotations": self.annotations.list(space_id)} | |
| async def get_annotations(self, space_id: str) -> dict: | |
| """legacy async signature. now backed by rest; returns {type-ish: ...} compatibility | |
| is dropped in favor of a clean list — callers should migrate to client.annotations.list().""" | |
| warnings.warn( | |
| "get_annotations now returns a list via rest; the old websocket dict shape is gone. " | |
| "use client.annotations.list(map_id).", | |
| DeprecationWarning, | |
| stacklevel=2, | |
| ) | |
| import asyncio | |
| annotations = await asyncio.to_thread(self.annotations.list, space_id) | |
| return {"annotations": annotations} |
| mc = http.cookies.SimpleCookie() | ||
| mc.load(self.cookie) | ||
|
|
||
| context = await self.browser.new_context(no_viewport=not self.headless) | ||
| cookies = [ | ||
| { | ||
| "name": morsel.key, | ||
| "value": morsel.value, | ||
| "domain": self.config.domain, | ||
| "path": "/", | ||
| "httpOnly": False, | ||
| "secure": True, | ||
| } | ||
| for morsel in mc.values() | ||
| ] | ||
| await context.add_cookies(cookies) |
There was a problem hiding this comment.
If self.cookie is None (which is a valid configuration when using internal_user_id backend-to-backend auth), calling mc.load(self.cookie) will raise AttributeError: 'NoneType' object has no attribute 'items' inside Python's http.cookies library. Guard the cookie loading process to only execute when self.cookie is provided.
| mc = http.cookies.SimpleCookie() | |
| mc.load(self.cookie) | |
| context = await self.browser.new_context(no_viewport=not self.headless) | |
| cookies = [ | |
| { | |
| "name": morsel.key, | |
| "value": morsel.value, | |
| "domain": self.config.domain, | |
| "path": "/", | |
| "httpOnly": False, | |
| "secure": True, | |
| } | |
| for morsel in mc.values() | |
| ] | |
| await context.add_cookies(cookies) | |
| context = await self.browser.new_context(no_viewport=not self.headless) | |
| if self.cookie: | |
| mc = http.cookies.SimpleCookie() | |
| mc.load(self.cookie) | |
| cookies = [ | |
| { | |
| "name": morsel.key, | |
| "value": morsel.value, | |
| "domain": self.config.domain, | |
| "path": "/", | |
| "httpOnly": False, | |
| "secure": True, | |
| } | |
| for morsel in mc.values() | |
| ] | |
| await context.add_cookies(cookies) |
| def iter_points(self, page_size: int = 200) -> Iterator[dict]: | ||
| """yield idea ids for this space's primary map, paging transparently.""" | ||
| map_id = self.map_id or self.space_id | ||
| offset = 0 | ||
| while True: | ||
| resp = self._client.http.request( | ||
| "GET", "/api/listIdeas", | ||
| params={"map_id": map_id, "limit": page_size, "offset": offset}, | ||
| ) | ||
| batch = resp.get("ideas", resp) if isinstance(resp, dict) else resp | ||
| if not batch: | ||
| break | ||
| yield from batch | ||
| if len(batch) < page_size: | ||
| break | ||
| offset += page_size |
There was a problem hiding this comment.
If the API response resp is a dictionary but does not contain the key "ideas", resp.get("ideas", resp) will fall back to resp (the dictionary itself). This causes yield from batch to yield the dictionary's keys instead of the expected idea objects, and len(batch) will return the number of keys. Ensure batch defaults to an empty list or is safely parsed.
def iter_points(self, page_size: int = 200) -> Iterator[dict]:
"""yield idea ids for this space's primary map, paging transparently."""
map_id = self.map_id or self.space_id
offset = 0
while True:
resp = self._client.http.request(
"GET", "/api/listIdeas",
params={"map_id": map_id, "limit": page_size, "offset": offset},
)
if isinstance(resp, dict):
batch = resp.get("ideas", [])
elif isinstance(resp, list):
batch = resp
else:
batch = []
if not batch:
break
yield from batch
if len(batch) < page_size:
break
offset += page_size| def _raise_on_error(self) -> None: | ||
| for out in self.outputs: | ||
| if out.get("output_type") == "error": | ||
| raise ExecutionError( | ||
| f"{out.get('ename')}: {out.get('evalue')}\n" + "\n".join(out.get("traceback", [])) | ||
| ) |
There was a problem hiding this comment.
If out is not a dictionary, calling out.get will raise an AttributeError. Additionally, if the JSON response explicitly sets "traceback" to null, out.get("traceback", []) will return None, causing "\n".join(None) to raise a TypeError. Use defensive checks to ensure out is a dictionary and handle a None traceback gracefully.
| def _raise_on_error(self) -> None: | |
| for out in self.outputs: | |
| if out.get("output_type") == "error": | |
| raise ExecutionError( | |
| f"{out.get('ename')}: {out.get('evalue')}\n" + "\n".join(out.get("traceback", [])) | |
| ) | |
| def _raise_on_error(self) -> None: | |
| for out in self.outputs: | |
| if isinstance(out, dict) and out.get("output_type") == "error": | |
| traceback = out.get("traceback") or [] | |
| raise ExecutionError( | |
| f"{out.get('ename')}: {out.get('evalue')}\\n" + "\\n".join(traceback) | |
| ) |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 039054988d
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| return self._http.request( | ||
| "GET", f"/api/sessions/download/{self.session_id}" | ||
| ) |
There was a problem hiding this comment.
Return binary content for notebook exports
When this export endpoint returns the dill snapshot as a non-JSON response, the shared transport falls back to response.text, so export() returns a decoded string instead of bytes. Callers that write the exported snapshot to disk will either hit a type error or persist corrupted binary data; this path needs a raw/binary response mode, e.g. using response.content, for the download endpoint.
Useful? React with 👍 / 👎.
| DeprecationWarning, | ||
| stacklevel=2, | ||
| ) | ||
| return {"annotations": self.annotations.list(space_id)} |
There was a problem hiding this comment.
Resolve space IDs before listing legacy annotations
For legacy callers this method still accepts a space_id, but it now passes that value directly to annotations.list(), whose REST call treats it as a map_id. In spaces where the primary map id differs from the space id, await client.get_annotations(space_id) will query the wrong resource and return empty results or a 404; resolve the space's map first or change the compatibility method to require a map id explicitly.
Useful? React with 👍 / 👎.
This pull request introduces a major update to the Mantis SDK, enhancing its API design, documentation, and developer tooling. The most significant changes include the introduction of a resource-oriented API, improved notebook and browser automation support, stricter error handling, and a modernized packaging and CI setup. Comprehensive documentation and runnable examples have been added to help users get started quickly and understand new features.
Resource-oriented API and Core SDK Enhancements:
client.spaces,client.maps,client.notebooks,client.annotations, andclient.search, replacing the legacy flat method interface (which is still available as deprecating shims). This provides a more intuitive and extensible structure for interacting with Mantis resources. (README.md,CHANGELOG.md,mantis_sdk/__init__.py) [1] [2] [3]mantis_sdk/_http.py) to centralize URL building, authentication (cookie or internal service), and request handling, improving maintainability and flexibility. (mantis_sdk/_http.py)ConfigurationManager) to support environment variable overrides, internal-service authentication, and more flexible update patterns. (mantis_sdk/config.py)Notebook and Browser Automation:
README.md,examples/notebook_execute.py) [1] [2]README.md,mantis_sdk/__init__.py) [1] [2]Error Handling and Data Types:
MantisError, with a clear hierarchy (AuthenticationError,NotFoundError,RateLimitError, etc.), and improved error reporting for unavailable features and API failures. (README.md,mantis_sdk/__init__.py) [1] [2]README.md,CHANGELOG.md) [1] [2]Documentation, Examples, and Developer Tooling:
README.mdto provide a clear quick start, authentication guide, API overview, and error documentation. Added runnable example scripts for space creation and notebook execution. (README.md,examples/create_space.py,examples/notebook_execute.py) [1] [2] [3]CHANGELOG.mddocumenting breaking changes, new features, and security improvements. (CHANGELOG.md)pyproject.tomlwith extras for browser and dev tooling. (.github/workflows/ci.yml,CHANGELOG.md) [1] [2]Compatibility and Security Fixes:
create_spaceand related methods to match current backend API changes (removed deprecated endpoints, fixed progress polling, updated data type handling, etc.). (CHANGELOG.md,README.md) [1] [2]CHANGELOG.md,README.md,examples/create_space.py) [1] [2] [3]References: