500 errors when restoring previous session in SageMaker deployment #1767

@kmcginnes

Description

When Graph Explorer is deployed on a SageMaker notebook instance via Docker, a user encountered multiple 500 errors when clicking "Restore previous session." A subsequent manual connection re-sync resolved the issue. The exact trigger is unknown, but the instance had not been used for several days.

The browser console showed:

  • Multiple Response status 500 received errors from openCypher queries
  • Query failed to execute: Array(3) NetworkError: Operation terminated (internal error) with a stack trace through queryFn
  • CORS errors on manifest.json: Access-Control-Allow-Origin header missing
  • CSS file rejected due to MIME type mismatch (text/html returned instead of CSS), suggesting the proxy returned an error page instead of the static asset

The CSS MIME type error and CORS failures suggest the issue is in the proxy layer (Jupyter proxy / SageMaker auth), not in Neptune or the Graph Explorer server directly. The proxy appears to have been in a degraded state, returning HTML error pages for all requests including static assets.
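The MIME type mismatch described above could in principle be detected on the client. The following is a minimal sketch, not code from Graph Explorer: `mimeOf` and `looksLikeProxyErrorPage` are hypothetical helpers, and the assumption is that the proxy's error page is served as `text/html`, as the console output suggests.

```typescript
// Hypothetical helpers for illustration; not part of Graph Explorer.

/** Extracts the bare MIME type from a Content-Type header value. */
function mimeOf(contentType: string | null): string {
  // Drop parameters such as "; charset=utf-8" and normalize case.
  const [mime = ""] = (contentType ?? "").split(";");
  return mime.trim().toLowerCase();
}

/**
 * Returns true when a response that should carry `expectedMime`
 * came back as HTML instead, which is what a proxy error page
 * served in place of a static asset or API response looks like.
 */
function looksLikeProxyErrorPage(
  expectedMime: string,
  contentType: string | null,
): boolean {
  const actual = mimeOf(contentType);
  return actual === "text/html" && expectedMime.toLowerCase() !== "text/html";
}
```

With the Fetch API, a caller would pass `response.headers.get("content-type")` as the second argument. A missing header is treated as "not an error page", because it says nothing about the proxy either way.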

Environment

  • OS: Amazon Linux 2 (SageMaker notebook instance)
  • Browser: Unknown (Chromium-based, based on console output)
  • Graph Explorer Version: Latest (Docker deployment with auto-restart)
  • Graph Database & Version: Amazon Neptune (openCypher)
  • Deployment: Docker on SageMaker notebook instance, accessed via Jupyter proxy

Steps to Reproduce

Not reliably reproducible. The issue was observed after the instance had been idle for several days. Possible reproduction strategies:

  • docker stop <container> && docker start <container>, then immediately access via the Jupyter proxy URL
  • Leave a browser tab open for an extended period without use, then interact with Graph Explorer

Expected Behavior

Graph Explorer should either:

  • Successfully restore the previous session, or
  • Display a clear, actionable error message indicating the server is not ready, with a retry option
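One way to picture the second option is a small mapping from the HTTP status to a user-facing message. This is only a sketch under assumptions: `RestoreFailure` and `describeRestoreFailure` are hypothetical names, the message wording is illustrative, and treating every 5xx status as retryable is a simplification rather than a confirmed design.

```typescript
// Hypothetical shape for a user-facing failure; not taken from Graph Explorer.
interface RestoreFailure {
  message: string;
  canRetry: boolean;
}

/** Maps an HTTP status from a failed restore request to a message and a retry hint. */
function describeRestoreFailure(status: number): RestoreFailure {
  // Assumption: 5xx responses may be transient (proxy or server not ready),
  // so the UI offers a retry instead of a dead end.
  if (status >= 500 && status <= 599) {
    return {
      message:
        "The server is not ready yet. Wait a moment, then retry restoring the session.",
      canRetry: true,
    };
  }
  // Anything else is reported as-is without a retry option.
  return {
    message: `Restoring the session failed with status ${status}.`,
    canRetry: false,
  };
}
```

The UI would show `message` and render a retry button only when `canRetry` is true.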

Root Cause Analysis

Unknown. Possible contributing factors:

  1. Stale browser assets: The browser tab may have been open with cached JS/CSS bundles. If the container restarted onto a different build at any point (the deployment tracks the latest image with auto-restart), the Vite-hashed filenames would differ, causing the proxy to return error pages for the missing assets.
  2. Proxy state: The SageMaker Jupyter proxy may have been in a bad state due to a container restart, an idle timeout, or another infrastructure event.
  3. No retry/backoff: The "Restore session" flow sends queries immediately without checking whether the server is healthy, so transient proxy failures surface as unrecoverable errors.
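The third factor could be addressed by polling a health check with exponential backoff before sending the restore queries. The sketch below is an illustration, not the project's implementation: `backoffDelays` and `waitUntilHealthy` are hypothetical names, and the `check` callback stands in for whatever health probe the deployment exposes.

```typescript
// Hypothetical helpers for illustration; not part of Graph Explorer.

/** Builds exponential delays: baseMs, 2*baseMs, 4*baseMs, ... capped at maxMs. */
function backoffDelays(attempts: number, baseMs: number, maxMs: number): number[] {
  return Array.from({ length: attempts }, (_, i) => Math.min(baseMs * 2 ** i, maxMs));
}

/**
 * Calls `check` until it reports healthy, sleeping for each delay in turn
 * after a failed attempt. Makes at most delays.length + 1 attempts.
 * A check that throws or rejects counts as unhealthy.
 */
async function waitUntilHealthy(
  check: () => Promise<boolean>,
  delays: number[],
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<boolean> {
  for (const delay of delays) {
    if (await check().catch(() => false)) return true;
    await sleep(delay);
  }
  // Final attempt after the last delay has elapsed.
  return check().catch(() => false);
}
```

A restore flow would run its queries only after `waitUntilHealthy` resolves to `true`, and otherwise show a "server not ready" message with a retry option. The `sleep` parameter is injectable so the timing can be stubbed in tests.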

Workaround

Click the connection re-sync button or refresh the browser tab. The issue self-resolves after one successful request.


Important

If you are interested in working on this issue, please leave a comment.

Tip

Please use a 👍 reaction to provide a +1/vote. This helps the community and maintainers prioritize this request.

Labels: fundamental, reliability (issues relating to improvements in reliability)
