Skip to content

Added keep alive for ssh & channel timeout for sftp#95

Merged
ltekieli merged 1 commit intoeclipse-score:mainfrom
draganbjedov:main
Apr 28, 2026
Merged

Added keep alive for ssh & channel timeout for sftp#95
ltekieli merged 1 commit intoeclipse-score:mainfrom
draganbjedov:main

Conversation

@draganbjedov
Copy link
Copy Markdown
Contributor

@draganbjedov draganbjedov commented Apr 27, 2026

  • Added possibility to set keep alive interval for SSH connection. Default is None. If it is provided, after interval seconds without sending any data over the connection, a "keepalive" packet will be sent (ignored by remote host). This can be useful to keep connections alive over a NAT, for an example.
  • Added possibility to set SFTP channel timeout. Default is None. If it provided, subsequent channel read/write operations will raise a timeout exception if the timeout period value has elapsed before the operation has completed.

Copy link
Copy Markdown
Member

@ltekieli ltekieli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What looks good

  • The keep-alive setup logic is correct — checking for transport before calling set_keepalive.
  • The SFTP channel timeout implementation is clean and minimal.
  • No changes to existing behavior when the new parameters are not used (defaults are None).

Issues to address

1. Type annotations say int but default is None

  • ssh.py:31keep_alive_interval: int = None
  • sftp.py:31channel_timeout: int = None

These should be Optional[int] (or int | None for 3.10+). The docstrings also say :param int which is misleading.

2. channel_timeout is a public attribute (self.channel_timeout)
All other instance attributes use the _ prefix convention (self._ssh, self._sftp, self._new_ssh). Should be self._channel_timeout for consistency.

3. __exit__ logging is unconditional — runs even without keep-alive
The transport-state logging in __exit__ runs on every Ssh close, not just when keep-alive is enabled. This adds noise for all existing users of Ssh who don't use the new feature. Consider guarding it with if self._keep_alive_interval is not None:, or at minimum use logger.debug instead of logger.info/logger.warning/logger.error.

4. __exit__ logs an error for any exception, not just connection-loss
If the with block raises a ValueError or KeyError (application bugs), this will log "SSH connection lost unexpectedly: ValueError(...)", which is misleading. The SSH connection may be perfectly healthy; the exception just propagated through the context manager. Consider only logging for SSH/socket-related exceptions, or downgrading to debug.

5. SO_KEEPALIVE without tuning TCP keepalive parameters
Setting SO_KEEPALIVE without also setting TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT relies on OS defaults (typically 2 hours idle before the first probe on Linux). The paramiko set_keepalive() already sends application-level keep-alives at the requested interval, so the SO_KEEPALIVE setsockopt adds little value and could be confusing. Consider either removing it or also setting the TCP-level parameters to match the interval.

6. Redundant "established" log message when keep-alive is enabled
Line 102 already logs "SSH connection to {ip} established". When keep-alive is enabled, the new code logs "SSH connection to {ip} established — monitoring liveness..." — two "established" messages for the same connection. Consider rewording to just mention the keep-alive activation, e.g. "Keep-alive enabled for {ip} every {interval}s".

7. Sftp.__init__ doesn't forward keep_alive_interval to Ssh
When Sftp creates its own Ssh (the self._new_ssh = True path), the new keep_alive_interval parameter of Ssh is not passed through. If a user wants both channel timeout and keep-alive on an SFTP-managed SSH connection, they can't get it without providing a pre-configured Ssh instance.

8. No tests for the new functionality
There are existing tests in test/test_ssh.py but no tests covering keep-alive or channel timeout behavior. At minimum, a unit test verifying that set_keepalive is called on the transport when the parameter is provided would be valuable.

Minor

  • The PR description is empty — it would benefit from explaining the use case (long-lived transfers dropping, etc.).
  • Log messages use an em-dash () which is unusual for log output — a regular dash or colon would be more conventional for grep-ability.

@draganbjedov draganbjedov changed the title Added keep_alive for ssh & channel timeout for sftp Added keep alive for ssh & channel timeout for sftp Apr 28, 2026
Copy link
Copy Markdown
Member

@ltekieli ltekieli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates — the original issues are well addressed. Two minor new observations:

1. Docstring ordering mismatch in sftp.py
The parameter signature has channel_timeout before keep_alive_interval (lines 32-33), but the docstring lists them in the opposite order (lines 46-48). Should match the signature order.

2. ssh.py:132-133 — "will be closed normally" can be misleading
When keep_alive_interval is set, the else branch at line 132 logs "will be closed normally" when transport is either None or still active. If transport is None (already gone), that message is inaccurate. Consider splitting the condition:

if transport and not transport.is_active():
    logger.warning(f"SSH connection to {self._target_ip} was no longer alive at close time")
elif transport:
    logger.info(f"SSH connection to {self._target_ip} will be closed normally")
else:
    logger.warning(f"SSH connection to {self._target_ip} has no transport at close time")

- Added possibility to set keep alive interval for SSH. Default is None.
  If it is provided, after interval seconds without sending any data over
  the connection, a "keepalive" packet will be sent (ignored by remote host).
  This can be useful to keep connections alive over a NAT, for an example.
- Added possibility to set SFTP channel timeout. Default is None.
  If it provided, subsequent channel read/write operations will raise
  a timeout exception if the timeout period value has elapsed before
  the operation has completed.
@ltekieli ltekieli added this pull request to the merge queue Apr 28, 2026
Merged via the queue into eclipse-score:main with commit 8d47199 Apr 28, 2026
5 of 6 checks passed
@github-project-automation github-project-automation Bot moved this from Backlog to Done in TST - Testing Community Apr 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

2 participants