Convert hello-world and vllm examples to single-shot mode once single-shot payment lands #5

@rickstaa

Context

The request/response examples — hello-world and vllm — are conceptually single-shot (one request in, one response out). But the single-shot payment layer isn't implemented yet (blocked on go-livepeer#3955).

Until it lands, both run in persistent runner mode (the SDK / static-config default) as a stopgap. On-chain, persistent mode bills per second of wall-clock time for as long as the session is open, which overbills short request/response calls: even a brief job still pays the ~60s preload floor. So in practice these examples are offchain-only for now.
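To make the overbilling concrete, here is a rough sketch of the arithmetic. It assumes the ~60s preload floor acts as a simple minimum on billed seconds and that the session lasts exactly as long as the job. Both are simplifications for illustration, and the function names are hypothetical; the real pricing logic lives in go-livepeer.

```python
import math

# Approximate floor taken from the "~60s" figure above; the exact value is an assumption.
PRELOAD_FLOOR_S = 60


def persistent_billed_seconds(session_seconds: float, floor_seconds: int = PRELOAD_FLOOR_S) -> int:
    """Seconds billed in persistent mode, assuming the floor is a plain minimum."""
    return max(math.ceil(session_seconds), floor_seconds)


def overbilling_factor(job_seconds: float) -> float:
    """Ratio of billed seconds to useful seconds for a single short job."""
    return persistent_billed_seconds(job_seconds) / job_seconds


if __name__ == "__main__":
    for job in (2, 10, 90):
        print(job, persistent_billed_seconds(job), round(overbilling_factor(job), 2))
```

Under these assumptions a 2-second request is billed like a 60-second session, while a job longer than the floor is billed for its actual duration.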

echo is not in scope — realtime trickle streaming is genuinely persistent (Josh set it that way deliberately; cf. his single-shot ping-pong). It stays persistent.

Task (blocked on go-livepeer#3955)

When single-shot on-chain payment is implemented upstream:

  • hello-world: register_runner(..., mode="single-shot")
  • vllm: set "mode": "single-shot" in runners.json
  • Re-enable / correct the on-chain (paid) sections for both (drop the persistent-stopgap overbilling caveat)
  • Update the main README runner-mode note and the per-example tables
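The vllm config change could be scripted roughly as follows. This is only a sketch: the only thing known from this issue is the `"mode": "single-shot"` key, so the layout of runners.json (a single runner object or a list of them), the `"name"` field, and the `"persistent"` value in the usage example are all assumptions, and `set_runner_mode` is a hypothetical helper, not part of the SDK.

```python
import json


def set_runner_mode(config_text: str, mode: str = "single-shot") -> str:
    """Return the JSON config with "mode" set on every runner entry.

    Assumes the file holds either one runner object or a list of runner
    objects; the real runners.json layout may differ.
    """
    config = json.loads(config_text)
    entries = config if isinstance(config, list) else [config]
    for entry in entries:
        entry["mode"] = mode
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    before = '[{"name": "vllm", "mode": "persistent"}]'
    print(set_runner_mode(before))
```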
