
Add support for final-answer tool calls #570

Open

eb8680 wants to merge 18 commits into master from eb-final-answer

Conversation

@eb8680
Contributor

@eb8680 eb8680 commented Feb 16, 2026

Addresses #549

This PR adds a new effectful.ops.types.Annotation for Tool return types, effectful.handlers.llm.template.IsFinal. When a Template generates a call to an IsFinal-annotated Tool whose return type is compatible with the Template's, the result of the tool call is returned directly from call_assistant and Template.__apply__ as the final result of the Template call, with no further call_assistant turns or serialization/postprocessing.

This PR corresponds to one fairly conservative corner of the design space sketched in #549:

  • Only well-formed, successful tool calls can be used as final answers - if an LLM-generated IsFinal tool call cannot be decoded with valid arguments and a static output type that matches the original Template's, or if executing the tool call raises a runtime error that is captured by RetryLLMHandler, the IsFinal annotation is disregarded and the LLM proceeds to another call_assistant step as usual.
  • It does not allow a Template to use arbitrary type-compatible but un-annotated Tools to produce a final answer - if a Tool is not explicitly annotated with IsFinal, its output is always sent through at least one more call_assistant step.
  • It does not allow a Template to use an IsFinal-annotated Tool with an incompatible return type as a non-final tool; attempting to call such a tool triggers an error.
  • It does not require the LLM to generate an IsFinal tool call to compute a final answer, even when a type-compatible IsFinal-annotated Tool is available - the LLM can use different tools, or none at all, to get a result.

Any or all of these points might be worth revisiting before landing this PR. I would expect the functionality in #549, and especially this instantiation of it, to be useful mostly in cases like #526, where we have something very general like a code-generation or text-to-image-generation tool that can be used with many Templates and that we always want to use whenever it is type-compatible.

This would probably be much easier to use in conjunction with polymorphism (#489).

@eb8680 eb8680 linked an issue Feb 16, 2026 that may be closed by this pull request
@eb8680 eb8680 marked this pull request as ready for review February 24, 2026 07:51
@eb8680 eb8680 requested a review from datvo06 February 24, 2026 07:51
@datvo06
Contributor

datvo06 commented Feb 24, 2026

I'm rerunning the tests first. It seems like we ran out of quota for the notebook tests.


@datvo06 datvo06 left a comment


At first glance, I think the check in call_assistant can cause some trouble (see below). I don't have a clear answer on how to address these cases yet, but we can add tests, mark them xfail, and create issues.

return_annotation = typing.get_args(tool_sig.return_annotation)[0]
if not issubclass(
    _simple_type(return_annotation), response_format.base
):

I think this might cause trouble in cases where we use IsFinal with a return_annotation that doesn't match the outer template's.


For example, this script fails due to a mismatch between the final_text return type and the Template return type. But I guess that's ok.

from typing import Annotated

from effectful.handlers.llm import Template, Tool
from effectful.handlers.llm.completions import LiteLLMProvider
from effectful.handlers.llm.template import IsFinal
from effectful.ops.semantics import handler
from effectful.ops.types import NotHandled


@Tool.define
def final_text() -> Annotated[str, IsFinal]:
    """Return a final text result."""
    return "123"


@Template.define
def task() -> int:
    """Call final_text."""
    raise NotHandled


with handler(LiteLLMProvider(model="gpt-4o-mini")):
    task()

Result:

  File "/Users/nguyendat/Marc/effectful/effectful/handlers/llm/completions.py", line 239, in call_assistant
    raise ToolCallDecodingError(
effectful.handlers.llm.completions.ToolCallDecodingError: Error decoding tool call 'final_text': IsFinal tool 'final_text' has signature <Signature () -> Annotated[str, <effectful.handlers.llm.template._IsFinalAnnotation object at 0x100d3d100>]>, but the enclosing template expects <class 'int'>.. Please provide a valid response and try again.


This case is more troublesome, but that is also because Python forbids class checks on TypedDicts. Still, this would work fine if the IsFinal check weren't there.

from typing import Annotated, TypedDict

from effectful.handlers.llm import Template, Tool
from effectful.handlers.llm.completions import LiteLLMProvider
from effectful.handlers.llm.template import IsFinal
from effectful.ops.semantics import handler
from effectful.ops.types import NotHandled


class Payload(TypedDict):
    x: int


@Tool.define
def final_payload() -> Annotated[Payload, IsFinal]:
    """Return final payload."""
    return {"x": 1}


@Template.define
def task() -> Payload:
    """Call final_payload."""
    raise NotHandled


with handler(LiteLLMProvider(model="gpt-4o-mini")):
    task()

Result:

  File "/Users/nguyendat/Marc/effectful/effectful/handlers/llm/completions.py", line 239, in call_assistant
    raise ToolCallDecodingError(
effectful.handlers.llm.completions.ToolCallDecodingError: Error decoding tool call 'final_payload': TypedDict does not support instance and class checks. Please provide a valid response and try again.


Or Literal:

from typing import Annotated, Literal

from effectful.handlers.llm import Template, Tool
from effectful.handlers.llm.completions import LiteLLMProvider
from effectful.handlers.llm.template import IsFinal
from effectful.ops.semantics import handler
from effectful.ops.types import NotHandled


@Tool.define
def final_payload() -> Annotated[Literal[1, 2, 3], IsFinal]:
    """Return final payload."""
    return 1


@Template.define
def task() -> Literal[1, 2, 3]:
    """Call final_payload."""
    raise NotHandled


with handler(LiteLLMProvider(model="gpt-4o-mini")):
    task()

Result:

  File "/Users/nguyendat/Marc/effectful/effectful/handlers/llm/completions.py", line 239, in call_assistant
    raise ToolCallDecodingError(
effectful.handlers.llm.completions.ToolCallDecodingError: Error decoding tool call 'final_payload': Subscripted generics cannot be used with class and instance checks. Please provide a valid response and try again.
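
Both TypeErrors above come from CPython's typing runtime rather than from effectful itself: TypedDict classes and subscripted generics each reject runtime class checks. A standalone sketch (independent of the PR's _simple_type helper) reproducing the failures:

```python
from typing import Literal, TypedDict


class Payload(TypedDict):
    x: int


def compat_error(expected: object) -> str:
    """Run an issubclass-based compatibility check like the one quoted
    above and return the TypeError message it raises, or "ok" if it
    succeeds."""
    try:
        issubclass(dict, expected)  # type: ignore[arg-type]
        return "ok"
    except TypeError as e:
        return str(e)


print(compat_error(Payload))           # TypedDict does not support instance and class checks
print(compat_error(Literal[1, 2, 3]))  # Subscripted generics cannot be used with class and instance checks
```

Any runtime IsFinal compatibility check would therefore need to be structural (e.g. comparing typing.get_origin/typing.get_args results) rather than relying on issubclass.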

Contributor Author

The behavior on these examples is consistent with the design choices laid out in the PR description, so those choices probably need to be revisited. For maximum flexibility we might want to let the LLM choose whether a tool call is final, instead of relying solely on the annotation as in this PR. For example, we could inject a fake is_final argument into every tool schema sent to the LLM and read off its value from the tool call request. We should probably also collect a few more examples like this that reflect more realistic use cases of this behavior.
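
The injected-argument idea could look roughly like this (a hypothetical sketch over OpenAI-style tool schemas; inject_is_final and pop_is_final are illustrative names, not effectful API):

```python
import copy


def inject_is_final(tool_schema: dict) -> dict:
    """Return a copy of an OpenAI-style tool schema with a synthetic
    boolean is_final parameter the LLM can set on each call."""
    schema = copy.deepcopy(tool_schema)
    params = schema["function"].setdefault(
        "parameters", {"type": "object", "properties": {}}
    )
    params["properties"]["is_final"] = {
        "type": "boolean",
        "description": "Set to true if this call's result should be the final answer.",
    }
    return schema


def pop_is_final(arguments: dict) -> tuple[dict, bool]:
    """Strip the synthetic flag from decoded call arguments before
    dispatching to the real tool; the flag tells the loop whether to stop."""
    args = dict(arguments)
    return args, bool(args.pop("is_final", False))


schema = inject_is_final({"type": "function", "function": {"name": "final_text"}})
args, is_final = pop_is_final({"is_final": True})
```

This would shift the final/non-final decision from a static annotation to a per-call choice by the LLM, at the cost of polluting every tool schema with an extra parameter.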

@jfeser
Contributor

jfeser commented Feb 25, 2026

If the primary goal is to enable the LLM to use tools that produce output that doesn't roundtrip through text, it might be simpler to attack that problem directly. I'm skeptical that tool calling training will allow LLMs to effectively use tools that have the side effect of ending the interaction.

One alternative could be to encode "unencodable" tool output using a textual pointer. This could be a hash or some other unique string that the LLM could either produce as output or pass as an argument to a further tool call. The encoding and decoding logic would be responsible for maintaining the shared state that would map these pointers back to their objects. This approach would also have the pleasant side effect of enabling the LLM to chain tools that produce e.g. images or video without needing to write a script.
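
The pointer encoding could be sketched as a small registry (hypothetical; PointerRegistry is an illustrative name, and content-hash keying is only one possible way to generate handles):

```python
import hashlib
import pickle
from typing import Any


class PointerRegistry:
    """Map opaque textual handles to objects that don't roundtrip through text.

    encode() would run when serializing tool output for the LLM; decode()
    would run when a handle comes back as an argument to a later tool call.
    """

    def __init__(self) -> None:
        self._objects: dict[str, Any] = {}

    def encode(self, obj: Any) -> str:
        # A content hash gives a stable, unique-enough textual pointer.
        key = hashlib.sha256(pickle.dumps(obj)).hexdigest()[:12]
        self._objects[key] = obj
        return f"<obj:{key}>"

    def decode(self, handle: str) -> Any:
        return self._objects[handle.removeprefix("<obj:").removesuffix(">")]


registry = PointerRegistry()
handle = registry.encode({"image_bytes": b"\x89PNG..."})
assert registry.decode(handle) == {"image_bytes": b"\x89PNG..."}
```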

@eb8680
Contributor Author

eb8680 commented Feb 25, 2026

I'm skeptical that tool calling training will allow LLMs to effectively use tools that have the side effect of ending the interaction.

Isn't this just a funny sort of structured output, followed by extra information from the tool call in the next user message? It's possible that it doesn't work that well in practice but I know it's a pattern that's used quite a bit in smolagents, among other libraries.


Development

Successfully merging this pull request may close these issues.

Templates should be able to return Tool call results as final answers

3 participants