Conversation
I'm rerunning the test first. It seems like we ran out of quota for the testing notebook.
datvo06 left a comment
At first glance, I think the check in `call_assistant` can cause some trouble (below). I don't have a clear answer on how to address these cases yet, but we can add tests, mark them xfail, and create issues.
```python
return_annotation = typing.get_args(tool_sig.return_annotation)[0]
if not issubclass(
    _simple_type(return_annotation), response_format.base
):
```
I think this might cause trouble in cases where we use `IsFinal` with a `return_annotation` that doesn't match the outer template's.
For example, this script fails due to a mismatch between `final_text`'s return type and the `Template`'s return type. But I guess that's OK.
```python
from typing import Annotated

from effectful.handlers.llm import Template, Tool
from effectful.handlers.llm.completions import LiteLLMProvider
from effectful.handlers.llm.template import IsFinal
from effectful.ops.semantics import handler
from effectful.ops.types import NotHandled


@Tool.define
def final_text() -> Annotated[str, IsFinal]:
    """Return a final text result."""
    return "123"


@Template.define
def task() -> int:
    """Call final_text."""
    raise NotHandled


with handler(LiteLLMProvider(model="gpt-4o-mini")):
    task()
```
Result:

```
  File "/Users/nguyendat/Marc/effectful/effectful/handlers/llm/completions.py", line 239, in call_assistant
    raise ToolCallDecodingError(
effectful.handlers.llm.completions.ToolCallDecodingError: Error decoding tool call 'final_text': IsFinal tool 'final_text' has signature <Signature () -> Annotated[str, <effectful.handlers.llm.template._IsFinalAnnotation object at 0x100d3d100>]>, but the enclosing template expects <class 'int'>.. Please provide a valid response and try again.
```
This case is more troublesome, but that's also because Python forbids class checks on `TypedDict`. Still, this would work fine if we didn't have that check for `IsFinal`.
```python
from typing import Annotated, TypedDict

from effectful.handlers.llm import Template, Tool
from effectful.handlers.llm.completions import LiteLLMProvider
from effectful.handlers.llm.template import IsFinal
from effectful.ops.semantics import handler
from effectful.ops.types import NotHandled


class Payload(TypedDict):
    x: int


@Tool.define
def final_payload() -> Annotated[Payload, IsFinal]:
    """Return final payload."""
    return {"x": 1}


@Template.define
def task() -> Payload:
    """Call final_payload."""
    raise NotHandled


with handler(LiteLLMProvider(model="gpt-4o-mini")):
    task()
```

Result:
```
  File "/Users/nguyendat/Marc/effectful/effectful/handlers/llm/completions.py", line 239, in call_assistant
    raise ToolCallDecodingError(
effectful.handlers.llm.completions.ToolCallDecodingError: Error decoding tool call 'final_payload': TypedDict does not support instance and class checks. Please provide a valid response and try again.
```
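The `TypedDict` failure is a limitation of CPython's `typing` module rather than anything specific to effectful; it can be reproduced without any of the machinery above:

```python
from typing import TypedDict


class Payload(TypedDict):
    x: int


# Using a TypedDict as the base-class argument of issubclass raises
# TypeError, which is what surfaces as the ToolCallDecodingError above.
try:
    issubclass(dict, Payload)
    raised = False
except TypeError as e:
    raised = True
    message = str(e)
```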
Or with `Literal`:
```python
from typing import Annotated, Literal

from effectful.handlers.llm import Template, Tool
from effectful.handlers.llm.completions import LiteLLMProvider
from effectful.handlers.llm.template import IsFinal
from effectful.ops.semantics import handler
from effectful.ops.types import NotHandled


@Tool.define
def final_payload() -> Annotated[Literal[1, 2, 3], IsFinal]:
    """Return final payload."""
    return 1


@Template.define
def task() -> Literal[1, 2, 3]:
    """Call final_payload."""
    raise NotHandled


with handler(LiteLLMProvider(model="gpt-4o-mini")):
    task()
```

Result:
```
  File "/Users/nguyendat/Marc/effectful/effectful/handlers/llm/completions.py", line 239, in call_assistant
    raise ToolCallDecodingError(
effectful.handlers.llm.completions.ToolCallDecodingError: Error decoding tool call 'final_payload': Subscripted generics cannot be used with class and instance checks. Please provide a valid response and try again.
```
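One way the check could be made total is to treat `TypeError` from `issubclass` as "not compatible" instead of letting it propagate; a rough sketch (`safe_issubclass` is a hypothetical helper, not in the PR):

```python
import typing


def safe_issubclass(candidate, base) -> bool:
    """Like issubclass, but returns False instead of raising TypeError
    for typing constructs (TypedDict, Literal, subscripted generics)
    that forbid class and instance checks."""
    try:
        return issubclass(candidate, base)
    except TypeError:
        return False


class Payload(typing.TypedDict):
    x: int


assert safe_issubclass(bool, int)
assert not safe_issubclass(dict, Payload)                       # TypedDict base is illegal
assert not safe_issubclass(typing.Literal[1, 2, 3], int)        # not a class at all
```

Note this merely converts the crash into an "incompatible" verdict, so these tools would still be rejected; whether that is the right semantics is part of the design question below.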
The behavior on these examples is consistent with the design choices laid out in the PR description, so those choices probably need to be revisited. For maximum flexibility we might want to let the LLM choose whether a tool call is final, instead of relying solely on the annotation as in this PR. For example, we could inject a fake is_final argument into every tool schema sent to the LLM and read off its value from the tool call request. We should probably also collect a few more examples like this that reflect more realistic use cases of this behavior.
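The injected-argument idea could look something like the following sketch (OpenAI-style tool schema assumed; both helper names are hypothetical): add a synthetic `is_final` boolean to each tool's parameter schema before sending it to the LLM, then strip it back off the decoded arguments before invoking the real tool.

```python
import copy


def inject_is_final(tool_schema: dict) -> dict:
    """Return a copy of an OpenAI-style tool schema with a synthetic
    `is_final` boolean, so the LLM can declare per call whether the
    result should end the interaction."""
    schema = copy.deepcopy(tool_schema)
    params = schema["function"].setdefault(
        "parameters", {"type": "object", "properties": {}}
    )
    params.setdefault("properties", {})["is_final"] = {
        "type": "boolean",
        "description": "Set true iff this tool's result is the final answer.",
    }
    return schema


def pop_is_final(call_args: dict) -> tuple[dict, bool]:
    """Strip the synthetic flag from decoded arguments before
    invoking the real tool."""
    args = dict(call_args)
    return args, bool(args.pop("is_final", False))
```

The decoder would then consult the popped flag, rather than only the `IsFinal` annotation, when deciding whether to return the tool result directly.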
If the primary goal is to enable the LLM to use tools that produce output that doesn't roundtrip through text, it might be simpler to attack that problem directly. I'm skeptical that tool calling training will allow LLMs to effectively use tools that have the side effect of ending the interaction. One alternative could be to encode "unencodable" tool output using a textual pointer. This could be a hash or some other unique string that the LLM could either produce as output or pass as an argument to a further tool call. The encoding and decoding logic would be responsible for maintaining the shared state that would map these pointers back to their objects. This approach would also have the pleasant side effect of enabling the LLM to chain tools that produce e.g. images or video without needing to write a script.
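A minimal sketch of the pointer idea (hypothetical `PointerCodec`, not an effectful API): the encoder hands the LLM an opaque token in place of the unencodable object, and the decoder resolves tokens back to live objects when they reappear in tool-call arguments.

```python
import uuid


class PointerCodec:
    """Map non-textual tool outputs to opaque string tokens the LLM can
    emit or pass to further tool calls, and resolve tokens back to the
    original objects on the way in."""

    def __init__(self) -> None:
        self._store: dict[str, object] = {}

    def encode(self, obj: object) -> str:
        """Stash obj and return a unique textual pointer to it."""
        token = f"obj://{uuid.uuid4().hex}"
        self._store[token] = obj
        return token

    def decode(self, token: str) -> object:
        """Resolve a previously issued pointer back to its object."""
        return self._store[token]
```

The codec instance is the "shared state" mentioned above; in practice it would live alongside the provider handler for the duration of a `Template` call.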
Isn't this just a funny sort of structured output, followed by extra information from the tool call in the next user message? It's possible that it doesn't work that well in practice, but I know it's a pattern that's used quite a bit in …
Addresses #549
This PR adds a new `effectful.ops.types.Annotation` for `Tool` return types, `effectful.handlers.llm.template.IsFinal`, such that when a `Template` generates a call to an `IsFinal`-annotated `Tool` whose return type is compatible with that of the `Template`, the result of the tool call is returned directly from `call_assistant` and `Template.__apply__` as the final result of the `Template` call, with no further `call_assistant` turns or serialization/postprocessing.

This PR corresponds to one fairly conservative corner of the design space sketched in #549:

- If an `IsFinal` tool call cannot be decoded with valid arguments and a static output type that matches the original `Template`'s, or if executing the tool call triggers an error at runtime that is captured by `RetryLLMHandler`, the `IsFinal` annotation will be disregarded and the LLM will proceed to another `call_assistant` step as usual.
- The LLM cannot drive a `Template` to use arbitrary type-compatible but un-annotated `Tool`s to get a final answer: if a `Tool` is not an explicitly `IsFinal`-annotated final-answer tool, its output will always be sent through at least one more `call_assistant` step.
- The LLM cannot drive a `Template` to use an `IsFinal`-annotated `Tool` with an incompatible return type as a non-final tool; attempting to call such a tool will trigger an error.
- The LLM is not required to use an `IsFinal` tool call to compute a final answer when at least one type-compatible `IsFinal`-annotated `Tool` is available; it can use different tools, or even none at all, to get a result.

Any or all of these points might be things to consider changing prior to landing this PR. I would expect the functionality in #549, and especially this instantiation, to be useful mostly in cases like #526 where we have something very general like a code-generation or text-to-image-generation tool which can be used with many `Template`s and which we always want to use whenever it is type-compatible.

This would probably be much easier to use in conjunction with polymorphism (#489).