feat: function calling across models #589

@sroussey

Feature Description

Improve function calling support across models. Right now, 3 of the 4 models in this test return no function calls (res.functionCalls is empty):

import { getLlama, LlamaChat, resolveModelFile } from "node-llama-cpp";
import { afterAll, describe, expect, it } from "vitest";

const models = [
  { label: "FunctionGemma 270M", url: "hf:unsloth/functiongemma-270m-it-GGUF:Q8_0" },
  { label: "LFM2 1.2B Tool", url: "hf:LiquidAI/LFM2-1.2B-Tool-GGUF:Q8_0" },
  { label: "Qwen2.5 Coder 1.5B", url: "hf:bartowski/Qwen2.5-Coder-1.5B-Instruct-GGUF:Q4_K_M" },
  { label: "Llama 3.2 1B", url: "hf:unsloth/Llama-3.2-1B-Instruct-GGUF:Q4_K_M" },
];

const functions = {
  get_weather: {
    description: "Get the current weather for a city.",
    params: {
      type: "object" as const,
      properties: {
        location: { type: "string" as const },
      },
      required: ["location"],
    },
  },
};

describe("node-llama-cpp native function calling", () => {
  const timeout = 10 * 60 * 1000;
  let llama: Awaited<ReturnType<typeof getLlama>> | undefined;

  afterAll(async () => {
    await llama?.dispose();
  });

  for (const { label, url } of models) {
    it(
      label,
      async () => {
        llama ??= await getLlama();
        const modelPath = await resolveModelFile(url, "./models");
        const model = await llama.loadModel({ modelPath });
        const context = await model.createContext({ flashAttention: true });
        const sequence = context.getSequence();
        const chat = new LlamaChat({ contextSequence: sequence });

        const res = await chat.generateResponse(
          [{ type: "user", text: "What is the weather in San Francisco?" }],
          { functions, maxTokens: 200, seed: 42 }
        );

        const hasFnCalls = (res.functionCalls?.length ?? 0) > 0;

        console.log(`\n--- ${label} (wrapper: ${chat.chatWrapper.wrapperName}) ---`);
        console.log(`functionCalls: ${hasFnCalls ? JSON.stringify(res.functionCalls) : "NONE"}`);
        console.log(`response text: ${JSON.stringify(res.response.slice(0, 200))}`);
        if (!hasFnCalls && res.response) {
          console.log(`⚠ tool call embedded in text, not returned via functionCalls`);
        }

        chat.dispose({ disposeSequence: false });
        sequence.dispose();
        await context.dispose();
        await model.dispose();
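        // Assertion disabled so every model runs and logs its output; re-enable it to fail on missing function calls.
        // expect(hasFnCalls).toBe(true);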
      },
      timeout
    );
  }
});

Output:

[node-llama-cpp] A prebuilt binary was not found, falling back to using no GPU
stderr | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > FunctionGemma 270M
[node-llama-cpp] load: control-looking token:    212 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden

stdout | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > FunctionGemma 270M

--- FunctionGemma 270M (wrapper: Gemma) ---
functionCalls: NONE
response text: "I cannot assist with retrieving weather information for San Francisco. My current capabilities are limited to calling specific functions as needed."
⚠ tool call embedded in text, not returned via functionCalls

stdout | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > LFM2 1.2B Tool

--- LFM2 1.2B Tool (wrapper: ChatML) ---
functionCalls: NONE
response text: "I can help you with that, but I need the function to retrieve the weather information. Could you please provide the function call?\n\n```typescript\nfunction get_weather(params: {location: string});\n```\n"
⚠ tool call embedded in text, not returned via functionCalls

stderr | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > Qwen2.5 Coder 1.5B
[node-llama-cpp] load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden

stdout | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > Qwen2.5 Coder 1.5B

--- Qwen2.5 Coder 1.5B (wrapper: Qwen) ---
functionCalls: NONE
response text: "{\"name\": \"get_weather\", \"arguments\": {\"location\": \"San Francisco\"}}"
⚠ tool call embedded in text, not returned via functionCalls

stdout | packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts > node-llama-cpp native function calling > Llama 3.2 1B

--- Llama 3.2 1B (wrapper: Llama 3.2 lightweight) ---
functionCalls: [{"functionName":"get_weather","params":{"location":"San Francisco"},"raw":["{\"name\": \"get_weather\", \"parameters\": {\"location\": \"San Francisco\"}}",{"type":"specialTokensText","value":"<|eot_id|>"}]}]
response text: ""

 ✓ packages/test/src/test/ai-provider/LlamaCpp_NativeToolCalling.integration.test.ts (4 tests) 14068ms
   ✓ node-llama-cpp native function calling (4)
     ✓ FunctionGemma 270M  1644ms
     ✓ LFM2 1.2B Tool  5359ms
     ✓ Qwen2.5 Coder 1.5B  3320ms
     ✓ Llama 3.2 1B  3742ms

 Test Files  1 passed (1)
      Tests  4 passed (4)
   Start at  16:04:15
   Duration  15.59s (transform 910ms, setup 1.07s, import 352ms, tests 14.07s, environment 0ms)

The Solution

I expected res.functionCalls to be filled. Is there a model I should test that is known to work?
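
For the Qwen2.5 Coder case in the log above, where the call comes back as plain JSON in the response text, a fallback along these lines could recover it in the meantime. This is only a sketch: `extractJsonToolCall`, `RecoveredCall`, and `isRecord` are hypothetical names, not part of node-llama-cpp, and it assumes the whole response is a single JSON object with a `name` key plus either `arguments` or `parameters`.

```typescript
// Hypothetical helper types and functions; not part of node-llama-cpp.
interface RecoveredCall {
  functionName: string;
  params: Record<string, unknown>;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns a call if `text` is a JSON object naming one of `knownFunctions`,
// otherwise null so the caller can treat the text as an ordinary response.
function extractJsonToolCall(
  text: string,
  knownFunctions: readonly string[]
): RecoveredCall | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text.trim());
  } catch {
    return null; // not JSON at all
  }
  if (!isRecord(parsed)) return null;

  const name = parsed["name"];
  if (typeof name !== "string" || !knownFunctions.includes(name)) return null;

  // The Qwen text in the log uses "arguments"; the Llama 3.2 raw output uses "parameters".
  const args = parsed["arguments"] ?? parsed["parameters"] ?? {};
  return isRecord(args) ? { functionName: name, params: args } : null;
}
```

In the test above it would be called as `extractJsonToolCall(res.response, Object.keys(functions))` when `res.functionCalls` is empty. It would not help with the FunctionGemma or LFM2 responses in the log, since neither contains a call at all.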

Considered Alternatives

I scan the text stream for <|tool_call_start|>[get_weather(location="San Francisco")]<|tool_call_end|>. That works with the ONNX version, but this version strips the <|tool_call_start|> and <|tool_call_end|> tokens.
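
The scan can be sketched roughly as below. It only works when the marker tokens survive into the text being scanned, which this version does not currently allow. `parseTaggedToolCalls` and `ParsedToolCall` are hypothetical names, and the parser is deliberately simple: it assumes keyword arguments with string, number, or boolean values, and it does not handle escapes or a `)` inside a string.

```typescript
// Hypothetical shape for a parsed call; not a node-llama-cpp type.
interface ParsedToolCall {
  functionName: string;
  params: Record<string, string | number | boolean>;
}

// Everything between the start/end markers, e.g. `get_weather(location="San Francisco")`.
const CALL_BLOCK = /<\|tool_call_start\|>\[(.*?)\]<\|tool_call_end\|>/gs;
// One `name(args)` call inside a block.
const SINGLE_CALL = /([A-Za-z_]\w*)\((.*?)\)/gs;
// One `key=value` pair; the value is a quoted string or a bare token.
const KEYWORD_ARG = /([A-Za-z_]\w*)\s*=\s*("(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'|[^,\s]+)/g;

function parseValue(raw: string): string | number | boolean {
  if (/^["']/.test(raw)) return raw.slice(1, -1); // strip quotes, no unescaping
  if (raw === "True" || raw === "true") return true;
  if (raw === "False" || raw === "false") return false;
  const asNumber = Number(raw);
  return Number.isNaN(asNumber) ? raw : asNumber;
}

function parseTaggedToolCalls(text: string): ParsedToolCall[] {
  const calls: ParsedToolCall[] = [];
  for (const [, block = ""] of text.matchAll(CALL_BLOCK)) {
    for (const [, functionName = "", args = ""] of block.matchAll(SINGLE_CALL)) {
      const params: ParsedToolCall["params"] = {};
      for (const [, key = "", value = ""] of args.matchAll(KEYWORD_ARG)) {
        params[key] = parseValue(value);
      }
      calls.push({ functionName, params });
    }
  }
  return calls;
}
```

Text without the markers yields an empty array, so the result can be checked the same way as `res.functionCalls` in the test.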

Additional Context

No response

Related Features to This Feature Request

  • Metal support
  • CUDA support
  • Vulkan support
  • Grammar
  • Function calling

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, but I don't know how to start. I would need guidance.
