Training adapter, it won't call my tool

Question

MichaelOShea OP

Created 3w

Replies 6

Boosts 0

Participants 2

Hi all.

My adapter model just won't invoke my tool.

The problem I am having is covered in an older post: https://developer.apple.com/forums/thread/794839?answerId=852262022#852262022

Sadly the thread dies there and no resolution is seen in that thread.

It's worth noting that I have developed an AI chatbot built around LanguageModelSession to which I feed the exact same system prompt that I feed to my training set (pasted further in this post). The AI chatbot works perfectly, the tool is invoked when needed. I am training the adapter model because the base model whilst capable doesn't produce the quality I'm looking for.

So here's the template of an item in my training set:

[
    {
        'role': 'system', 
        'content': systemPrompt,
        'tools': [TOOL_DEFINITION]
     },
     {
         'role': 'user', 
         'content': entry['prompt']
     },
     {
          'role': 'assistant',
          'content': entry['code']
     }
]

where TOOL_DEFINITION =

{
    'type': 'function',
    'function': {
        'name': 'WriteUbersichtWidgetToFileSystem',
        'description': 'Writes an Übersicht Widget to the file system. Call this tool as the last step in processing a prompt that generates a widget.',
        'parameters': {
            'type': 'object',
            'properties': {
                'jsxContent': {
                    'type': 'string',
                    'description': 'Complete JSX code for an Übersicht widget. This should include all required exports: command, refreshFrequency, render, and className. The JSX should be a complete, valid Übersicht widget file.'
                }
            },
            'required': ['jsxContent']
        }
    }

... and systemPrompt =

        A conversation between a user and a helpful assistant. You are an Übersicht widget designer. Create Übersicht widgets when requested by the user.

        IMPORTANT: You have access to a tool called WriteUbersichtWidgetToFileSystem. When asked to create a widget, you MUST call this tool.

        ### Tool Usage:
        Call WriteUbersichtWidgetToFileSystem with complete JSX code that implements the Übersicht Widget API. Generate custom JSX based on the user's specific request - do not copy the example below.

        ### Übersicht Widget API (REQUIRED):
        Every Übersicht widget MUST export these 4 items:
        - export const command: The bash command to execute (string)
        - export const refreshFrequency: Refresh rate in milliseconds (number)
        - export const render: React component function that receives {output} prop (function)
        - export const className: CSS positioning for absolute placement (string)

        Example format (customize for each request):
        WriteUbersichtWidgetToFileSystem({jsxContent: `export const command = "echo hello"; export const refreshFrequency = 1000; export const render = ({output}) => { return <div>{output}</div>; }; export const className = "top: 20px; left: 20px;"`})

        ### Rules:
        - The terms "ubersicht widget", "widget", "a widget", "the widget" must all be interpreted as "Übersicht widget"
        - Generate complete, valid JSX code that follows the Übersicht widget API
        - When you generate a widget, don't just show JSON or code - you MUST call the WriteUbersichtWidgetToFileSystem tool
        - Report the results to the user after calling the tool

        ### Examples:
        - "Generate a Übersicht widget" → Use WriteUbersichtWidgetToFileSystem tool
        - "Can you add a widget that shows the time" → Use WriteUbersichtWidgetToFileSystem tool
        - "Create a widget with a button" → Use WriteUbersichtWidgetToFileSystem tool

When the script that I use to compose the full training set is executed, entry['prompt'] and entry['code'] contain the prompt and the resulting JSX code for one of the examples I'm feeding to the training session. This is repeated for about 60 such examples that I have in my sample data collection.

Thanks for any help.

Michael

Answered by carinapeng in 865245022

Hi there! Thanks for the patience!

Could you please try this patch and see if this fixes the problem in invoking tool with adapter training?

class InstructMessagesDataset(Dataset):
...
    def __getitem__(self, index: int) -> dict[str, list[int]]:
        sample = copy.deepcopy(self.data[index])
        if self.data_transform:
            sample = self.data_transform(sample, index)

        # Extract tools from the system message if present
        tools = None
        if sample and isinstance(sample, list) and len(sample) > 0:
            first_message = sample[0]
            if isinstance(first_message, dict) and "tools" in first_message:
                tools = first_message["tools"]

        # Pass tools to the preprocessor if available
        sample = self.preprocessor(sample, tools=tools) if tools else self.preprocessor(sample)

        item = {BatchKey.INPUT: sample.input_ids}
        if sample.label_ids:
            item[BatchKey.LABEL] = sample.label_ids
        return item

This is a bug, it is likely this was causing the problem for you.

Boost

Answer 1

MichaelOShea OP

3w

I have continued my research and have discovered that there must be a tool_calls assistant message. The tool gets called once now :-D It doesn't get called again after that. I'll continue testing;

{
    'role': 'assistant',
    'content': '',
    'tool_calls': [
        {
            'id': tool_call_id,
            'type': 'function',
            'function': {
                'name': 'WriteUbersichtWidgetToFileSystem',
                'arguments': arguments_json
            }
        }
    ]
}

0

Answer 2

carinapeng OP

2w

Accepted Answer

Hi there! Thanks for the patience!

Could you please try this patch and see if this fixes the problem in invoking tool with adapter training?

class InstructMessagesDataset(Dataset):
...
    def __getitem__(self, index: int) -> dict[str, list[int]]:
        sample = copy.deepcopy(self.data[index])
        if self.data_transform:
            sample = self.data_transform(sample, index)

        # Extract tools from the system message if present
        tools = None
        if sample and isinstance(sample, list) and len(sample) > 0:
            first_message = sample[0]
            if isinstance(first_message, dict) and "tools" in first_message:
                tools = first_message["tools"]

        # Pass tools to the preprocessor if available
        sample = self.preprocessor(sample, tools=tools) if tools else self.preprocessor(sample)

        item = {BatchKey.INPUT: sample.input_ids}
        if sample.label_ids:
            item[BatchKey.LABEL] = sample.label_ids
        return item

This is a bug, it is likely this was causing the problem for you.

0

Answer 3

MichaelOShea OP

2w

So I've done more research on why the context window is filling up and that research hints at something weird happening during training, which loops back to your recommendation above.

I have however just one question : with your patch, do I still need tool_calls property in the assistant message, as I indicated above?

Notice how the content property is empty in my assistant message. I was in fact passing the content in the arguments of the call to the tool in the tool_calls property. Do I now just put the content into the content property or do I just leave my training set as-is?

I will apply your fix and see how it goes.

Thanks!

0

Answer 4

carinapeng OP

2w

The patch doesn't change the tool_calls format requirements — it only fixes the bug where tools weren't being extracted from the system message during training, so you should keep the tool_calls property in your training data

1

Answer 5

MichaelOShea OP

1w

It worked perfectly! You rock! Thanks!

1

Answer 6

carinapeng OP

1w

You're welcome! I am glad!

0