Problem invoking function / tool when using Google_GenerativeAI.Live #100

@edossantos-sipcaller

Description

Hello,

I'm not sure whether my issue stems from how I'm using this library or from Gemini itself, so please forgive me if this is misplaced.

I'm building an AI agent that interacts with a customer by voice (audio stream) over a phone call. The binding with the phone call works fine and I can hold a conversation with the agent, so far so good.

When the call ends, I need an outcome from the agent: for example, whether the customer agrees to receive an email, wants to be connected to a human, and so on. To that end, the agent should invoke a function named set_call_outcome, passing send_email, transfer_call or drop_call as the parameter; I ask for this in the system instructions. The problem is that instead of invoking the function, I can "hear" the "invocation" in the generated audio. The agent says the following:
OK, You will receive an email from us. Thanks.11 set_call_outcome('send_email')

That's part of the transcription as well.

We can also see that the agent intends to call the function, because we receive this message:

Message received: BidiResponsePayload { SetupComplete: null, ServerContent: BidiGenerateContentServerContent { TurnComplete: null, Interrupted: null, GroundingMetadata: null, ModelTurn: Content { Parts: [Part { Text: "**Terminating the Validation Attempt**

I've determined that the customer wants to receive the email. Consequently, I've ended the session as instructed. The `set_call_outcome` function will be invoked to reflect this outcome of \"send_email\".


", InlineData: null, FunctionCall: null, FunctionResponse: null, FileData: null, ExecutableCode: null, CodeExecutionResult: null, VideoMetadata: null, Thought: True, ThoughtSignature: null }], Role: null }, GenerationComplete: null, InputTranscription: null, OutputTranscription: null, UrlContextMetadata: null, TurnCompleteReason: null, WaitingForInput: null }, ToolCall: null, ToolCallCancellation: null, GoAway: null, SessionResumptionUpdate: null, UsageMetadata: null }
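For reference, this is roughly the check I'd expect to perform on each received message. It is only a sketch: the property names (ToolCall, ServerContent, ModelTurn, Parts, FunctionCall, Thought) are taken from the dump above, but the FunctionCall member accessors and the HandleSetCallOutcome helper are my assumptions, not the library's verified API.

```csharp
// Sketch, not verified against the Google_GenerativeAI API: shows where a
// real function invocation should appear in a received payload, versus where
// it actually shows up in the log above.
void InspectPayload(BidiResponsePayload payload)
{
    // A genuine invocation should arrive in the top-level ToolCall field.
    // In the dump above this is null, which is the core of the problem.
    if (payload.ToolCall != null)
    {
        // Hypothetical helper; dispatch the call to the registered tool here.
        HandleSetCallOutcome(payload.ToolCall);
        return;
    }

    // Instead, the "call" only appears as text inside the model turn:
    // a Part with Thought = true mentioning set_call_outcome, and later the
    // spoken audio/transcription containing "set_call_outcome('send_email')".
    var parts = payload.ServerContent?.ModelTurn?.Parts;
    if (parts != null)
    {
        foreach (var part in parts)
        {
            // part.FunctionCall is also null in the dump above, confirming
            // the model never emitted a structured function call.
        }
    }
}
```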

I'm not sure whether this is caused by how we're using this library or by an issue in the underlying Gemini logic. In case it helps, here's a snippet showing how I configure the client and register the function:

        var setCallOutcomeFunc = (string outcome) =>
        {
            _logger.Verbose("Setting call outcome to {Outcome}", outcome);
            OnCallOutcomeAvailable?.Invoke(outcome);
            return "Call outcome set";
        };

        _setCallOutcomeQT = new QuickTool(
            setCallOutcomeFunc,
            "set_call_outcome",
            "Set the call outcome after having a conversation"
        );

        _config = new()
        {
            ResponseModalities = [Modality.AUDIO],
            SpeechConfig = new SpeechConfig
            {
                LanguageCode = language,
                VoiceConfig = new() { PrebuiltVoiceConfig = new() { VoiceName = voice } },
            },
        };
        _client = new(
            platformAdapter: new GoogleAIPlatformAdapter(googleApiKey),
            modelName: model,
            config: _config
        )
        {
            UseGoogleSearch = false,
            UseCodeExecutor = true,
            InputAudioTranscriptionEnabled = true,
            OutputAudioTranscriptionEnabled = true,
            FunctionTools = [_setCallOutcomeQT],
            ToolConfig = new()
            {
                RetrievalConfig = new() { LanguageCode = language },
                FunctionCallingConfig = new()
                {
                    AllowedFunctionNames = ["set_call_outcome"],
                    Mode = FunctionCallingMode.ANY,
                },
            },
        };

        await _client.ConnectAsync(false, ct);
        await _client.SendSetupAsync(
            new BidiGenerateContentSetup()
            {
                Model = _model.ToModelId(),
                GenerationConfig = _config,
                OutputAudioTranscription = new AudioTranscriptionConfig(),
                InputAudioTranscription = new AudioTranscriptionConfig(),
                SystemInstruction = new Content(_instructions, null),
            },
            ct
        );

Any ideas? Am I doing something wrong when configuring the client?
