Fix inlineData and text not handled correctly for Gemini #684

Open
bubiche wants to merge 2 commits into crmne:main from bubiche:nguyen/gemini_response_attachment

bubiche commented Mar 17, 2026

Contributor

What this does

  • Gemini/Vertex AI can respond with text and inlineData together, e.g. {"candidates": [{"content": {"role": "model", "parts": [{"text": "Here is your image! "}, {"inlineData": {"mimeType": "image/png", "data": ... }}]}}]}. The current code doesn't handle inlineData because extract_text_parts returns only the text and stops there.
    => Fix: make parse_content the only code path. It filters out parts that aren't user-visible, such as thought and functionCall, and builds the content correctly.
  • I think there's also a bug in RubyLLM::Providers::Gemini::Media's build_response_content: Content.new(text:, attachments:) expects attachment sources (paths/IO/URLs), not RubyLLM::Attachment instances.
    => Fix: construct Content correctly there too.
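
The single-code-path idea from the first bullet can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the code in this PR: `split_parts` and `InlinePart` are hypothetical names, and the part keys (`text`, `inlineData`, `thought`, `functionCall`) are the ones mentioned in the description above.

```ruby
# Hypothetical stand-in for an attachment; this is NOT RubyLLM::Attachment.
InlinePart = Struct.new(:mime_type, :bytes)

# Walk the parts once, drop the ones that are not user-visible
# (thoughts and function calls), and keep both text and inline data.
def split_parts(parts)
  visible = parts.reject { |part| part["thought"] || part["functionCall"] }

  # Concatenate every visible text fragment in order.
  text = visible.filter_map { |part| part["text"] }.join

  # Decode each inlineData payload from Base64 using core String#unpack1.
  inline = visible.filter_map do |part|
    data = part["inlineData"]
    next unless data

    InlinePart.new(data["mimeType"], data["data"].unpack1("m"))
  end

  [text, inline]
end

# Made-up sample input; the payload is placeholder bytes, not a real PNG.
sample_parts = [
  { "text" => "internal reasoning", "thought" => true },
  { "text" => "Here is your image! " },
  { "inlineData" => { "mimeType" => "image/png", "data" => ["fake-png-bytes"].pack("m0") } }
]

text, inline = split_parts(sample_parts)
p text
p inline.map(&:mime_type)
```

With a single pass like this, a response that mixes text and media yields both pieces, whereas a text-only shortcut would return the text and drop the image.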

Type of change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation
  • Performance improvement

Scope check

  • I read the Contributing Guide
  • This aligns with RubyLLM's focus on LLM communication
  • This isn't application-specific logic that belongs in user code
  • This benefits most users, not just my specific use case

Quality check

  • I ran overcommit --install and all hooks pass
  • I tested my changes thoroughly
    • For provider changes: Re-recorded VCR cassettes with bundle exec rake vcr:record[provider_name]
    • All tests pass: bundle exec rspec
  • I updated documentation if needed
  • I didn't modify auto-generated files manually (models.json, aliases.json)

AI-generated code

  • I used AI tools to help write this code
  • I have reviewed and understand all generated code (required if above is checked)

API changes

  • Breaking change
  • New public methods/classes
  • Changed method signatures
  • No API changes

bubiche commented Mar 31, 2026

Contributor Author

Hi @crmne, sorry for the trouble. Could you take a look at this PR and see if it makes sense, please? This bug makes using Gemini's rich output with ruby_llm a bit iffy.

@losnikitos

I'm facing the same problem. I was about to suggest a fix when I came across this PR, and it would be great to land it. The Gemini API does return a multipart message with text and media, and RubyLLM parses it as just text:

 [{"content"=>
    {"parts"=>
      [{"text"=>"Here you go! "},  <-- Text
       {"inlineData"=>
         {"mimeType"=>"image/png",
          "data"=>
           "[BASE64 DATA]"}}],     <-- Image
     "role"=>"model"},
   "finishReason"=>"STOP",
   "index"=>0}],
"usageMetadata"=> ...

Here:
[Screenshot 2026-04-06 at 21 05 22]
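
To make the symptom concrete, here is a rough sketch that rebuilds a response with the same shape as the dump above and compares what a text-only extraction sees with what is actually present. The helper names (`first_candidate_parts`, `text_only`, `inline_mime_types`) are hypothetical and are not RubyLLM methods. The payload string is the same placeholder used in the dump.

```ruby
# Response shaped like the dump in this comment; the data value is a placeholder.
RESPONSE = {
  "candidates" => [
    {
      "content" => {
        "parts" => [
          { "text" => "Here you go! " },
          { "inlineData" => { "mimeType" => "image/png", "data" => "[BASE64 DATA]" } }
        ],
        "role" => "model"
      },
      "finishReason" => "STOP",
      "index" => 0
    }
  ]
}.freeze

# Hypothetical helper: parts of the first candidate, or [] when absent.
def first_candidate_parts(response)
  response.dig("candidates", 0, "content", "parts") || []
end

# Text-only extraction: joins the text parts and silently ignores the rest.
def text_only(response)
  first_candidate_parts(response).filter_map { |part| part["text"] }.join
end

# MIME types of the inline media parts that text-only extraction never sees.
def inline_mime_types(response)
  first_candidate_parts(response).filter_map { |part| part.dig("inlineData", "mimeType") }
end

p text_only(RESPONSE)
p inline_mime_types(RESPONSE)
```

The first call returns only the greeting, while the second shows that an image/png part was in the same message all along.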

codecov Bot commented May 12, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.49%. Comparing base (4942d6c) to head (f155c4b).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #684      +/-   ##
==========================================
+ Coverage   87.05%   87.49%   +0.44%     
==========================================
  Files         119      119              
  Lines        5594     5591       -3     
  Branches     1407     1405       -2     
==========================================
+ Hits         4870     4892      +22     
+ Misses        724      699      -25     
