


LLaVA-Med Performance


Performance comparison of multimodal chat instruction-following abilities, measured by the relative score from language-only GPT-4 evaluation.
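The caption does not spell out how the relative score is computed. A minimal sketch, assuming the judge assigns each answer a numeric score and the relative score is the candidate model's mean score as a percentage of the language-only GPT-4 reference's mean score; the function name and the sample values in the usage note are hypothetical and are not results from the evaluation:

```python
from statistics import mean


def relative_score(candidate_scores, reference_scores):
    """Candidate's mean judge score as a percentage of the reference's mean.

    Assumption: both lists hold per-question scores from the same judge,
    aligned by question. This is an illustrative definition, not one
    confirmed by the source.
    """
    if not candidate_scores or not reference_scores:
        raise ValueError("score lists must be non-empty")
    if len(candidate_scores) != len(reference_scores):
        raise ValueError("score lists must cover the same questions")
    reference_mean = mean(reference_scores)
    if reference_mean == 0:
        raise ValueError("reference mean score must be non-zero")
    return 100.0 * mean(candidate_scores) / reference_mean
```

For example, `relative_score([5, 6], [10, 10])` evaluates to 55.0: the candidate averages 5.5 against a reference average of 10.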


Example 1: comparison of medical visual chat. Language-only GPT-4 is treated as the performance upper bound because it receives the gold captions and inline mentions as context, so it does not need to understand the raw image.


Example 2: comparison of medical visual chat. LLaVA tends to hallucinate or to refuse to provide a knowledgeable, domain-specific response.


Performance comparison of fine-tuned LLaVA-Med on established medical VQA datasets.
