Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots

May 14, 2024, 4:47 a.m. | Chengyue Wu, Yixiao Ge, Qiushan Guo, Jiahao Wang, Zhixuan Liang, Zeyu Lu, Ying Shan, Ping Luo

cs.CV updates on arXiv.org

arXiv:2405.07990v1 Announce Type: cross
Abstract: The remarkable progress of Multi-modal Large Language Models (MLLMs) has attracted significant attention due to their superior performance in visual contexts. However, their capabilities in turning visual figures into executable code have not been evaluated thoroughly. To address this, we introduce Plot2Code, a comprehensive visual coding benchmark designed for a fair and in-depth assessment of MLLMs. We carefully collect 132 manually selected high-quality matplotlib plots across six plot types from publicly available matplotlib galleries. For …
