GeoCoder: Enhancing Geometric Reasoning in Vision-Language Models through Modular Code-Finetuning and Retrieval-Augmented Memory
Geometry problem-solving relies heavily on advanced reasoning skills to interpret visual inputs, process questions, and apply mathematical formulas accurately. Although vision-language models (VLMs) have shown progress in multimodal tasks, they still face significant limitations with […]
