Our paper, “MojoBench: Language Modeling and Benchmarks for Mojo”, has been accepted to NAACL 2025 Findings.
In this paper, we introduce MojoBench, the first framework for Mojo code generation. MojoBench includes HumanEval-Mojo, a benchmark dataset designed for evaluating code LLMs on Mojo, and Mojo-Coder, the first LLM pretrained and finetuned for Mojo code generation, which supports instructions in five natural languages (NLs). Our results show that Mojo-Coder achieves a 30-35% performance improvement over leading models such as GPT-4o and Claude-3.5-Sonnet. Furthermore, we provide insights into LLM behavior with underrepresented and unseen PLs, offering potential strategies for enhancing model adaptability. MojoBench contributes to our understanding of LLM capabilities and limitations in emerging programming paradigms, fostering more robust code generation systems.
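For a flavor of what evaluating a code LLM on a HumanEval-style Mojo task looks like, here is a minimal sketch using Hugging Face transformers. The model identifier and the prompt below are hypothetical placeholders for illustration only, not the paper's actual evaluation harness or the released Mojo-Coder checkpoint.

```python
# Minimal sketch: prompting a code LLM on a HumanEval-style Mojo task.
# The model identifier below is a hypothetical placeholder; substitute the
# actual Mojo-Coder checkpoint when reproducing any evaluation.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "org/mojo-coder"  # hypothetical placeholder, not a real repo name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# A HumanEval-style task: a natural-language instruction plus a Mojo signature.
prompt = (
    "Complete the following Mojo function.\n\n"
    "fn add(a: Int, b: Int) -> Int:\n"
    '    """Return the sum of a and b."""\n'
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In a full benchmark run, the generated completion would then be executed against the task's hidden test cases to score functional correctness.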
@inproceedings{raihan2025mojobench,
  title     = {MojoBench: Language Modeling and Benchmarks for Mojo},
  author    = {Raihan, Nishat and Santos, Joanna C. S. and Zampieri, Marcos},
  booktitle = {Findings of the Association for Computational Linguistics: NAACL 2025},
  year      = {2025}
}