Our paper, “MojoBench: Language Modeling and Benchmarks for Mojo”, has been accepted to NAACL 2025 Findings.
In this paper, we introduce MojoBench, the first framework for Mojo code generation. MojoBench includes HumanEval-Mojo, a benchmark dataset designed for evaluating code LLMs on Mojo, and Mojo-Coder, the first LLM pretrained and finetuned for Mojo code generation, which supports instructions in five natural languages (NLs). Our results show that Mojo-Coder achieves a 30-35% performance improvement over leading models such as GPT-4o and Claude-3.5-Sonnet. Furthermore, we provide insights into LLM behavior with underrepresented and unseen PLs, offering potential strategies for enhancing model adaptability. MojoBench contributes to our understanding of LLM capabilities and limitations in emerging programming paradigms, fostering more robust code generation systems.
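For a flavor of what evaluating a code LLM on a HumanEval-style Mojo task looks like, here is a minimal sketch using Hugging Face transformers. The model identifier and the prompt below are hypothetical placeholders for illustration only, not the paper's actual evaluation harness or the released Mojo-Coder checkpoint.

```python
# Minimal sketch: prompting a code LLM on a HumanEval-style Mojo task.
# The model identifier below is a hypothetical placeholder; substitute the
# actual Mojo-Coder checkpoint when reproducing any evaluation.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "org/mojo-coder"  # hypothetical placeholder, not a real repo name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# A HumanEval-style task: a natural-language instruction plus a Mojo signature.
prompt = (
    "Complete the following Mojo function.\n\n"
    "fn add(a: Int, b: Int) -> Int:\n"
    '    """Return the sum of a and b."""\n'
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In a full benchmark run, the generated completion would then be executed against the task's hidden test cases to score functional correctness.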
@inproceedings{raihan2025mojobench,
  title     = {MojoBench: Language Modeling and Benchmarks for Mojo},
  author    = {Raihan, Nishat and Santos, Joanna C. S. and Zampieri, Marcos},
  booktitle = {Findings of the Association for Computational Linguistics: NAACL 2025},
  year      = {2025}
}