Our paper, Using Large Language Models to Generate JUnit Tests: An Empirical Study, has been accepted to the research track of the 28th International Conference on Evaluation and Assessment in Software Engineering (EASE 2024). In this work, we analyzed three models with different prompting techniques to generate JUnit tests for the HumanEval dataset and for real-world software. We evaluated the generated tests using compilation rates, test correctness, test coverage, and test smells. We found that although the models achieve high coverage on the small programming problems in the HumanEval dataset, their coverage on real-world software from the EvoSuite dataset is considerably lower.
"SALLM: Security Assessment of Generated Code" accepted at ASYDE 2024 (ASE Workshop)
Posted on 07 Sep 2024