Guidelines to Prompt Large Language Models for Code Generation: An Empirical Characterization
Abstract
This work derives and evaluates prompt optimization guidelines for code generation in software engineering, identifying 10 improvement patterns related to input/output specification, pre-/post-conditions, examples, and clarity.
Large Language Models (LLMs) are nowadays extensively used for various software engineering tasks, primarily code generation. Previous research has shown that suitable prompt engineering can help developers improve their code generation prompts. However, so far, no specific guidelines exist to steer developers towards writing suitable prompts for code generation. In this work, we derive and evaluate development-specific prompt optimization guidelines. First, we use an iterative, test-driven approach to automatically refine code generation prompts, and we analyze the outcome of this process to identify the prompt improvements that lead to passing tests. From these, we elicit 10 guidelines for prompt improvement, related to better specifying inputs/outputs and pre-/post-conditions, providing examples and other kinds of detail, and clarifying ambiguities. We then conduct an assessment with 50 practitioners, who report how often they use the elicited prompt improvement patterns and how useful they perceive them to be; perceived usefulness does not always match actual usage before knowing our guidelines. Our results have implications not only for practitioners and educators, but also for those aiming to build better LLM-aided software development tools.
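The iterative, test-driven refinement mentioned in the abstract can be pictured as a loop: generate code from the current prompt, run the reference tests, and, on failure, feed the test output back into a prompt-rewriting step. The sketch below is only an illustration of that idea, not the authors' pipeline: `generate_code` and `refine_prompt` are hypothetical placeholders for LLM calls, and the test harness is a simplified pytest invocation.

```python
# Illustrative sketch of an iterative, test-driven prompt refinement loop.
# generate_code() and refine_prompt() are hypothetical placeholders for LLM
# calls; they do NOT correspond to the authors' actual implementation.
import subprocess
import tempfile
from pathlib import Path


def generate_code(prompt: str) -> str:
    """Hypothetical LLM call returning a Python implementation for `prompt`."""
    raise NotImplementedError("plug in your LLM client here")


def refine_prompt(prompt: str, failure_log: str) -> str:
    """Hypothetical LLM call rewriting `prompt` based on failing-test output."""
    raise NotImplementedError("plug in your LLM client here")


def run_tests(code: str, test_file: Path) -> tuple[bool, str]:
    """Write the generated code next to a copy of the tests and run pytest."""
    workdir = Path(tempfile.mkdtemp())
    (workdir / "solution.py").write_text(code)
    (workdir / test_file.name).write_text(test_file.read_text())
    result = subprocess.run(
        ["pytest", "-q", str(workdir / test_file.name)],
        capture_output=True, text=True, cwd=workdir,
    )
    return result.returncode == 0, result.stdout + result.stderr


def refine_until_pass(prompt: str, test_file: Path, max_iterations: int = 5):
    """Refine `prompt` until the generated code passes the tests (or give up)."""
    code = ""
    for _ in range(max_iterations):
        code = generate_code(prompt)
        passed, log = run_tests(code, test_file)
        if passed:
            return prompt, code
        # Feed the failing-test output back into the prompt rewrite step.
        prompt = refine_prompt(prompt, log)
    return prompt, code
```

In such a loop, the rewrite step is where guidelines like those elicited in the paper would apply: the prompt is revised to specify inputs/outputs, pre-/post-conditions, or examples more precisely, and the test results indicate whether the revision helped.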
Community
Papers similar to this one, recommended via the Semantic Scholar API (Librarian Bot):
- Toward Explaining Large Language Models in Software Engineering Tasks (2025)
- Reporting LLM Prompting in Automated Software Engineering: A Guideline Based on Current Practices and Expectations (2026)
- Holistic Evaluation of State-of-the-Art LLMs for Code Generation (2025)
- Understanding Specification-Driven Code Generation with LLMs: An Empirical Study Design (2026)
- Understanding LLM-Driven Test Oracle Generation (2025)
- From Human to Machine Refactoring: Assessing GPT-4's Impact on Python Class Quality and Readability (2026)
- CIFE: Code Instruction-Following Evaluation (2025)