This paper presents DebugEval, a benchmark for evaluating the code debugging ability of Large Language Models (LLMs), and proposes COAST, a framework for synthesizing training data with multiple communicative agents.
DebugEval defines four task scenarios, BUG Localization, BUG Identification, Code Repair, and Code Review, to comprehensively evaluate the code debugging capability of LLMs.
COAST coordinates multiple collaborating agents to synthesize training data that improves the code debugging capability of LLMs.
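For concreteness, a DebugEval-style task instance could look roughly like the sketch below. The field names are illustrative assumptions, not the released schema; consult the dataset files for the actual format.

```python
# Illustrative sketch of a DebugEval-style instance (field names are
# assumptions for illustration; see the released dataset for the real schema).
example = {
    "task": "bug_localization",   # one of the four DebugEval task scenarios
    "language": "python",
    "buggy_code": "def add(a, b):\n    return a - b",  # contains a seeded bug
    "question": "Which line contains the bug?",
    "answer": "return a - b",
}
```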
You can clone the repository using the following commands:

```bash
git clone https://github.com/NEUIR/COAST
cd COAST
```
Download the dataset we provide:

```bash
cd src
```

Please refer to src/README.md for more details.
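Once downloaded, the data can be inspected with a few lines of Python. This is a minimal sketch assuming the dataset is distributed as JSON Lines; the file path below is a placeholder, so use the actual paths documented in src/README.md.

```python
import json

# Placeholder path -- substitute the actual file from the downloaded dataset.
with open("data/debugeval.jsonl", "r", encoding="utf-8") as f:
    examples = [json.loads(line) for line in f]

print(f"Loaded {len(examples)} examples")
print(examples[0])
```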
We use DeepSeek-Coder-6.7B-Ins and Llama3-8B-Ins as the base models and train them with the COAST framework.
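For reference, the Hugging Face Hub identifiers for these base models are, to the best of our knowledge, the ones below; verify them against the training configurations in this repository before use.

```python
# Assumed Hugging Face Hub IDs for the two base models (verify against the
# training configs shipped with this repository).
BASE_MODELS = {
    "DeepSeek-Coder-6.7B-Ins": "deepseek-ai/deepseek-coder-6.7b-instruct",
    "Llama3-8B-Ins": "meta-llama/Meta-Llama-3-8B-Instruct",
}
```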
```bash
cd neural_compiler
```

Please refer to neural_compiler/README.md for more details.
```bash
cd LLaMA-Factory
```

Please refer to LLaMA-Factory/README.md for more details.
We provide the trained NeuDebugger models.
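A trained checkpoint can be loaded for debugging-style inference with Hugging Face transformers, as in the sketch below. The model path is a placeholder and the prompt format is an assumption; follow the repository's evaluation scripts for the exact usage.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder path -- replace with the actual NeuDebugger checkpoint
# directory or Hub ID of the released models.
model_path = "path/to/NeuDebugger"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")

# The prompt template here is illustrative, not the official one.
prompt = "Fix the bug in the following code:\ndef add(a, b):\n    return a - b"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```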
If you use DebugEval and find it helpful, please cite the paper and star this repository.
Feel free to contact 2301983@stu.neu.edu.cn or open an issue if you have any questions.
```bibtex
@misc{yang2025coastenhancingcodedebugging,
  title={COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis},
  author={Weiqing Yang and Hanbin Wang and Zhenghao Liu and Xinze Li and Yukun Yan and Shuo Wang and Yu Gu and Minghe Yu and Zhiyuan Liu and Ge Yu},
  year={2025},
  eprint={2408.05006},
  archivePrefix={arXiv},
  primaryClass={cs.SE},
  url={https://arxiv.org/abs/2408.05006},
}
```