PILLOW: Enhancing Efficient Instruction Fine-tuning via Prompt Matching


Abstract

Instruction fine-tuning has conventionally been used to adapt Large Language Models (LLMs) to a variety of tasks. However, this technique often requires substantial computational resources, making it impractical for individuals or small-scale organizations to deploy. Recently, Low-Rank Adaptation (LoRA) has emerged as a promising alternative, offering performance on par with full fine-tuning at reduced resource overhead. Still, attaining satisfactory performance when fine-tuning with LoRA is a non-trivial challenge. In this paper, we propose PILLOW, which improves LoRA's performance through a discrimination-based prompting method that leverages LLMs' In-Context Learning ability. PILLOW incorporates a matching network that selects prompts from a user-defined prompt pool, concatenates the selected prompts with the user instruction as input, and performs inference with the LoRA-fine-tuned LLM. Trained with Reinforcement Learning, PILLOW achieves performance commensurate with typical instruction fine-tuning methods on various evaluation metrics while using only consumer-grade GPU resources and substantially reducing computational cost.
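
The abstract outlines PILLOW's pipeline: a matching network scores candidate prompts from a user-defined pool against the incoming instruction, the best match is prepended to the instruction, and the LoRA-fine-tuned LLM generates from the concatenated input. The sketch below illustrates that flow in shape only; every name in it (embed, match_prompt, pillow_infer, the hash-based embedding, the cosine-similarity scorer) is a hypothetical stand-in, since the paper trains its selection policy with Reinforcement Learning rather than the fixed similarity heuristic used here.

```python
# Minimal sketch of a PILLOW-style prompt-matching pipeline.
# All names here are hypothetical; the paper's actual matching network,
# prompt pool, and LLM interface are not specified in this abstract.
from typing import Callable, List

import numpy as np


def embed(text: str, dim: int = 64) -> np.ndarray:
    """Stand-in sentence embedding: a deterministic hash-seeded vector.
    A real system would use a trained encoder inside the matching network."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)


def match_prompt(instruction: str, prompt_pool: List[str]) -> str:
    """Discrimination-based matching: score each candidate prompt against
    the user instruction and return the highest-scoring one. PILLOW trains
    this selection policy with Reinforcement Learning; plain cosine
    similarity serves as a placeholder scoring function here."""
    q = embed(instruction)
    scores = [float(q @ embed(p)) for p in prompt_pool]
    return prompt_pool[int(np.argmax(scores))]


def pillow_infer(instruction: str,
                 prompt_pool: List[str],
                 llm: Callable[[str], str]) -> str:
    """Concatenate the matched prompt with the user instruction and run
    the LLM (passed in as a callable) on the combined input."""
    prompt = match_prompt(instruction, prompt_pool)
    return llm(f"{prompt}\n\nInstruction: {instruction}")


if __name__ == "__main__":
    pool = [
        "Answer concisely and cite relevant facts.",
        "Translate the following text into French.",
        "Summarize the following passage in one sentence.",
    ]
    # Stub standing in for a LoRA-fine-tuned model.
    echo_llm = lambda x: f"[LLM output for]\n{x}"
    print(pillow_infer("Summarize: LoRA reduces fine-tuning cost.", pool, echo_llm))
```

In the paper's setting, the llm callable would wrap a LoRA-fine-tuned model and match_prompt would be the RL-trained matching network; the stub versions above merely keep the example self-contained and runnable.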

Citation (APA)

Qi, Z., Tan, X., Shi, S., Qu, C., Xu, Y., & Qi, Y. (2023). PILLOW: Enhancing Efficient Instruction Fine-tuning via Prompt Matching. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track (pp. 471–482). Association for Computational Linguistics. https://doi.org/10.18653/v1/2023.emnlp-industry.45
