Generative artificial intelligence (AI) technology has achieved remarkable breakthroughs and advances in its intelligence level since the release of ChatGPT several months ago, especially in terms of its scope, automation, and intelligence. The rising popularity of generative AI attracts capital inflows and promotes the innovation of various fields. Moreover, governments worldwide pay considerable attention to generative AI and hold different attitudes toward it. The US government maintains a relatively relaxed attitude to stay ahead in the global technological arena, while European countries are conservative and are concerned about data privacy in large language models (LLMs). The Chinese government attaches great importance to AI and LLMs but also emphasizes the regulatory issues. With the growing influence of ChatGPT and its competitors and the rapid development of generative AI technology, conducting a deep analysis of them becomes necessary. This paper first provides an in-depth analysis of the development, application, and prospects of generative AI. Various types of LLMs have emerged as a series of remarkable technological products that have demonstrated versatile capabilities across multiple domains, such as education, medicine, finance, law, programming, and paper writing. These models are usually fine-tuned on the basis of general LLMs, with the aim of endowing the large models with additional domain-specific knowledge and enhanced adaptability to a specific domain. LLMs (e.g., GPT-4) have achieved rapid improvements in the past few months in terms of professional knowledge, reasoning, coding, credibility, security, transferability, and multimodality. Then, the technical contribution of generative AI technology is briefly introduced from four aspects: 1) We review the related work on LLMs, such as GPT-4, PaLM2, ERNIE Bot, and their construction pipeline, which involves the training of base and assistant models.
The base models store a large amount of linguistic knowledge, while the assistant models acquire stronger comprehension and generation capabilities after a series of fine-tuning steps. 2) We outline a series of public LLMs based on LLaMA, a framework for building lightweight and memory-efficient LLMs, including Alpaca, Vicuna, Koala, and Baize, as well as the key technologies for building LLMs with low memory and computation requirements, namely low-rank adaptation, Self-instruct, and automatic prompt engineer. 3) We summarize three types of existing mainstream image–text multimodal techniques: training additional adaptation layers to align visual modules and language models, multimodal instruction fine-tuning, and using the LLM as the center of understanding. 4) We introduce three types of LLM evaluation benchmarks based on different implementation methods, namely, manual evaluation, automatic evaluation, and LLM evaluation. Parameter optimization and fine-tuning dataset construction are crucial for the popularization and innovation of generative AI products because they can significantly reduce the training cost and computational resource consumption of LLMs while enhancing the diversity and generalization ability of LLMs. Multimodal capability is the future trend of generative AI because multimodal models can integrate information from multiple perceptual dimensions, which is consistent with human cognition. Evaluation benchmarks are the key methods to compare and constrain the models of generative AI, given that they can efficiently measure and optimize the performance and generalization ability of LLMs and reveal their strengths and limitations. In conclusion, improving parameter optimization, high-quality dataset construction, multimodal techniques, and other technologies and establishing a unified, comprehensive, and convenient evaluation benchmark will be the key to achieving further development in generative AI.
Furthermore, the current challenges and possible future directions of the related technologies are discussed in this paper. Existing generative AI products have considerable creativity, understanding, and intelligence and have shown broad application prospects in various fields, such as empowering content creation, innovating interactive experience, creating “digital life,” serving as smart home and family assistants, and realizing autonomous driving and intelligent car interaction. However, LLMs still exhibit some limitations, such as a lack of high-quality training data, susceptibility to hallucinations, factual errors in their output, uninterpretability, high training and deployment costs, and security and privacy issues. Therefore, the potential research directions can be divided into three aspects: 1) The data aspect focuses on the input and output of LLMs, including the construction of general instruction-tuning datasets and domain-specific knowledge datasets. 2) The technical aspect improves the internal structure and function of LLMs, including the training, multimodality, principle innovation, and structure pruning of LLMs. 3) The application aspect enhances the practical effect and application value of LLMs, including security enhancement, evaluation system development, and LLM application engineering implementation. The advancement of generative AI has provided remarkable benefits for economic development. However, it also entails new opportunities and challenges for various stakeholders, especially the industry and the general public. On the one hand, the industry needs to foster a large pool of researchers who can conduct systematic and cutting-edge research on generative AI technologies, which are constantly improving and innovating. On the other hand, the general public needs to acquire and apply the skills of prompt engineering, which can enable them to utilize existing LLMs effectively and efficiently.
CITATION STYLE
Yan, H., Liu, Y., Jin, L., & Bai, X. (2023). The development, application, and future of LLM similar to ChatGPT. Journal of Image and Graphics, 28(9), 2749–2762. https://doi.org/10.11834/jig.230536