Procedural Story Generation for Visual Novels Using Large Language Models and Text-to-Image Techniques
Abstract
Story-based video games have content limitations. In a survey conducted among Genshin Impact players, 67% of respondents had reached the end of the available story, and exhausting the story can lead to repetitive gameplay. To increase the amount of story content in video games, this research uses Generative AI to produce procedural stories and measures the similarity of the generated stories with a sentence similarity model. Text content is generated with the LLM mistralai/Mixtral-8x7B-Instruct-v0.1, and image content with the Text-To-Image model stabilityai/stable-diffusion-xl-base-1.0. Story similarity is measured with the sentence similarity model sentence-transformers/all-MiniLM-L6-v2, using a similarity threshold of 73%. The visual novel application is implemented in Unity Engine version 2022.3.26f1, and the connection between Unity and the models is established through the REST API provided by the Hugging Face website. The result of this research is a visual novel game with Generative AI-made content, along with a measured uniqueness level of 36.667% for low usage and up to 60% for high usage. The reduction in uniqueness level is attributed to the recurrence of common genres, as well as the increasing number of comparisons required. This research shows how to implement Generative AI in a visual novel game, and shows that a Generative AI's ability to create unique stories decreases as game repetitions increase.
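The uniqueness check described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each story has already been embedded as a vector (e.g., by sentence-transformers/all-MiniLM-L6-v2 via the Hugging Face REST API), and the function names (`cosine_similarity`, `is_unique`) are hypothetical.

```python
import math

# Threshold from the paper: stories more than 73% similar are not unique.
SIMILARITY_THRESHOLD = 0.73

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def is_unique(new_embedding, accepted_embeddings, threshold=SIMILARITY_THRESHOLD):
    """A newly generated story counts as unique only if its embedding stays
    below the similarity threshold against every previously accepted story."""
    return all(cosine_similarity(new_embedding, e) < threshold
               for e in accepted_embeddings)
```

Under this scheme, each new story must be compared against every story accepted so far, which matches the paper's observation that the number of comparisons (and hence the chance of a match) grows with repeated play.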
Article Details

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication. The work is simultaneously licensed under a Creative Commons Attribution 4.0 International License (CC BY 4.0), which permits others to copy, distribute, remix, adapt, and build upon the work, even commercially, provided proper attribution is given to the original authors and the journal as the source of first publication.
- Authors may enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal’s published version of the work (e.g., posting to institutional repositories, inclusion in books), with an acknowledgment of its initial publication in this journal.
- Authors are encouraged to post their work online (e.g., in institutional repositories, personal websites, or preprint servers) prior to and during the submission process. This practice can foster productive exchanges and may lead to earlier and increased citation of the published work.
User Rights:
All articles published Open Access are immediately and permanently free for everyone to read and download. We continuously work with our author communities to select the best license options; for this journal, the license is currently defined as the Creative Commons Attribution 4.0 International License.
References
[1] C. Politowski, F. Petrillo, G. C. Ullmann, and Y. G. Guéhéneuc, “Game industry problems: An extensive analysis of the gray literature,” Inf. Softw. Technol., vol. 134, 2021, doi: 10.1016/j.infsof.2021.106538.
[2] G. Freeman, N. McNeese, J. Bardzell, and S. Bardzell, “‘Pro-amateur’-driven technological innovation: Participation and challenges in indie game development,” Proc. ACM Human-Computer Interact., vol. 4, no. GROUP, 2020, doi: 10.1145/3375184.
[3] Anna Fleck, “Americans’ Favorite Video Game Genres,” statista.com. Accessed: Jul. 18, 2024. [Online]. Available: https://www.statista.com/chart/24700/favored-video-game-genres-in-the-us/
[4] C. Anna, “Genshin Impact Needs to Address its Problem With Content Droughts,” gamerant.com. Accessed: Nov. 17, 2024. [Online]. Available: https://gamerant.com/genshin-impact-content-droughts/
[5] R. Xu, Y. Huang, X. Chen, and L. Zhang, “Specializing Small Language Models Towards Complex Style Transfer via Latent Attribute Pre-Training,” Front. Artif. Intell. Appl., vol. 372, no. 3, pp. 2802–2809, 2023, doi: 10.3233/FAIA230591.
[6] X. Peng, T. Liu, and Y. Wang, “Genshin: General Shield for Natural Language Processing with Large Language Models,” 2024, [Online]. Available: http://arxiv.org/abs/2405.18741
[7] M. Greting, X. Mao, and M. P. Eladhari, “What Inspires Retellings - A Study of the Game Genshin Impact,” Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), vol. 13762 LNCS, pp. 249–269, 2022, doi: 10.1007/978-3-031-22298-6_16.
[8] C. Li et al., “ChatHaruhi: Reviving Anime Character in Reality via Large Language Model,” 2023, [Online]. Available: http://arxiv.org/abs/2308.09597
[9] D. Ronaldo and C. H. Karjo, “A semantic study of ‘warfare’ video game terminologies,” AIP Conf. Proc., vol. 2594, no. January, 2023, doi: 10.1063/5.0109937.
[10] U. Mittal, S. Sai, V. Chamola, and Devika, “A Comprehensive Review on Generative AI for Education,” IEEE Access, no. August, 2024, doi: 10.1109/ACCESS.2024.3468368.
[11] J. Lee, S. Eom, and J. Lee, “Empowering Game Designers With Generative AI,” IADIS Int. J. Comput. Sci. Inf. Syst., vol. 18, no. 2, pp. 213–230, 2023, doi: 10.33965/ijcsis_2023180213.
[12] Y. Sun, Z. Li, K. Fang, C. H. Lee, and A. Asadipour, “Language as Reality: A Co-creative Storytelling Game Experience in 1001 Nights Using Generative AI,” Proc. - AAAI Artif. Intell. Interact. Digit. Entertain. Conf. AIIDE, vol. 19, no. 1, pp. 425–434, 2023, doi: 10.1609/aiide.v19i1.27539.
[13] M. Shidiq, “The Use of Artificial Intelligence-Based ChatGPT and Its Challenges for the World of Education; From the Viewpoint of the Development of Creative Writing Skills,” Soc. Humanit., vol. 01, no. 01, 2023.
[14] C. Galli, N. Donos, and E. Calciolari, “Performance of 4 Pre-Trained Sentence Transformer Models in the Semantic Query of a Systematic Review Dataset on Peri-Implantitis,” Inf., vol. 15, no. 2, 2024, doi: 10.3390/info15020068.
[15] R. Kurniawan and G. N. Windari, “Perancangan Game Visual Novel ‘The Adventure of Kabayan’ Sebagai Media Belajar Bahasa Inggris Untuk TOEFL,” J. Masy. Inform. Indones., vol. 4, no. 28, pp. 5–10, 2019.
[16] A. Bandi, P. V. S. R. Adapa, and Y. E. V. P. K. Kuchi, “The Power of Generative AI: A Review of Requirements, Models, Input–Output Formats, Evaluation Metrics, and Challenges,” Futur. Internet, vol. 15, no. 8, 2023, doi: 10.3390/fi15080260.
[17] F. Fui-Hoon Nah, R. Zheng, J. Cai, K. Siau, and L. Chen, “Generative AI and ChatGPT: Applications, challenges, and AI-human collaboration,” J. Inf. Technol. Case Appl. Res., vol. 25, no. 3, pp. 277–304, 2023, doi: 10.1080/15228053.2023.2233814.
[18] F. J. García-Peñalvo and A. Vázquez-Ingelmo, “What Do We Mean by GenAI? A Systematic Mapping of The Evolution, Trends, and Techniques Involved in Generative AI,” Int. J. Interact. Multimed. Artif. Intell., vol. 8, no. 4, pp. 7–16, 2023, doi: 10.9781/ijimai.2023.07.006.
[19] M. Muller, L. B. Chilton, A. Kantosalo, M. Lou Maher, C. P. Martin, and G. Walsh, “GenAICHI: Generative AI and HCI,” Conf. Hum. Factors Comput. Syst. - Proc., 2022, doi: 10.1145/3491101.3503719.
[20] M. Usman Hadi et al., “Large Language Models: A Comprehensive Survey of its Applications, Challenges, Limitations, and Future Prospects,” Authorea Prepr., 2024, [Online]. Available: https://www.authorea.com/users/618307/articles/682263-large-language-models-a-comprehensive-survey-of-its-applications-challenges-limitations-and-future-prospects
[21] C. Xu and J. McAuley, “A Survey on Model Compression and Acceleration for Pretrained Language Models,” Proc. 37th AAAI Conf. Artif. Intell. AAAI 2023, vol. 37, pp. 10566–10575, 2023, doi: 10.1609/aaai.v37i9.26255.
[22] M. Raghu, T. Unterthiner, S. Kornblith, C. Zhang, and A. Dosovitskiy, “Do Vision Transformers See Like Convolutional Neural Networks?,” Adv. Neural Inf. Process. Syst., vol. 15, no. NeurIPS, pp. 12116–12128, 2021.
[23] J. Maurício, I. Domingues, and J. Bernardino, “Comparing Vision Transformers and Convolutional Neural Networks for Image Classification: A Literature Review,” Appl. Sci., vol. 13, no. 9, 2023, doi: 10.3390/app13095521.
[24] L. Xu, W. Ouyang, M. Bennamoun, F. Boussaid, and D. Xu, “Multi-class Token Transformer for Weakly Supervised Semantic Segmentation,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 2022-June, pp. 4300–4309, 2022, doi: 10.1109/CVPR52688.2022.00427.
[25] T. Nayak and H. T. Ng, “Effective modeling of encoder-decoder architecture for joint entity and relation extraction,” AAAI 2020 - 34th AAAI Conf. Artif. Intell., pp. 8528–8535, 2020, doi: 10.1609/aaai.v34i05.6374.
[26] J. Ning et al., “All in Tokens: Unifying Output Space of Visual Tasks via Soft Token,” Proc. IEEE Int. Conf. Comput. Vis., pp. 19843–19853, 2023, doi: 10.1109/ICCV51070.2023.01822.
[27] M. S. Ryoo, A. J. Piergiovanni, A. Arnab, M. Dehghani, and A. Angelova, “TokenLearner: Adaptive Space-Time Tokenization for Videos,” Adv. Neural Inf. Process. Syst., vol. 16, no. NeurIPS, pp. 12786–12797, 2021.
[28] J. Seo, S. Lee, L. Liu, and W. Choi, “TA-SBERT: Token Attention Sentence-BERT for Improving Sentence Representation,” IEEE Access, vol. 10, pp. 39119–39128, 2022, doi: 10.1109/ACCESS.2022.3164769.
[29] A. Gao, Prompt Engineering for Large Language Models. Springer Nature Singapore, 2023. doi: 10.2139/ssrn.4504303.
[30] S. Lim and R. Schmälzle, “Artificial intelligence for health message generation: an empirical study using a large language model (LLM) and prompt engineering,” Front. Commun., vol. 8, 2023, doi: 10.3389/fcomm.2023.1129082.