Best Practices for Generative AI with LLMs on a Supercomputer

Date:

Generative AI with LLMs refers to the use of large language models such as GPT-3 to generate human-like content spanning text, images, and even code. LLMs are trained on vast amounts of data and code, and are usually carefully prompt-engineered or fine-tuned to suit specific downstream tasks such as chatbots, translation, question answering, and summarization. The contents and Python code of this seminar originate mainly from the 16-hour "Generative AI with LLMs" course offered by DeepLearning.AI.

This talk will cover the key concepts and practices of a typical LLM-powered generative AI lifecycle, from data gathering and model selection, through instruction fine-tuning and RLHF-based alignment with human preferences, to performance evaluation and deployment. I will give a short demo showing how users can create a conda virtual environment on the KISTI Neuron cluster with 260 GPUs, launch a Jupyter server on a compute node, and access it from their own PC or laptop for generative AI practice on a supercomputer. The demo will illustrate LLM practices including prompting and prompt engineering, instruction fine-tuning and parameter-efficient fine-tuning (PEFT) with LoRA, and LLM evaluation and benchmarking.
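To give a flavor of the PEFT technique covered in the demo, the sketch below illustrates the core idea of LoRA in plain NumPy, independent of any particular framework: the pretrained weight matrix is frozen, and only a low-rank update is trained. The matrix sizes and scaling here are hypothetical choices for illustration, not the configuration used in the actual seminar notebooks.

```python
import numpy as np

# LoRA (Low-Rank Adaptation) freezes the pretrained weight W and learns a
# low-rank update B @ A, so only r * (d_in + d_out) parameters are trained
# instead of d_in * d_out.  Sizes below are illustrative, not from the course.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 64, 64, 4, 8   # rank r << d; alpha is a scale factor

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero init

def lora_forward(x):
    # Adapted layer: (W + (alpha / r) * B @ A) @ x.
    # With B initialized to zero, this reproduces the frozen model exactly.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
assert np.allclose(lora_forward(x), W @ x)  # identical before any training

# Trainable-parameter savings: full fine-tuning vs. the LoRA adapter.
full_params = d_in * d_out          # 64 * 64 = 4096
lora_params = r * (d_in + d_out)    # 4 * (64 + 64) = 512
print(f"full fine-tune: {full_params} params, LoRA adapter: {lora_params} params")
```

In practice the demo would use a library such as Hugging Face PEFT to attach adapters like this to the attention layers of a pretrained transformer, but the arithmetic above is the whole trick: the fraction of trainable parameters shrinks from 100% to a few percent or less.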