Terminologies and Methods in Large Language Models
Jan. 15, 2024
- LLM (Large Language Model)
- NLP (Natural Language Processing)
- NLU (Natural Language Understanding) or NLI (Natural Language Interpretation)
- NLG (Natural Language Generation)
- Token
- Tokens are the basic units of data processed by LLMs. In the context of text, a token can be a word, part of a word (a subword), or even a character, depending on the tokenization process (see the sketch below). The Building Blocks of LLMs: Vectors, Tokens and Embeddings.
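A minimal tokenization sketch in Python, assuming the Hugging Face `transformers` library is installed; the `gpt2` checkpoint is only an illustrative choice:

```python
# Minimal sketch: turning text into the subword tokens an LLM consumes.
# Assumes Hugging Face `transformers`; "gpt2" is an illustrative choice.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Tokenization splits text into subwords."
token_ids = tokenizer.encode(text)                    # integer token IDs
tokens = tokenizer.convert_ids_to_tokens(token_ids)   # the subword strings

print(tokens)     # e.g. ['Token', 'ization', ' splits', ' text', ...]
print(token_ids)  # the integer IDs the model actually processes
```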
- RL (Reinforcement Learning)
- BERT (Bidirectional Encoder Representations from Transformers)
- SFT (Supervised Fine-Tuning)
- RLHF (Reinforcement Learning from Human Feedback)
- ChatGPT: Training.
- RLHF + Reward Model + PPO on LLMs.
- RM (Reward Model); a minimal sketch of its pairwise training loss appears below.
- PPO (Proximal Policy Optimization)
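In RLHF, the reward model is typically trained on human preference pairs with a pairwise (Bradley-Terry style) ranking loss. A minimal PyTorch sketch, with placeholder scores standing in for a real reward model's scalar outputs:

```python
# Sketch of the pairwise ranking loss used to train an RLHF reward model.
# The reward scores below are placeholders for a reward model's outputs.
import torch
import torch.nn.functional as F

# r_chosen / r_rejected: scores for the human-preferred and dispreferred
# responses in each comparison pair.
r_chosen = torch.tensor([1.3, 0.2, 2.1])
r_rejected = torch.tensor([0.4, -0.1, 1.9])

# loss = -log sigmoid(r_chosen - r_rejected): pushes the model to score
# preferred responses higher than rejected ones.
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
print(loss.item())
```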
- SSF (Scaling & Shifting Your Features)
- AIGC (Artificial Intelligence Generated Content)
- AI-Generated Content and ChatGPT: A Complete Guide.
- By comparison to UGC (User-Generated Content), PGC (Professionally Generated Content), and OGC (Occupationally Generated Content): New perspective on UGC, PGC, and OGC: motivating factors of Chinese co-creators’ engagement with international television series.
- Hallucination in AI (or artificial hallucination, confabulation, delusion): In NLP, a hallucination is often defined as “generated content that appears factual but is ungrounded”. This term draws a loose analogy with human psychology (Hallucination), where a hallucination typically involves false percepts.
- ChatGPT.
- Why large language models like ChatGPT are bullshit artists.
- Hallucination (artificial intelligence).
- Tonmoy, S. M., et al. “A comprehensive survey of hallucination mitigation techniques in large language models.” arXiv preprint arXiv:2401.01313 (2024), available at: [2401.01313] A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models.
- AI copilot
- An AI copilot is a conversational interface that uses large language models (LLMs) to support users in various tasks and decision-making processes across multiple domains within an enterprise environment. By leveraging LLMs, AI copilots can understand, analyze, and process vast amounts of data. AI Copilots: What are they and how do they work?.
- AI copilots offer a new opportunity to reimagine the value that software can deliver to users. Unlike earlier chatbots, AI copilots use powerful LLMs like GPT-4 to answer questions, provide insights, and assist users across the application. They deeply understand how to use an application’s data and APIs and can be integrated across an application, both as a conversational copilot and as copilot-powered features and services. What is an AI copilot?.
- AI Alignment
- Alignment is the process of encoding human values and goals into large language models to make them as helpful, safe, and reliable as possible. Through alignment, enterprises can tailor AI models to follow their business rules and policies. What is AI alignment?.
- Evaluating AI alignment: If AI progress continues, AI systems will eventually possess highly dangerous capabilities. Before training and deploying such systems, we need methods to assess their propensity to use these capabilities. Purely behavioral evaluations may fail for advanced AI systems: Similar to humans, they might behave differently under evaluation, faking alignment. Managing extreme AI risks amid rapid progress, Yoshua Bengio, Geoffrey Hinton, and Andrew Yao et al. Managing extreme AI risks amid rapid progress.
- In the opposite direction, there is ChatGPT DAN (Do Anything Now) mode, a jailbreak prompt designed to test the limits of ChatGPT. How to Use ChatGPT Dan.
- Scarier than the hurricane itself: a single AI-generated image (比飓风更可怕的,是一张AI生成图片).
- LoRA (Low-Rank Adaptation)
- Low-Rank Adaptation, aka LoRA, is a technique for fine-tuning LLMs in a parameter-efficient way. Rather than fine-tuning the whole base model, which can be huge and cost a lot of time and money, LoRA adds a small number of trainable parameters while keeping the original model parameters frozen (see the sketch after this entry). A beginners guide to fine tuning LLM using LoRA.
- QLoRA: [2305.14314] QLoRA: Efficient Finetuning of Quantized LLMs.
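A minimal PyTorch sketch of the LoRA idea; the layer sizes and names here are illustrative, and the init and scaling conventions follow the LoRA paper:

```python
# LoRA sketch: the frozen base weight W stays fixed; only a low-rank
# update B @ A (rank r << d) is trained in its place.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)  # freeze original weights
        self.base.bias.requires_grad_(False)
        # Trainable low-rank factors: A projects down, B projects back up.
        # A is initialized with small Gaussians, B with zeros, so the
        # low-rank update starts at zero (as in the LoRA paper).
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # Base output plus the scaled low-rank correction.
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(768, 768)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # only the A and B factors are trainable
```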
- DeepSpeed: microsoft/DeepSpeed: DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
- Megatron-LM
- Megatron-LM serves as a research-oriented framework leveraging Megatron-Core for large language model (LLM) training. NVIDIA/Megatron-LM.
- FlashAttention
- Dao, Tri, et al. “FlashAttention: Fast and memory-efficient exact attention with IO-awareness.” Advances in Neural Information Processing Systems 35 (2022): 16344-16359, available at: Dao-AILab/flash-attention.
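FlashAttention computes exact attention without materializing the full attention matrix. A hedged usage sketch: rather than calling the flash-attn package directly, PyTorch 2.x's `torch.nn.functional.scaled_dot_product_attention` can dispatch to a FlashAttention kernel on supported GPUs (the backend actually chosen depends on hardware, dtype, and shapes):

```python
# Sketch: fused attention via PyTorch 2.x, which may dispatch to a
# FlashAttention kernel on supported GPUs; on other backends it falls
# back to a different (still exact) implementation.
import torch
import torch.nn.functional as F

batch, heads, seq_len, head_dim = 2, 8, 1024, 64
q = torch.randn(batch, heads, seq_len, head_dim)
k = torch.randn(batch, heads, seq_len, head_dim)
v = torch.randn(batch, heads, seq_len, head_dim)

# Exact attention, computed without materializing the full
# seq_len x seq_len attention matrix when a fused kernel is used.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # (batch, heads, seq_len, head_dim)
```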
- Direct Preference Optimization (DPO)
- Direct Preference Optimization (DPO) is a stable, performant, and computationally lightweight technique for aligning LLMs with a simple classification loss. DPO eliminates the need to sample from the LM during fine-tuning or to perform significant hyperparameter tuning (a minimal loss sketch follows below). Aligning LLMs with Direct Preference Optimization (DPO)— background, overview, intuition and paper summary.
- Rafailov, Rafael, et al. “Direct preference optimization: Your language model is secretly a reward model.” Advances in Neural Information Processing Systems 36 (2024), available at: [2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model.
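A minimal PyTorch sketch of the DPO classification loss from the paper above; the log-probabilities are placeholders for sequence log-probs under the tuned policy and the frozen reference model:

```python
# DPO loss sketch (Rafailov et al.):
# loss = -log sigmoid(beta * ((log pi(y_w|x) - log pi_ref(y_w|x))
#                           - (log pi(y_l|x) - log pi_ref(y_l|x))))
# The values below are placeholders for summed sequence log-probs.
import torch
import torch.nn.functional as F

beta = 0.1  # strength of the implicit KL constraint toward the reference

# Log-probs of the chosen (y_w) and rejected (y_l) responses under the
# policy being tuned and under the frozen reference model.
policy_chosen_logps = torch.tensor([-12.0, -8.5])
policy_rejected_logps = torch.tensor([-14.0, -9.0])
ref_chosen_logps = torch.tensor([-12.5, -8.8])
ref_rejected_logps = torch.tensor([-13.0, -8.9])

chosen_ratio = policy_chosen_logps - ref_chosen_logps
rejected_ratio = policy_rejected_logps - ref_rejected_logps
loss = -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()
print(loss.item())
```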