Generative AI on AWS
Building Context-Aware Multimodal Reasoning Applications
Paperback · English · 2023 · 1st edition · ISBN 9781098159221
Summary
Companies today are moving rapidly to integrate generative AI into their products and services. But there's a great deal of hype (and misunderstanding) about the impact and promise of this technology. With this book, Chris Fregly, Antje Barth, and Shelbee Eigenbrode from AWS help CTOs, ML practitioners, application developers, business analysts, data engineers, and data scientists find practical ways to use this exciting new technology.
You'll learn the generative AI project life cycle, including use case definition, model selection, model fine-tuning, retrieval-augmented generation, reinforcement learning from human feedback, and model quantization, optimization, and deployment. You'll also explore different types of models, including large language models (LLMs) and multimodal models such as Stable Diffusion for generating images and Flamingo/IDEFICS for answering questions about images.
- Apply generative AI to your business use cases
- Determine which generative AI models are best suited to your task
- Perform prompt engineering and in-context learning
- Fine-tune generative AI models on your datasets with low-rank adaptation (LoRA)
- Align generative AI models to human values with reinforcement learning from human feedback (RLHF)
- Augment your model with retrieval-augmented generation (RAG)
- Explore LangChain and the ReAct framework to develop agents and actions
- Build generative AI applications with Amazon Bedrock (see the sketch after this list)
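To give a flavor of the last item, here is a minimal sketch of prompting a text model on Amazon Bedrock from Python with boto3. The model ID (amazon.titan-text-express-v1), the request-body fields, and the response parsing are assumptions based on Bedrock's Titan text models, not code taken from the book; the exact JSON schema varies by model provider.

```python
# Minimal sketch: prompting a text model on Amazon Bedrock with boto3.
# Assumptions: the "bedrock-runtime" client, the Titan model ID, and the
# request/response JSON fields shown here; check the Bedrock documentation
# for the exact schema of the model you choose.
import json
import boto3

bedrock_runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

prompt = "Summarize the benefits of retrieval-augmented generation in two sentences."

body = json.dumps({
    "inputText": prompt,                    # prompt text for Titan text models
    "textGenerationConfig": {
        "maxTokenCount": 256,               # cap on generated tokens
        "temperature": 0.5,                 # lower = more deterministic
        "topP": 0.9,
    },
})

response = bedrock_runtime.invoke_model(
    modelId="amazon.titan-text-express-v1", # assumed model ID
    body=body,
    contentType="application/json",
    accept="application/json",
)

result = json.loads(response["body"].read())
print(result["results"][0]["outputText"])   # Titan returns a list of generation results
```

The same invoke_model pattern extends to the other Bedrock models the book covers (for example, Stable Diffusion for image generation), with a different modelId and a provider-specific body schema.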
Table of Contents
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgments
Chris
Antje
Shelbee
1. Generative AI Use Cases, Fundamentals, and Project Life Cycle
Use Cases and Tasks
Foundation Models and Model Hubs
Generative AI Project Life Cycle
Generative AI on AWS
Why Generative AI on AWS?
Building Generative AI Applications on AWS
Summary
2. Prompt Engineering and In-Context Learning
Prompts and Completions
Tokens
Prompt Engineering
Prompt Structure
Instruction
Context
In-Context Learning with Few-Shot Inference
Zero-Shot Inference
One-Shot Inference
Few-Shot Inference
In-Context Learning Gone Wrong
In-Context Learning Best Practices
Prompt-Engineering Best Practices
Inference Configuration Parameters
Summary
3. Large-Language Foundation Models
Large-Language Foundation Models
Tokenizers
Embedding Vectors
Transformer Architecture
Inputs and Context Window
Embedding Layer
Encoder
Self-Attention
Decoder
Softmax Output
Types of Transformer-Based Foundation Models
Pretraining Datasets
Scaling Laws
Compute-Optimal Models
Summary
4. Memory and Compute Optimizations
Memory Challenges
Data Types and Numerical Precision
Quantization
fp16
bfloat16
fp8
int8
Optimizing the Self-Attention Layers
FlashAttention
Grouped-Query Attention
Distributed Computing
Distributed Data Parallel
Fully Sharded Data Parallel
Performance Comparison of FSDP over DDP
Distributed Computing on AWS
Fully Sharded Data Parallel with Amazon SageMaker
AWS Neuron SDK and AWS Trainium
Summary
5. Fine-Tuning and Evaluation
Instruction Fine-Tuning
Llama 2-Chat
Falcon-Chat
FLAN-T5
Instruction Dataset
Multitask Instruction Dataset
FLAN: Example Multitask Instruction Dataset
Prompt Template
Convert a Custom Dataset into an Instruction Dataset
Instruction Fine-Tuning
Amazon SageMaker Studio
Amazon SageMaker JumpStart
Amazon SageMaker Estimator for Hugging Face
Evaluation
Evaluation Metrics
Benchmarks and Datasets
Summary
6. Parameter-Efficient Fine-Tuning
Full Fine-Tuning Versus PEFT
LoRA and QLoRA
LoRA Fundamentals
Rank
Target Modules and Layers
Applying LoRA
Merging LoRA Adapter with Original Model
Maintaining Separate LoRA Adapters
Full Fine-Tuning Versus LoRA Performance
QLoRA
Prompt Tuning and Soft Prompts
Summary
7. Fine-Tuning with Reinforcement Learning from Human Feedback
Human Alignment: Helpful, Honest, and Harmless
Reinforcement Learning Overview
Train a Custom Reward Model
Collect Training Dataset with Human-in-the-Loop
Sample Instructions for Human Labelers
Using Amazon SageMaker Ground Truth for Human Annotations
Prepare Ranking Data to Train a Reward Model
Train the Reward Model
Existing Reward Model: Toxicity Detector by Meta
Fine-Tune with Reinforcement Learning from Human Feedback
Using the Reward Model with RLHF
Proximal Policy Optimization RL Algorithm
Perform RLHF Fine-Tuning with PPO
Mitigate Reward Hacking
Using Parameter-Efficient Fine-Tuning with RLHF
Evaluate RLHF Fine-Tuned Model
Qualitative Evaluation
Quantitative Evaluation
Load Evaluation Model
Define Evaluation-Metric Aggregation Function
Compare Evaluation Metrics Before and After
Summary
8. Model Deployment Optimizations
Model Optimizations for Inference
Pruning
Post-Training Quantization with GPTQ
Distillation
Large Model Inference Container
AWS Inferentia: Purpose-Built Hardware for Inference
Model Update and Deployment Strategies
A/B Testing
Shadow Deployment
Metrics and Monitoring
Autoscaling
Autoscaling Policies
Define an Autoscaling Policy
Summary
9. Context-Aware Reasoning Applications Using RAG and Agents
Large Language Model Limitations
Hallucination
Knowledge Cutoff
Retrieval-Augmented Generation
External Sources of Knowledge
RAG Workflow
Document Loading
Chunking
Document Retrieval and Reranking
Prompt Augmentation
RAG Orchestration and Implementation
Document Loading and Chunking
Embedding Vector Store and Retrieval
Retrieval Chains
Reranking with Maximum Marginal Relevance
Agents
ReAct Framework
Program-Aided Language Framework
Generative AI Applications
FMOps: Operationalizing the Generative AI Project Life Cycle
Experimentation Considerations
Development Considerations
Production Deployment Considerations
Summary
10. Multimodal Foundation Models
Use Cases
Multimodal Prompt Engineering Best Practices
Image Generation and Enhancement
Image Generation
Image Editing and Enhancement
Inpainting, Outpainting, and Depth-to-Image
Inpainting
Outpainting
Depth-to-Image
Image Captioning and Visual Question Answering
Image Captioning
Content Moderation
Visual Question Answering
Model Evaluation
Text-to-Image Generative Tasks
Forward Diffusion
Nonverbal Reasoning
Diffusion Architecture Fundamentals
Forward Diffusion
Reverse Diffusion
U-Net
Stable Diffusion 2 Architecture
Text Encoder
U-Net and Diffusion Process
Text Conditioning
Cross-Attention
Scheduler
Image Decoder
Stable Diffusion XL Architecture
U-Net and Cross-Attention
Refiner
Conditioning
Summary
11. Controlled Generation and Fine-Tuning with Stable Diffusion
ControlNet
Fine-Tuning
DreamBooth
DreamBooth and PEFT-LoRA
Textual Inversion
Human Alignment with Reinforcement Learning from Human Feedback
Summary
12. Amazon Bedrock: Managed Service for Generative AI
Bedrock Foundation Models
Amazon Titan Foundation Models
Stable Diffusion Foundation Models from Stability AI
Bedrock Inference APIs
Large Language Models
Generate SQL Code
Summarize Text
Embeddings
Fine-Tuning
Agents
Multimodal Models
Create Images from Text
Create Images from Images
Data Privacy and Network Security
Governance and Monitoring
Summary
Index
About the Authors