Udemy - LLM Reinforcement Learning Fine-Tuning DeepSeek Method GRPO

Added 4 months ago by freecoursewb in Other

Files

Udemy - LLM Reinforcement Learning Fine-Tuning DeepSeek Method GRPO (Size: 1.8 GB)

		Bonus Resources.txt	102.4 B
		Get Bonus Downloads Here.url	204.8 B
		~Get Your Files Here !
		1 - Introduction
		1. Introduction.mp4	11.4 MB
		2. Course Content Introduction.mp4	47.7 MB
		3. Jupyter Notebooks.html	5.4 KB
		Notebooks 2
		Bolum_(Section)_1.ipynb	465.1 KB
		Bolum_(Section)_3_DPO.ipynb	259.4 KB
		Bolum_(Section)_4_GRPO_.ipynb	624.2 KB
		Bolum_(Section)__2.ipynb	207.9 KB
		DS_Store	6 KB
		Quantization.ipynb	81.9 KB
		Thinking__(REASONING)_model.ipynb	54.8 KB
		__MACOSX
		Notebooks 2
		2 - Quantization, LoRA, SFT, Data Collator, Data Preparation…
		10. Preparing Dataset, Chat Template, and Integrating Custom Tokens.en_US.srt	13.3 KB
		10. Preparing Dataset, Chat Template, and Integrating Custom Tokens.mp4	145.9 MB
		11. Continuing Dataset Preparation and Tokenization.en_US.srt	5.6 KB
		11. Continuing Dataset Preparation and Tokenization.mp4	47 MB
		12. What is a Data Collator How Does It Work Practical Example.en_US.srt	9.1 KB
		12. What is a Data Collator How Does It Work Practical Example.mp4	84.6 MB
		13. What is LoRA Why Use It.en_US.srt	3.4 KB
		13. What is LoRA Why Use It.mp4	17 MB
		14. Integrating LoRA Matrices into the Model.en_US.srt	7.6 KB
		14. Integrating LoRA Matrices into the Model.mp4	37.6 MB
		15. Setting Training Arguments (Training Hyperparameters).en_US.srt	9.8 KB
		15. Setting Training Arguments (Training Hyperparameters).mp4	32.1 MB
		16. Setting Trainer, Starting Training, and Evaluating Results.en_US.srt	3.9 KB
		16. Setting Trainer, Starting Training, and Evaluating Results.mp4	21.4 MB
		17. Merging Trained LoRA Matrices with the Model.en_US.srt	6.8 KB
		17. Merging Trained LoRA Matrices with the Model.mp4	51 MB
		18. Uploading Model on Hugging Face and Using it.en_US.srt	5.7 KB
		18. Uploading Model on Hugging Face and Using it.mp4	49.4 MB
		19. Hyperparameters Affecting the Outputs.en_US.srt	6.5 KB
		19. Hyperparameters Affecting the Outputs.mp4	30.3 MB
		3 - Adding New Tokens and Creating Templates for the Tokenizer
		20. Bolum_(Section)__2.ipynb.bin	207.9 KB
		20. Download the Model and Tokenizer.en_US.srt	4.6 KB
		20. Download the Model and Tokenizer.mp4	37 MB
		21. Adding New Custom Tokens to the Tokenizer.en_US.srt	8 KB
		21. Adding New Custom Tokens to the Tokenizer.mp4	30.9 MB
		22. Creating Templates with New Custom Tokens and Integrating Them into the Dataset.en_US.srt	7.7 KB
		22. Creating Templates with New Custom Tokens and Integrating Them into the Dataset.mp4	28.7 MB
		4 - DPO (Direct Preference Optimization)
		23. Bolum_(Section)_3_DPO.ipynb.bin	259.4 KB
		23. What is DPO What Data Format Does It Expect.en_US.srt	7.5 KB
		23. What is DPO What Data Format Does It Expect.mp4	43.4 MB
		24. Bolum_(Section)_3_DPO.ipynb.bin	259.4 KB
		24. Downloading Model & Understanding How the DPO Data Collator do Padding.en_US.srt	7.1 KB
		24. Downloading Model & Understanding How the DPO Data Collator do Padding.mp4	45.4 MB
		25. Preparing the Dataset for DPO.en_US.srt	10.9 KB
		25. Preparing the Dataset for DPO.mp4	84.4 MB
		26. Adding LoRA Matrices to the Model.en_US.srt	3.8 KB
		26. Adding LoRA Matrices to the Model.mp4	19.1 MB
		27. Setting Training Arguments (with DPOConfig).en_US.srt	5.4 KB
		27. Setting Training Arguments (with DPOConfig).mp4	13.3 MB
		28. Training the Model and Merging the LoRA Matrices.en_US.srt	6.9 KB
		28. Training the Model and Merging the LoRA Matrices.mp4	49.7 MB
		5 - GRPO (Group Relative Policy Optimization) Reinforcement Learning
		29. Bolum_(Section)_4_GRPO_.ipynb.bin	624.2 KB
		29. Thinking__(REASONING)_model.ipynb.bin	54.8 KB
		29. What is a “Reasoning” Model How Does It Work.en_US.srt	5 KB
		29. What is a “Reasoning” Model How Does It Work.mp4	56.5 MB
		30. What is GRPO How Is It Applied.en_US.srt	4.9 KB
		30. What is GRPO How Is It Applied.mp4	21.4 MB
		31. Bolum_(Section)_4_GRPO_.ipynb.bin	624.2 KB
		31. What are Unsloth and VLLM + Download the Model.en_US.srt	6.9 KB
		31. What are Unsloth and VLLM + Download the Model.mp4	62.7 MB
		32. Examining the Dataset and Initial Preparation Steps.en_US.srt	7.6 KB
		32. Examining the Dataset and Initial Preparation Steps.mp4	54 MB
		33. Extracting Specific Parts of Data Regex and Group Operations.en_US.srt	13.5 KB
		33. Extracting Specific Parts of Data Regex and Group Operations.mp4	49.5 MB
		34. In Which Format is Data Sent to Reward Functions.en_US.srt	7 KB
		34. In Which Format is Data Sent to Reward Functions.mp4	88.9 MB
		35. 1st Reward Function.en_US.srt	13.1 KB
		35. 1st Reward Function.mp4	64.4 MB
		36. 2nd Reward Function.en_US.srt	12.3 KB
		36. 2nd Reward Function.mp4	73.2 MB
		37. 3rd Reward Function.en_US.srt	11.1 KB
		37. 3rd Reward Function.mp4	77.9 MB
		38. 4th Reward Function.en_US.srt	7.2 KB
		38. 4th Reward Function.mp4	26.6 MB
		39. Training Hyperparameters (with GRPO Config).en_US.srt	8.3 KB
		39. Training Hyperparameters (with GRPO Config).mp4	61.3 MB
		40. Trainer Object and Training Process.en_US.srt	2.6 KB
		40. Trainer Object and Training Process.mp4	12 MB
		41. Results Table Rewards and Sample Outputs.en_US.srt	4.3 KB
		41. Results Table Rewards and Sample Outputs.mp4	78.4 MB
		42. BONUS_New_GRPO_Notebook.html	7.1 KB
		42. SFT_GRPO_Training.ipynb.bin	10 MB
		6 - BONUS_New_GRPO_Notebook
		43. BONUS_New_GRPO_Notebook.html	7.1 KB
		43. SFT_GRPO_Training.ipynb.bin	10 MB
		4. Quantization.ipynb.bin	81.9 KB
		4. What is Quantization How does it affect model size and parameters.en_US.srt	4.9 KB
		4. What is Quantization How does it affect model size and parameters.mp4	40.2 MB
		5. Create a Hugging Face Account and Get a Token.en_US.srt	5 KB
		5. Create a Hugging Face Account and Get a Token.mp4	35.1 MB
		6. Create a Colab Notebook and Get Familiar with the Libraries.en_US.srt	4.7 KB
		6. Create a Colab Notebook and Get Familiar with the Libraries.mp4	14.7 MB
		7. Bolum_(Section)_1.ipynb.bin	465.1 KB
		7. Download the Model with Quantization.en_US.srt	6.8 KB
		7. Download the Model with Quantization.mp4	27.5 MB
		8. Bolum_(Section)_1.ipynb.bin	465 KB
		8. Differences Between Base and Instruct Models.en_US.srt	8.5 KB
		8. Differences Between Base and Instruct Models.mp4	78 MB
		9. Download and Examine the Dataset.en_US.srt	4.7 KB
		9. Download and Examine the Dataset.mp4	18.9 MB
		_.DS_Store	102.4 B
		_Bolum_(Section)_1.ipynb	716.8 B
		_Bolum_(Section)_3_DPO.ipynb	409.6 B
		_Bolum_(Section)_4_GRPO_.ipynb	512 B
		_Bolum_(Section)__2.ipynb	204.8 B
		_Quantization.ipynb	409.6 B
		_Thinking__(REASONING)_model.ipynb	204.8 B

Description

LLM Reinforcement Learning Fine-Tuning DeepSeek Method GRPO


Last updated 6/2025
MP4 | Video: H.264, 1920x1080 | Audio: AAC, 44.1 kHz, 2 Ch
Language: English | Duration: 3h 45m | Size: 1.85 GB

[EN] LLM Fine-Tuning and Reinforcement Learning with SFT, LoRA, DPO, and GRPO Custom Data HuggingFace

What you'll learn
You will grasp the core principles of Large Language Models (LLMs) and the overall structure behind their training processes.
You will learn the differences between base models and instruct models, as well as the methods for preparing data for each.
You’ll learn data preprocessing techniques along with essential tips, how to identify special tokens required by models, understanding data formats, and methods
You’ll gain practical, hands-on experience and detailed knowledge of how LoRA and Data Collator work.
You’ll gain a detailed understanding of crucial hyperparameters used in training, including their purpose and how they function.
You’ll practically learn, in detail, how trained LoRA matrices are merged with the base model, as well as key considerations and best practices to follow during
You’ll learn what Direct Preference Optimization (DPO) is, how it works, the expected data format, and the specific scenarios in which it’s used.
You’ll learn key considerations when preparing data for DPO, as well as understanding how the DPO data collator functions.
You’ll learn about the specific hyperparameters used in DPO training, their roles, and how they function.
You’ll learn how to upload your trained model to platforms like Hugging Face and manage hyperparameters effectively after training.
You’ll learn in detail how Group Relative Policy Optimization (GRPO), a reinforcement learning method, works, including an in-depth understanding of its learnin
You’ll learn how to prepare data specifically for Group Relative Policy Optimization (GRPO).
You’ll learn how to create reward functions—the most critical aspect of Group Relative Policy Optimization (GRPO)—through various practical reward function exam
You’ll learn in detail the format in which data is passed to GRPO reward functions and how to process that data inside the functions.
You’ll learn how to define rewards within functions and establish clear reward templates for GRPO.
You’ll practically learn numerous details, such as extracting reward-worthy parts from raw responses and defining rewards based on these extracted segments.
You’ll learn how to transform an Instruct model into one capable of generating “Chain of Thought” reasoning through GRPO (Group Relative Policy Optimization).
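
The GRPO items above (extracting the reward-worthy part of a response, defining rewards on it, and comparing responses within a group) can be illustrated with a minimal sketch. This is not the course's code: the <think>/<answer> template, the reward values 0.5 and 2.0, and all function names are assumptions chosen for illustration. The advantage step uses the common "subtract the group mean, divide by the group standard deviation" normalization; real implementations differ in details such as which standard deviation estimator they use.

```python
import re
from statistics import mean, pstdev

# Hypothetical output template (an assumption, not taken from the course):
# reasoning inside <think> tags, final answer inside <answer> tags.
ANSWER_RE = re.compile(r"<think>.*?</think>\s*<answer>(.*?)</answer>", re.DOTALL)


def extract_answer(completion):
    """Return the text inside <answer>...</answer>, or None if the template is not followed."""
    match = ANSWER_RE.search(completion)
    return match.group(1).strip() if match else None


def format_reward(completions):
    """Give 0.5 to each completion that follows the template, else 0.0 (illustrative values)."""
    return [0.5 if extract_answer(c) is not None else 0.0 for c in completions]


def correctness_reward(completions, reference):
    """Give 2.0 to each completion whose extracted answer equals the reference, else 0.0."""
    return [2.0 if extract_answer(c) == reference else 0.0 for c in completions]


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions sampled for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, libraries may differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# One group: three sampled completions for the same prompt ("What is 2 + 2?").
group = [
    "<think>2 + 2 is 4</think><answer>4</answer>",  # correct and well formatted
    "<think>maybe 5</think><answer>5</answer>",     # well formatted, wrong answer
    "The answer is 4",                              # ignores the template
]

# Sum the individual reward functions per completion, then compare within the group.
totals = [f + c for f, c in zip(format_reward(group), correctness_reward(group, "4"))]
advantages = group_relative_advantages(totals)

for text, total, adv in zip(group, totals, advantages):
    print(f"reward={total:.1f} advantage={adv:+.2f} :: {text}")
```

Completions that score above their group's average get a positive advantage and those below get a negative one, which is why no separate value model is needed to rank them. In an actual training setup the reward functions would follow the signature the chosen trainer library expects, which is not shown here.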

Requirements
Basic knowledge of Python programming.
Introductory-level familiarity with artificial intelligence and machine learning concepts.
Ideally, prior experience with Jupyter Notebook or Google Colab.
