Alex Chen’s Blog
Research notes
Personal summaries and insights gathered from reading various research papers and articles.
Optimization for Inference of Large Language Model
Large Language Models
To run a language model faster, especially on edge devices, we need to optimize the model. This…
5 min
Optimization in machine learning
Math Theories
In this paper, we will give a thorough overview of the optimizers used in machine learning. Specifically, the PyTorch…
13 min
Large language model distributed training
Large Language Models
AWS SageMaker is a service that supports automated model training. Its price is 1.5x of…
5 min
Complex analysis for machine learning
Math Theories
Real functional analysis is used a lot in ML. There are also cases where complex analysis is…
5 min
Large language model evaluation
Large Language Models
Today, the landscape of large language models (LLMs) is rich with diverse evaluation benchmarks. In this…
11 min
Mixture of expert
Large Language Models
MoE stands for mixture of experts. In this blog, we will introduce the type of MoE used in the mixtral8x7B mode…
5 min
Scalable diffusion models with transformers
Diffusion Model
A text-to-image generation model based on the diffusion architecture.
1 min
Reinforcement learning for large language model
Large Language Models
Reinforcement learning is a common technique that can be applied to large language models.
19 min