Alex Chen’s Blog - Research notes

Optimization for Inference of Large Language Models

Large Language Models

To run language models faster, especially on edge devices, we need to optimize the model. This…

Optimization in machine learning

Math Theories

In this paper, we give a thorough overview of the optimizers used in machine learning. Specifically, the PyTorch…

Large language model distributed training

Large Language Models

AWS SageMaker is a service that supports automated model training. The price is 1.5x of…

Complex analysis for machine learning

Math Theories

Real functional analysis is used widely in ML. There are also cases where complex analysis is…

Large language model evaluation

Large Language Models

Today, the landscape of large language models (LLMs) is rich with diverse evaluation benchmarks. In this…

Mixture of experts

Large Language Models

MoE stands for mixture of experts. In this blog, we will introduce the type of MoE used in the Mixtral 8x7B mode…

Scalable diffusion models with transformers

Diffusion Model

A text-to-image generation model based on the diffusion architecture.

Reinforcement learning for large language models

Large Language Models

Reinforcement learning is a common technique that can be applied to large language models.