DeepSeek-R1

DeepSeek-R1 is a reasoning model built on large-scale reinforcement learning (RL). Its precursor, DeepSeek-R1-Zero, was trained with RL alone, without supervised fine-tuning (SFT) as a preliminary step, and developed strong reasoning behaviors such as self-verification and reflection. DeepSeek-R1 incorporates cold-start data before RL to address the endless repetition and poor readability seen in DeepSeek-R1-Zero, and achieves performance comparable to OpenAI-o1 across math, code, and reasoning tasks.

Keywords

DeepSeek-R1, Reasoning Model, Open Source Large Language Model, Open Source LLM, Open Source, Reinforcement Learning, Supervised Fine-Tuning, Language Model, Code Generation, Mathematical Reasoning, Machine Learning, Natural Language Processing