Hacker News (Chinese edition)


How is DeepSeek actually doing this? Are they just feeding Claude's answers into their own models as training data to improve reasoning? How exactly does one train a model on the output of another? What engineering is involved here? I'd love a breakdown of how this is executed at scale.

Backstory: Anthropic recently accused DeepSeek, MiniMax, and Moonshot of using lots of fake accounts to generate exchanges with Claude, using the outputs to train their models, and called it a "distillation attack".

Ask HN: How do you train an AI on another AI?