问 HN:如何在另一个 AI 上训练 AI?
DeepSeek通过使用Claude的输出作为训练数据来提升自身模型的推理能力。这一过程涉及将其他模型的响应作为监督信号,用于优化自身的参数。具体工程方法包括数据收集、预处理及模型微调等步骤。此训练方式引发了关于模型间数据共享与潜在违规行为的争议。
2 分•作者: timonpimba•大约 2 小时前
DeepSeek实际上是如何做到这一点的?他们只是将Claude的回答作为自己的模型训练数据,以提升推理能力吗?如何确切地在一个模型上训练其他模型的输出?这其中涉及哪些工程方法?<p>我希望能看到大规模执行这一过程的详细拆解。<p>背景:<p>Anthropic最近指责DeepSeek、Minimax、Moonshot使用大量虚假账户与Claude进行交流,利用其输出来训练模型,并称之为"蒸馏攻击"。
查看原文
How is Deepseek actually doing this? Are they just feeding claude's answers into their own models as their own model as training data to improve reasoning?
How exactly one train it's model on output of other? what's enginnering inovlved here?<p>I'd love breakdown of how thsi is executed at scale.<p>Backstory:<p>Anthropic recently accused Deepseek,Minimax,Moonshot of using lots of fake accounts to generate exchanges with claude, using the outputs to train the model and called it "distillation attack".