“The DeepSeek team cracked cheap long context for LLMs: a ~3.5x cheaper prefill and ~10x cheaper decode at 128k context at inference with the same quality,” said Deedy Das, partner at Menlo Ventures, ...