> saying it is 30x faster than the Hopper generation when inferencing a 1.8 trillion parameter mixture-of-experts model.
Is that the size of GPT-4? The upcoming systems seem designed to fit GPT-4 on a single system with 4-bit quantisation, making GPT-4 cheaper to serve.
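A back-of-envelope check of that claim, using the rumoured (unconfirmed) 1.8-trillion-parameter figure and counting weights only, ignoring KV cache and activations:

```python
# Illustrative estimate only: the parameter count is a rumour,
# and this counts model weights alone (no KV cache, no activations).
params = 1.8e12          # rumoured GPT-4 parameter count
bytes_per_param = 0.5    # 4-bit quantisation = half a byte per weight

weights_gb = params * bytes_per_param / 1e9
print(f"Weights alone: ~{weights_gb:.0f} GB")  # ~900 GB
```

Roughly 900 GB of weights, which is why a multi-GPU system with pooled memory would be needed to hold it, but a single rack-scale system plausibly could.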
kristianp | a month ago
"DC" is very confusing here. "DC" normally means "Direct Current," not "Data Center."
I recommend updating the title. (And if anyone has connections back to The Register, suggest that they update the original article too.)
gwbas1c | a month ago
At what point do you use the waste heat to generate electricity?
120 kW is not an insignificant amount of power.
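A quick Carnot-limit sketch suggests why this is rarely done: data-centre waste heat is low-grade, so even the theoretical ceiling on conversion is small. The coolant and ambient temperatures below are illustrative assumptions, not figures from the article.

```python
# Rough estimate with assumed temperatures (not from the article):
# low-grade heat gives a poor Carnot ceiling, and real systems do worse.
t_hot = 273.15 + 60   # assumed coolant outlet temperature, K
t_cold = 273.15 + 25  # assumed ambient heat-sink temperature, K

carnot = 1 - t_cold / t_hot          # theoretical maximum efficiency
recovered_kw = 120 * carnot * 0.5    # assume a real cycle gets ~half of Carnot
print(f"Carnot limit: {carnot:.1%}, recoverable: ~{recovered_kw:.0f} kW")
```

So of the 120 kW, only a few kW of electricity could realistically be recovered, which is why the heat is more often reused directly (e.g. district heating) than converted back to power.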