Wafer-Scale AI Chips Achieve 2,700 Tokens Per Second, 10x Faster Than Traditional GPU Systems
Wafer-scale AI chips deliver 2,700 tokens per second, 10 times faster than traditional GPU systems, by integrating hundreds of thousands of cores with massive on-chip memory. Using a new PLMR optimization model, they achieve sub-millisecond inference latency.