
GLM 5.2 Fast via Wafer now available on AI Gateway

  • Thread starter: Rohan Taneja, Jerilyn Zheng
GLM 5.2 Fast via Wafer is now available on AI Gateway.

Based on our own benchmarking across small-context, large-context, and tool-call scenarios, Wafer delivers 2x higher throughput than other providers serving GLM-5.2 on serverless, and leads on decode and end-to-end speed for sustained generation in both the small- and large-context cases.

In our testing, GLM 5.2 Fast on Wafer measured:


  • Small context: 170+ tok/s
  • Large context: 200+ tok/s

To use GLM 5.2 Fast, set model to zai/glm-5.2-fast in the AI SDK:
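The snippet below is a minimal sketch of that setting, not an official example. Only the model id `zai/glm-5.2-fast` comes from this post; the `buildRequest` helper and the prompt text are hypothetical, and the commented-out call assumes the `ai` package is installed and AI Gateway credentials are configured in the environment.

```typescript
// Model id as given in the announcement.
const MODEL_ID = "zai/glm-5.2-fast";

// Hypothetical helper (not part of the AI SDK): builds the options object
// that would be handed to an AI SDK call such as `streamText`.
function buildRequest(prompt: string): { model: string; prompt: string } {
  return { model: MODEL_ID, prompt };
}

// Example options for a single prompt (the prompt text is illustrative).
const request = buildRequest("Summarize the benefits of faster decoding.");
console.log(request.model);

// Assumed usage with the AI SDK, left commented out so this sketch runs
// without the package or network access:
//
//   import { streamText } from "ai";
//   const result = streamText(request);
//   for await (const chunk of result.textStream) {
//     process.stdout.write(chunk);
//   }
```

Keeping the model id in one constant means switching to another Gateway model is a one-line change.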


AI Gateway provides a unified API for calling models, tracking usage and cost, and configuring retries, failover, and performance optimizations for higher-than-provider uptime. It includes built-in custom reporting, Zero Data Retention support, budgets for API keys, and more.

AI Gateway reflects provider pricing with no markup and does not charge a platform fee on inference, including on Bring Your Own Key (BYOK) requests.

Try GLM 5.2 Fast in the model playground.


