Why Enterprise AI Agents Fail: Analyzing IBM and UC Berkeley's IT-Bench and MAST Research
IBM and UC Berkeley researchers have introduced IT-Bench and MAST to diagnose why autonomous agents struggle in enterprise environments, highlighting critical gaps in tool use and long-horizon planning.