The Challenge

Most GUI Agent demos show the same thing: open a browser, fill out a form, click "Submit." It works, it's useful, but it doesn't really stress-test what these agents can do. We wanted to find out: what happens when you throw a GUI Agent into a completely unfamiliar, non-standard interface? So we picked Mahjong, a Chinese tile game with complex rules, dense visual information, and