Wan 2.1 is a unified multimodal model that supports text-to-image generation in addition to video generation.