Goku is a joint image-and-video generative model built on rectified flow Transformers. It is designed for high-quality visual generation across three tasks: text-to-video, image-to-video, and text-to-image.