Whether you want to build a document scanner, digitize receipts, or add text recognition to your mobile app, this project is a perfect starting point. This project is provided for educational and ...
Abstract: In recent years, the zero-shot image recognition with semantic knowledge has achieved good performance due to vision-language models. However, because of the complexity of 3D shapes, the ...
Abstract: Generating images that align with textual input using text-to-image (TTI) generation models is a challenging task. Generative adversarial network (GAN) based TTI models can produce realistic ...
Important Note: This repository implements SVG-T2I, a text-to-image diffusion framework that performs visual generation directly in Visual Foundation Model (VFM) representation space, rather than ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results