Omost

Preview：

<br />

Introduce：

Omost is a project that aims to transform the encoding power of large language models (LLMS) into the power of image generation (more precisely, image composition). It provides pre-trained LLM models based on Llama3 and Phi3 variants that are capable of writing code to combine image visual content using Omost’s virtual Canvas proxy. Canvas can be implemented by a specific image generator to actually generate the image. The technologies behind the Omost project include Direct Preference Optimization (DPO) and OpenAI GPT4o’s multimodal capabilities.
Omost
Stakeholders:
Omost’s target audience is primarily developers, researchers, and artists interested in AI image generation technologies. It is suitable for those who wish to explore and implement creative image concepts, but may not have the appropriate technical background or resources to develop an image generation system from scratch.
Usage Scenario Examples:

Artists use Omost to generate unique works of art based on textual descriptions.
Game developers use Omost to quickly generate in-game concept art and environmental context.
Marketing teams use Omost to create engaging advertising images and social media posts.

The features of the tool:

Support for a variety of data mix training, including ground reality annotation of public data sets such as Olien-Images.
Three pre-trained LLM models based on Llama3 and Phi3 are available.
The Canvas agent is able to render specific implementations of the image generator to actually generate the image.
Conversational Editing is supported, allowing users to have conversations with models to generate images.
Users can use Omost through the HuggingFace space or deploy it themselves.
Detailed API documentation and sample code are provided for the convenience of developers and researchers.

Steps for Use:

Visit Omost’s GitHub page for project details.
Read the documentation to understand how to deploy and use the Omost model.
Select the appropriate pre-training model and configure it according to your needs.
Write code to generate images using the Canvas agent and provided apis.
Conversational Editing is done by interacting with the model to refine the image generation results.
Apply the generated images to a desired project or further creative work.

Tool’s Tabs: AI, image generation