illustrious XL tutorial created with SeaArt AI

This article explains illustrious XL 1.0 and 1.1 released in February 2025.

日本語版の記事はこちら

Basic Overview of Illustrious

Illustrious is a model primarily specialized for anime style. Its characteristics can be summarized as "the anime version of Flux.1." While conventional models often showed inconsistencies in character details (especially hands and feet), Illustrious has overcome this problem. The birth of this model has made anime-style image generation more precise and diverse.

Overwhelming Scale of Learning

Compared to previously mainstream Animagine XL and its derivative models, Illustrious's training data volume is significantly larger, resulting in the ability to generate more diverse images.

Evolution in Character Depiction

The issues of hands, feet, and fingers that were frequently pointed out in conventional SDXL models have been greatly improved. The old era of "there are 6 fingers but that's fine" is over. Also, with the expanded learning scale, it now supports various poses. This eliminates the need for repeatedly gambling with seeds or preparing multiple pose-specific LoRAs.
Completion of Anime Specialization

While SDXL-based models generally excel at depicting landscapes and objects, they had the drawback of unstable character details. However, Illustrious has overcome this and achieved overwhelming quality in anime-style images.

Overview of illustrious XL 1.0 and 1.1

illustrious XL 1.0 and 1.1 were released by ONOMAAI. They are the official successors to illustrious XL v0.1 and have been adjusted to generate high-quality images.

While maintaining high prompt adherence, they can generate images up to 1536x1536 pixels.

They have been trained on images added to danbooru until June 2024 and are compatible with v0.1 LoRAs.

Main Advancements from v0.1

Native support for high-resolution generation (such as 1536x1536). While previous SDXL models often produced corrupted images at this resolution, illustrious XL 1.0 and 1.1 can generate them normally.

They support both natural language and danbooru tag prompt styles, allowing precise adjustment of the nuances in generated images.

As they are not biased toward specific styles or artists, they also excel as base models for additional training.