Illustrious XL 2.0 is the new version of the IL base model. Most of its settings and prompting are similar to Illustrious XL 1.1. This guide is an update to the Illustrious XL v1.0 & v1.1 guide, so if you haven't read that one yet, you can start there: Illustrious XL v1.0&v1.1 Guide
This guide covers the differences and improvements of the new version, alongside a few example prompts.
What Changed in the Dataset With This Update?
This version aims to stabilize 1536px generations and improve the model's natural language support alongside Danbooru tags. In short, it continues what IL1.1 started. According to the model's creator, v2.0 is trained for robust fine-tuning. This makes v2.0 the best-suited version for LoRA training, and it can produce excellent results even with natural-language-only captioning.
Robustness: In the context of AI systems, robustness refers to the ability of an algorithm or model to maintain its performance and stability under different conditions, including variations in input data, environmental changes, and attempts at adversarial interference.
Resolution Improvement
With the IL1.0 version, OnomaAI decided to go a step further than other base models and bring a 1536px base resolution to the Illustrious XL base model. It could generate high resolution images, but the model still struggled at 1024x1536, and 512x512 generations produced artifacts and deformations.
These issues were mostly related to the training dataset. Since the model had not been trained enough on these varied resolutions, it had generation issues at them. To fix this, v2.0 was trained on an extensive dataset with large image scales.
The model is now more stable across resolutions from 512px to 2048px.
Recommended Resolution
It's still recommended to use 1536px-based resolutions, but you can use other resolutions depending on your needs. The choice still affects the quality and accuracy of the result, but less than before.
Square: 1024x1024 - 1536x1536
Vertical: 1024x1536 - 1248x1824
Horizontal: 1536x1024 - 1824x1248
NOTE: Resolution is limited to 1536px for standard mode generation in SeaArt, as on all the other services. Users need to use Quality Mode generation to be able to use resolutions up to 2048px. In Quality Mode, every 3 pixels become 4 pixels. For example, 1024x1536 in Quality Mode comes out at approximately 1365x2048. So if you want 1248x1824, enable Quality Mode and set the resolution to 936x1368. The results will come out at 1248x1824.
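The Quality Mode arithmetic from the note can be sketched in a few lines of Python. This is only an illustration of the 3-to-4 pixel ratio described above. The function names are made up for this example, and rounding down on output is an assumption based on the "approximately 1365x2048" figure.

```python
from fractions import Fraction

# From the note above: in Quality Mode every 3 pixels become 4 pixels.
QUALITY_MODE_SCALE = Fraction(4, 3)


def quality_mode_output(width: int, height: int) -> tuple[int, int]:
    """Approximate output size for a resolution entered in Quality Mode.

    Rounding down is an assumption; the service may round differently.
    """
    return int(width * QUALITY_MODE_SCALE), int(height * QUALITY_MODE_SCALE)


def quality_mode_input(target_width: int, target_height: int) -> tuple[int, int]:
    """Resolution to enter so that Quality Mode outputs roughly the target size."""
    return (
        round(target_width / QUALITY_MODE_SCALE),
        round(target_height / QUALITY_MODE_SCALE),
    )


print(quality_mode_output(1024, 1536))  # (1365, 2048)
print(quality_mode_input(1248, 1824))   # (936, 1368)
```

In other words, multiply the size you want by 3/4 to get the value to enter.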
Compatibility With Additional Tools And LoRAs
The Illustrious XL base model has impressive compatibility with LoRAs made for other SDXL-based base models. A LoRA's quality and accuracy may be reduced depending on which base model it was made for, but if you have no alternative, you can still use SDXL, Pony, Animagine, IL0.1, IL1.0, and IL1.1 based LoRAs in your generations. As expected, compatibility is best with IL1.1 LoRAs.
Additionally, the model is compatible with AI image generation tools, including img2img, ControlNet, and more. This makes the model a real gem for ComfyUI users, who can edit and control it as they wish with advanced workflows.
Recommended Settings
Steps: 20-40
Illustrious XL can be used with different step counts. Steps are limited to 40 on the standard generation page and to 60 in ComfyUI. However, 20-28 steps are enough in most cases. If the composition is very detailed and the results look unfinished, you can increase the steps to 40 to get sharper, more detailed images.
CFG Scale: 3-7.5
Illustrious XL renders more smoothly at lower CFG, while higher values increase the contrast and saturation of outputs. Images can look oversaturated at CFG values above 7.5, so it's recommended to stop at 7.5. Below 3, results lose overall color and look as if a fog filter had been applied. You can use any CFG between 3 and 7.5 depending on your preference, but 4.5-5 is the sweet spot for general use of Illustrious models.
Sampler: Euler a
Euler a is arguably the best sampler for Illustrious XL, but the model works with many other samplers.
Clip Skip must be set to 2.
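As a quick reference, the recommended settings above can be written down as a small Python structure with a sanity check. This is a hypothetical sketch: `GenerationSettings` and `check_settings` are names invented here and do not belong to SeaArt, ComfyUI, or any other tool. The defaults are simply values picked from inside the recommended ranges.

```python
from dataclasses import dataclass


@dataclass
class GenerationSettings:
    steps: int = 28            # 20-28 is enough in most cases
    cfg_scale: float = 5.0     # 4.5-5 is the sweet spot
    sampler: str = "Euler a"   # other samplers also work
    clip_skip: int = 2         # must be 2


def check_settings(s: GenerationSettings, max_steps: int = 40) -> list[str]:
    """Return warnings for values outside the ranges recommended in this guide.

    max_steps is 40 for the standard generation page and 60 for ComfyUI.
    """
    warnings = []
    if not 20 <= s.steps <= max_steps:
        warnings.append(f"steps={s.steps} is outside 20-{max_steps}")
    if s.cfg_scale < 3:
        warnings.append("CFG below 3 can look washed out, like a fog filter")
    elif s.cfg_scale > 7.5:
        warnings.append("CFG above 7.5 can look oversaturated")
    if s.clip_skip != 2:
        warnings.append("Clip Skip should be 2")
    return warnings


print(check_settings(GenerationSettings()))  # []
```

The sampler is not checked, since the model works with many samplers.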
Prompting of Illustrious XL 2.0
The model supports NLP (natural language prompts) alongside Danbooru tags. The best way to prompt is still with Danbooru tags. However, with this version you can support your tags with simple sentences or even use full natural language prompts. The choice affects how well the model understands the prompt, but both syntaxes are supported, and users can pick one or mix them to create their prompt.
Importance of Negative Prompts
Unlike the Pony v6 models, Illustrious XL models follow negative prompts as well as they follow positive prompts. Negative prompts matter on these models and can significantly improve results, so it's recommended to use them as actively as positive prompts.
- Add multiple keywords and Danbooru tags to improve quality.
- If you want to remove something from the composition, negative prompts are the most effective way to do it.
Prompt Style
- Use Danbooru tags for the best possible results. NLP is also an option, so choose whichever you are more comfortable with. There is no restriction on syntax.
- `masterpiece, best quality, amazing quality` keywords are very important for maintaining quality, so please put them at the front of your prompts. Additionally, `very aesthetic, newest` tags can be used after the quality keywords for better results.
- `absurdres, highres` keywords can be used at the end of the prompt.
- Rating tags (`safe, sensitive, nsfw, explicit`) can be used before the subject to avoid unwanted results.
- `worst quality, bad quality, very displeasing, displeasing, oldest` can be used in the negative prompt to get better results.
- `artistic error, artistic failure` keywords can be used in the negative prompt to keep generation errors to a minimum.
- Standard Danbooru tags such as `lowres, jpeg artifacts, censor, watermark, bad hands, bad anatomy, traditional media` can be used in the negative prompt to improve the quality of results. Additionally, `bad angle, bad perspective, bad proportions` and other `bad` keywords seem to be effective at improving quality. The difference is considerable, especially at higher resolutions, and the improvements are consistent.
- Use the `focus` keyword with the subject type for different kinds of subjects. For example, `animal focus` for animals, or `vehicle focus` for cars, motorcycles, and similar subjects.
- Artist tags can be used after the quality tags or at the end of the prompt.
- Users can add small descriptive words to tags to combine both options. For example, `leaning against wall` instead of `leaning back, against wall`, or `glowing eyes under hair` instead of `glowing eyes, eyes visible through hair`. Full tags are still more stable and accurate, but users now have the option of this kind of prompt. It can help in scenarios where tags are not enough to explain what you want.
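The negative-prompt suggestions in the list above can be combined into one reusable string. The sketch below is only an illustration. The grouping and the `build_negative_prompt` helper are invented for this example, and the tags are the ones named in the list.

```python
# Tag groups taken from the bullet points above.
QUALITY_NEGATIVES = ["worst quality", "bad quality", "very displeasing", "displeasing", "oldest"]
ERROR_NEGATIVES = ["artistic error", "artistic failure"]
DANBOORU_NEGATIVES = ["lowres", "jpeg artifacts", "censor", "watermark",
                      "bad hands", "bad anatomy", "traditional media"]
BAD_NEGATIVES = ["bad angle", "bad perspective", "bad proportions"]


def build_negative_prompt(*extra: str) -> str:
    """Join the recommended negative tags plus any extras, dropping duplicates."""
    tags = QUALITY_NEGATIVES + ERROR_NEGATIVES + DANBOORU_NEGATIVES + BAD_NEGATIVES + list(extra)
    # dict.fromkeys keeps the first occurrence of each tag and preserves order.
    unique = dict.fromkeys(tag.strip() for tag in tags if tag.strip())
    return ", ".join(unique)


# Extra tags for whatever you want removed from a specific composition.
print(build_negative_prompt("sketch", "blurry"))
```

Anything you want removed from a specific composition goes in as an extra tag.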
Recommended Prompt Order
- `masterpiece, best quality, amazing quality`
- `very aesthetic, newest` (Optional)
- Rating Tags (`safe, sensitive, nsfw, explicit`)
- Artist Tags (Only Danbooru)
- Subjects with count (`1girl/1boy/1other`)
- Subject Details
- Pose/Action Tags
- Composition Tags (Background, Environment, Objects)
- Additional Tags (Style, Theme, Expression)
- Additional Enhancer Tags (Focus, Lighting, Shadow)
- `absurdres, highres` (Optional)
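The order above can be sketched as a small helper that assembles a tag prompt. This is a hypothetical convenience function, not part of any tool. The parameter names simply mirror the list, and the optional tags can be switched off.

```python
def build_tag_prompt(subject, details=(), pose=(), composition=(), additional=(),
                     enhancers=(), rating=None, artists=(),
                     aesthetic=True, resolution_tags=True):
    """Assemble a Danbooru-tag prompt in the recommended order."""
    parts = ["masterpiece", "best quality", "amazing quality"]
    if aesthetic:
        parts += ["very aesthetic", "newest"]   # optional
    if rating:
        parts.append(rating)                    # safe / sensitive / nsfw / explicit
    parts += list(artists)                      # Danbooru artist tags only
    parts.append(subject)                       # e.g. 1girl, 1boy, 1other
    parts += [*details, *pose, *composition, *additional, *enhancers]
    if resolution_tags:
        parts += ["absurdres", "highres"]       # optional
    return ", ".join(parts)


prompt = build_tag_prompt(
    "1girl",
    details=["grey hair", "blue eyes"],
    pose=["finger to mouth"],
    enhancers=["diffused light"],
    rating="safe",
)
print(prompt)
```

Keeping each group in its own argument makes it easy to swap one part, such as the pose, without disturbing the order.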
Recommended Prompt Order For NLP
- `masterpiece, best quality, amazing quality`
- `very aesthetic, newest` (Optional)
- Rating Tags (`safe, sensitive, nsfw, explicit`)
- Artist Tags (Only Danbooru)
- Natural language prompt to describe composition
- Supportive tags to improve accuracy
- `absurdres, highres` (Optional)
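The NLP order differs mainly in that a free-form description replaces the subject, pose, and composition tags. Here is a minimal sketch, again with an invented helper name and parameters:

```python
def build_nlp_prompt(description, support_tags=(), rating=None, artists=(),
                     aesthetic=True, resolution_tags=True):
    """Assemble a natural language prompt in the recommended NLP order."""
    parts = ["masterpiece", "best quality", "amazing quality"]
    if aesthetic:
        parts += ["very aesthetic", "newest"]    # optional
    if rating:
        parts.append(rating)
    parts += list(artists)                       # Danbooru artist tags only
    # Trim stray punctuation so the sentence joins cleanly with the tags.
    parts.append(description.strip().rstrip(",."))
    parts += list(support_tags)                  # tags that reinforce key details
    if resolution_tags:
        parts += ["absurdres", "highres"]        # optional
    return ", ".join(parts)


nlp_prompt = build_nlp_prompt(
    "a woman walking on the street at night, her orange hair glowing under a street lamp.",
    support_tags=["1girl", "orange hair", "night"],
)
print(nlp_prompt)
```

The supportive tags repeat the details that matter most, which can help accuracy when the sentence alone is not followed closely.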
Prompt Examples
Kafka
Positive Prompt
`masterpiece, best quality, amazing quality, newest, very aesthetic, 1girl, kafka \(honkai: star rail\), cowboy shot, purple hair, butterfly ornament, purple eyes, round eyewear, black choker, black jacket, jacket on shoulders, high waist shorts, black shorts, collared shirt, white shirt, pantyhose under shorts, purple pantyhose, purple gloves, purple spider web print, glowing web, glowing purple string, arm under breasts, hand up, standing, dynamic pose, darkness, dim lighting, bokeh, *** art, absurdres, highres`
Negative Prompt
`lowres, bad quality, worst quality, bad anatomy, sketch, jpeg artifacts, ugly, poorly drawn, censor, blurry, watermark, artistic failure, artistic error, bad proportions, bad perspective, displeasing, very displeasing, oldest, child, childish, traditional media`
Bunny Girl
Positive Prompt
`masterpiece, best quality, amazing quality, newest, very aesthetic, 1girl, upper body, smirk, grey hair, blunt bangs, long hair, round eyewear, rabbit ears, fake animal ears, blue eyes, long eyelashes, sideboob, medium breasts, virgin killer sweater, black sweater, fishnet sleeves, finger to mouth, dynamic pose, lineart, sharp lines, diffused light, absurdres, highres`
Negative Prompt
`lowres, bad quality, worst quality, bad anatomy, sketch, jpeg artifacts, ugly, poorly drawn, censor, blurry, watermark, artistic failure, artistic error, bad proportions, bad perspective, displeasing, very displeasing, oldest, child, childish, traditional media`
NLP Test (Accuracy Comparison With Flux)
Positive Prompt
`masterpiece, best quality, amazing quality, newest, very aesthetic, a gorgeous woman walking on the street at night, her orange hair glowing with light comes from street lamp, she is wearing a long black coat over her blue dress and look like a fashion model with her beauty, environment is dark because of the night and she is all alone, weather is cold and strong wind make it harder to walk outside, her face is blushed from cold and her breath leaves visible fog around her mouth, absurdres, highres`
Negative Prompt
`lowres, bad quality, worst quality, bad anatomy, sketch, jpeg artifacts, ugly, poorly drawn, censor, blurry, watermark, artistic failure, artistic error, bad proportions, bad perspective, displeasing, very displeasing, oldest, child, childish`
- Quality tags and the negative prompt were removed for the Flux test, since Flux doesn't require or support them.
Overall quality is not good, but the accuracy and prompt-following capabilities are impressive for a tag-based anime model. It was able to generate even the smallest details, like her blush and breath. This accuracy was consistent, which suggests that Illustrious XL could become a casual model that everyone can use, just like Flux, while still supporting Danbooru tags.