
hyperfusion_vpred finetune 3.3m images

Last Update: 2024-12-16 21:26:31
Type: Checkpoint
Base Model: SD 1.5
License Scope:
Creative License Scope
Online Image Generation
Merge
Allow Downloads
Commercial License Scope
Sale or Commercial Use of Generated Images
Resale of Models or Their Sale After Merging

This version of hyperfusion was trained on 3.3 million images over 10 months, and is a v_prediction + zero_snr model based on SD1.5.

This version was trained on SD 1.5, so there is no NovelAI influence in this checkpoint.

More image classifiers were trained, and existing classifiers were improved (see the list of classified tags under the Training Data section).

Training Notes:

  • ~3.3m images

  • LR 4e-6

  • TE_LR 1e-6, dropped to 1e-7 (after epoch 10)

  • batch 8

  • GA 16

  • 2x 3090s, so 2x the base batch size. total v_batch = 8 * 16 * 2 = 256

  • total images seen: 190_000 * 256 = 48_640_000 (~48.6 million)

  • AdamW-8bit (ADOPT for the last epoch as a test)

  • scheduler: linear

  • base model SD1.5

  • No custom VAE; I usually use the original SD1.5 VAE

  • flip aug

  • clip skip 2

  • 525 token length (appending captions + tags made this necessary)

  • bucketing at 768 max 1024

    • bucket resolution steps 32 for more buckets

    • trained at 768 for the first 10 epochs, and 1024 for the last 6

  • tag drop chance 0.15

  • caption_dropout 0.1

  • tag shuffling

  • --min_snr_gamma 3

  • --ip_noise_gamma 0.02

  • --zero_terminal_snr

  • about 10 months training time
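
The batch-size arithmetic in the notes above can be checked with a few lines of Python. This is only a sanity check using the numbers listed; the variable names are mine, and treating 190_000 as the number of optimizer steps is an assumption, since the notes do not name the unit.

```python
# Values copied from the training notes above.
per_gpu_batch = 8        # "batch 8"
grad_accum_steps = 16    # "GA 16"
num_gpus = 2             # "2x3090s"

# Effective (virtual) batch size per optimizer step.
effective_batch = per_gpu_batch * grad_accum_steps * num_gpus

# Assumption: 190_000 is the optimizer step count.
steps = 190_000
images_seen = steps * effective_batch

print(effective_batch)  # 256
print(images_seen)      # 48640000
```

So the total comes out at roughly 48 to 49 million image samples seen during training.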

Custom training configs:

I have implemented a number of changes in Kohya's training code that have been suggested to improve training, and kept the ones that seemed to make a difference.

  • drop out 75% of tags 5% of the time to hopefully improve short tag length results

  • soft_min_snr instead of min_snr

  • --no_flip_when_cap_matches: Prevent flipping images when certain tags exist, like "sequence, asymmetrical, before and after, text on*, written, speech bubble" etc. This should help with text, and with characters that have asymmetrical features.

  • --important_tags: move important tags to the beginning of the list, and sort them separately from the unimportant ones (suggested by NovelAI, if I remember correctly).

  • --tag_implication_dropout: Drop out similar tags to prevent the model from requiring both to be present when generating. For example, with "breasts, big breasts", "breasts" will be dropped 30-50% of the time. I used the tag implications CSV from e621 as a base and added tags as needed. Even with 10%-15% tag dropout, some tag pairs were still being associated too often, and this definitely made a difference. I think there were about 5k tags in total on the dropout list.

  • 12% of the dataset is captioned with CogVLM. Many of the captions were also cleaned up with custom scripts that correct common problems.

  • Tags vs Captions: 70% of the time use tags, ~20% of the time use captions (if they exist), 10% of the time combine tags with captions in different orders.
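
To make the caption-side changes above more concrete, here is a rough Python sketch of how they could fit together. It is not the actual code added to Kohya's scripts: the function names, the tiny `IMPLIED_BY` table, the 0.4 implication dropout rate (picked from the 30-50% range), and the choice to apply the heavy 75% dropout instead of the normal per-tag dropout are all assumptions for illustration. Caption dropout and --important_tags ordering are left out.

```python
import random

# Hypothetical implication table: broad tag -> specific tags that imply it.
# The real list was built from the e621 tag implications CSV (about 5k tags).
IMPLIED_BY = {"breasts": {"big breasts"}}


def drop_implied_tags(tags, rng, p_drop=0.4):
    """Drop a broad tag with probability p_drop if a tag implying it is present."""
    present = set(tags)
    kept = []
    for tag in tags:
        is_implied = bool(IMPLIED_BY.get(tag, set()) & present)
        if is_implied and rng.random() < p_drop:
            continue
        kept.append(tag)
    return kept


def random_tag_dropout(tags, rng, p_tag=0.15, p_heavy=0.05, heavy_frac=0.75):
    """Per-tag dropout, with an occasional heavy pass that removes 75% of tags."""
    if tags and rng.random() < p_heavy:
        keep_n = max(1, round(len(tags) * (1 - heavy_frac)))
        keep_idx = sorted(rng.sample(range(len(tags)), keep_n))
        return [tags[i] for i in keep_idx]
    return [t for t in tags if rng.random() >= p_tag]


def build_prompt(tags, caption, rng):
    """Pick tags (70%), caption (20%), or both in random order (10%)."""
    tags = drop_implied_tags(list(tags), rng)
    tags = random_tag_dropout(tags, rng)
    rng.shuffle(tags)  # tag shuffling
    tag_text = ", ".join(tags)

    roll = rng.random()
    if caption is None or roll < 0.70:
        return tag_text
    if roll < 0.90:
        return caption
    parts = [tag_text, caption]
    rng.shuffle(parts)
    return ", ".join(p for p in parts if p)


rng = random.Random(0)
print(build_prompt(["breasts", "big breasts", "smile", "outdoors"], "a hypothetical caption", rng))
```

Images without a caption always fall back to tags in this sketch, which matches the "if they exist" note above.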

If I remember more custom changes, I'll add them later.
