# Fine settings

Fine-tuning settings usually have only a small effect on the output and are mostly needed for specific use cases.

For most settings whose default is `None`, it is usually fine to leave them unchanged.

### `optimiser`

> Default: `Adam`

The gradient descent optimizer. For now this setting only affects vqgan; everything else defaults to Adam. Its effect is minimal and it seems to matter little to the output.

![optimiser](https://2803297428-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVgiFOveF1soKpu90ln3C%2Fuploads%2Fgit-blob-cbc4905385eec2e567d70845cf81cddaa739770b%2Foptms.png?alt=media)

### `batches`

> Default: `1`

`batches` specifies how many passes of gradient accumulation are made before the image is updated by one step. Think of it like batching in SGD. It can also be increased while reducing `num_cuts` to lower the VRAM needed for image generation.
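
To make the accumulation idea concrete, here is a minimal sketch on a toy problem. It is not pixray's code: the function names, the noisy quadratic loss, and the choice to average (rather than sum) the accumulated gradients are all assumptions made for illustration.

```python
import random

def noisy_grad(x, target, rng):
    """Gradient of (x - target)**2 plus a little noise, standing in for one batch."""
    return 2.0 * (x - target) + rng.gauss(0.0, 0.1)

def optimise(batches, iterations=50, lr=0.1, seed=0):
    """Hypothetical loop: accumulate `batches` gradients, then take one step."""
    rng = random.Random(seed)
    x = 0.0
    steps_taken = 0
    for _ in range(iterations):
        accumulated = 0.0
        # Gather gradients from `batches` passes before touching x.
        for _ in range(batches):
            accumulated += noisy_grad(x, 3.0, rng)
        # One update per iteration, using the averaged gradient (an assumption).
        x -= lr * accumulated / batches
        steps_taken += 1
    return x, steps_taken

x1, steps1 = optimise(batches=1)
x4, steps4 = optimise(batches=4)
print(steps1, steps4)  # same number of update steps either way
print(x1, x4)          # both end close to the target of 3.0
```

The number of update steps does not change with `batches`; only the amount of gradient information gathered per step does.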

### `num_cuts`

> Default: `None`

`num_cuts` specifies the number of "cutouts" that the algorithm feeds into CLIP. A "cutout" works like image augmentation: it lets CLIP see the image at differing zoom levels, perspectives, and so on. As with augmentation in neural network training, the more images the model sees, the better its grasp. @robinsloan has a good write-up [here](https://www.robinsloan.com/notes/cutouts/).
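
As a rough illustration of "differing zoom levels", the sketch below picks random square crop boxes inside an image. It is not pixray's cutout code: `random_cutout_boxes`, `min_frac`, and the square shape are hypothetical, and any further augmentation applied to real cutouts is left out.

```python
import random

def random_cutout_boxes(width, height, num_cuts, min_frac=0.3, seed=0):
    """Return `num_cuts` square crop boxes as (left, top, right, bottom)."""
    rng = random.Random(seed)
    max_side = min(width, height)
    min_side = max(1, int(max_side * min_frac))
    boxes = []
    for _ in range(num_cuts):
        side = rng.randint(min_side, max_side)  # random zoom level
        left = rng.randint(0, width - side)     # random horizontal position
        top = rng.randint(0, height - side)     # random vertical position
        boxes.append((left, top, left + side, top + side))
    return boxes

boxes = random_cutout_boxes(width=100, height=200, num_cuts=4)
for box in boxes:
    print(box)
```

Each box would be cropped and resized to CLIP's input resolution, so more cutouts means more images per step and more VRAM.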

### `size`

> Default: `None`

`size` specifies the width and height of the image, in that order.

Usage:

> Python

```python
size = [100,200]
```

> Command line

```bash
--size 100 200
```

### `clip_models`

> Default: `None`

`clip_models` specifies which pretrained CLIP models are used to optimize image generation. Each model can be one of `RN50`, `RN101`, `RN50x4`, `RN50x16`, `ViT-B/32`, or `ViT-B/16`, where `RN` means the ResNet architecture (for example, `RN50` has 50 layers) and `ViT` means Vision Transformer. Using only one model saves VRAM (by comparison, `quality=better` or `quality=best` typically uses 3-4), at a potential cost to final output quality.

Usage:

> Python

```python
clip_models='RN50,ViT-B/16'
```

> Command line

```bash
--clip_models 'RN50,ViT-B/16'
```

![clip\_models](https://2803297428-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVgiFOveF1soKpu90ln3C%2Fuploads%2Fgit-blob-2d44f189b26bc0a9d6ab16c6e0f4198cdc8a9212%2Fclip%20models.png?alt=media)

### `noise_prompt_seeds` and `noise_prompt_weights`

> Default: `None` and `None`

When both are set, a small amount of random noise is added to the loss (and gradient) for every batch in an iteration. This is done by creating a prompt that generates only noise for every output. It may let the model take some risks or make changes, or help it escape local minima. This is all just speculation, however.

Note that if the two lists differ in length, the longer one is silently truncated to the length of the shorter one. If either is `None`, noise prompts are not active at all.

The effects are unclear at the moment.
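
The length and `None` rules described above can be sketched as follows. This assumes the pairing behaves like Python's built-in `zip`; `pair_noise_prompts` is a hypothetical helper, not a pixray function.

```python
def pair_noise_prompts(seeds, weights):
    """Pair each seed with a weight; extra entries in the longer list are dropped."""
    # If either setting is None, noise prompts are inactive.
    if seeds is None or weights is None:
        return []
    # zip stops at the shorter list, without any warning.
    return list(zip(seeds, weights))

print(pair_noise_prompts([1, 2, 3], [0.5, 0.5]))  # [(1, 0.5), (2, 0.5)]
print(pair_noise_prompts([1, 2, 3], None))        # []
```

To avoid losing entries by accident, pass lists of equal length.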

Usage:

> Python

```python
noise_prompt_seeds=[1,2]
noise_prompt_weights=[0.5,0.5]
```

> Command line

```bash
--noise_prompt_seeds 1 2 3 --noise_prompt_weights 1 1 1
```

![noise prompts](https://2803297428-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVgiFOveF1soKpu90ln3C%2Fuploads%2Fgit-blob-49aac7b70057c75fd2f625e7b2b78fa36ba8efc7%2FCanvas%201.png?alt=media)

### `init_noise`

> Default: `pixels`

`init_noise` can be one of `pixels`, `gradient`, `snow`, or `none`. Note that `none` cannot be used without an `init_image`; see [Image control settings](https://dazhizhong.gitbook.io/pixray-docs/docs/image-control-settings) for more.

![init\_noise](https://2803297428-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FVgiFOveF1soKpu90ln3C%2Fuploads%2Fgit-blob-9b3e70f2d86587375779f6dfe3f87dd6f03eb348%2Finit_noise.png?alt=media)

### `learning_rate`, `learning_rate_drops`, and `auto_stop`

> Default: `0.2`, `[75]`, and `False`

The learning rate determines how big a step the model takes at each iteration; a larger value may cause unstable or more jittery images. `learning_rate_drops` is a list of percentages of the total iterations at which the learning rate drops, where each drop is a division by 10. For example, the default `[75]` means that at 75% of the iterations the learning rate is divided by 10.
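
The drop schedule can be sketched as a small function. This is an illustration of the rule as described here, not pixray's implementation: `learning_rate_at` is a hypothetical name, and whether the drop applies exactly at or just after the listed percentage is an assumption.

```python
def learning_rate_at(iteration, total_iterations,
                     learning_rate=0.2, learning_rate_drops=(75,)):
    """Learning rate in effect at `iteration`, dividing by 10 per drop passed."""
    progress = 100.0 * iteration / total_iterations
    # Count how many drop points have been reached (assumed inclusive).
    drops_passed = sum(1 for pct in learning_rate_drops if progress >= pct)
    return learning_rate / (10 ** drops_passed)

# With 200 iterations, the default [75] drop lands at iteration 150.
for it in (0, 149, 150, 199):
    print(it, learning_rate_at(it, total_iterations=200))
```

A list such as `[50, 75]` would divide the learning rate by 10 twice, giving one hundredth of the starting value for the last quarter of the run.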
