# Clients

Mancer AI is not (at this time) any kind of front-end. Not even on the website.

We just provide an interference-free LLM completion API, which you can use anywhere that supports it.

This section lists all the clients/front-ends that have integrated support for Mancer, one way or another.

If you're interested in adding support for us, you'll get 95% of the way there just by implementing text-generation-webui's API with an X-API-KEY header.
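
As a rough sketch, the bare minimum looks something like this. The base URL below is a placeholder (use the endpoint from your Mancer account), and the route and response shape assume text-generation-webui's classic blocking API, so check both against whatever you're wiring up:

```python
import requests

# Placeholder values: substitute your real endpoint and key.
API_URL = "https://<your-mancer-endpoint>/api/v1/generate"  # webui-style route (assumed)
API_KEY = "<your-api-key>"

payload = {
    "prompt": "Once upon a time,",
    "max_new_tokens": 150,  # Response Length
}

resp = requests.post(
    API_URL,
    headers={"X-API-KEY": API_KEY},  # the header Mancer looks for
    json=payload,
    timeout=60,
)
resp.raise_for_status()

# text-generation-webui's classic API returns {"results": [{"text": "..."}]}.
print(resp.json()["results"][0]["text"])
```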

# Sampling Parameters

We don't support the entire WebUI interface (yet). The options we do support are listed below; an example request body using them follows the list.

  • Response Length
    • Upper limit on how many tokens you want the LLM to generate for you
    • Recommended: 100 - 200, usually below 500.
  • Context Size
    • Upper limit on how many tokens you want the LLM to consider when generating your response
    • Each model has its own upper limit; the completion will fail if you exceed it. Usually the client handles this for you.
    • Recommended: 2048, 4096, or 8192, depending on the model's capacity.
  • Temperature
    • Larger values make replies more "interesting", choosing less-likely outcomes.
    • Smaller values result in less variation between re-rolls, but replies that are more likely to be "correct".
    • Higher values can be wrong more often, but will showcase much more variety.
    • Recommended: 0.7 - 1.2
  • Repetition Penalty
    • Discourage the LLM from repeating tokens/words it's already seen.
    • Too low, and you can fall into extremely repetitive replies.
    • Too high, and you can get long run-on chains of word soup.
    • Recommended: 1.05 - 1.2
  • Top K
    • Sampler criterion. 0 disables it. Very simple: only consider the K most-likely tokens.
  • Top P
    • Sampler criterion. 1 disables it. Intuitively, you give the sampler a probability 'budget': candidates are kept from most to least likely until that budget is spent, so a token with an overwhelmingly high likelihood can end up being the only option (see the sketch after this list).
    • Example: If p=0.9, and you have a 0.95-weight token, that will be the only option.
    • Example: If p=0.9, and you have 0.50-weight and 0.45-weight tokens, those two will be the only options.
    • Recommended: 0.75-0.95
  • Ban EOS Token
    • If true, the LLM is not allowed to 'stop' generation on its own, and will instead keep producing tokens until it reaches 'Response Length'.
    • No recommendation, but if you're getting unexpected chunks of Python code or comment sections in your replies, try setting it to 'false'.
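
For reference, here's roughly how those options map onto a request body. The field names follow text-generation-webui's classic parameter names; your client may spell them differently, so treat this as a sketch rather than a spec:

```python
# Rough mapping from the options above to webui-style request fields.
# Values are just examples inside the recommended ranges.
params = {
    "max_new_tokens": 150,       # Response Length (100 - 200, usually below 500)
    "truncation_length": 4096,   # Context Size (match the model's capacity)
    "temperature": 0.9,          # 0.7 - 1.2
    "repetition_penalty": 1.1,   # 1.05 - 1.2
    "top_k": 0,                  # 0 disables
    "top_p": 0.9,                # 0.75 - 0.95 (1 disables)
    "ban_eos_token": False,      # True forces generation to run to max_new_tokens
}
```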

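Since Top K and Top P are the two that trip people up, here's a toy sketch of what those filters actually do, using made-up probabilities. It's the idea, not the exact sampler code running on our end:

```python
# Toy example: how top-k and top-p narrow down the candidate tokens.
# The probabilities here are invented purely for illustration.
probs = {"cat": 0.50, "dog": 0.45, "fish": 0.03, "rock": 0.02}

def top_k_filter(probs, k):
    """Keep only the k most-likely tokens (k=0 means disabled)."""
    if k == 0:
        return dict(probs)
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])

def top_p_filter(probs, p):
    """Keep the most-likely tokens until their combined probability reaches p."""
    if p >= 1.0:
        return dict(probs)  # p=1 means disabled
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = {}, 0.0
    for token, prob in ranked:
        kept[token] = prob
        total += prob
        if total >= p:  # budget spent
            break
    return kept

print(top_k_filter(probs, 2))    # {'cat': 0.5, 'dog': 0.45}
print(top_p_filter(probs, 0.9))  # {'cat': 0.5, 'dog': 0.45}, matching the 0.50/0.45 example above
```
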
No other options do anything. Any effect you think they're having is entirely placebo! This list will be updated as support for additional options is added.