Skip to content

User Interface Guide

This page explains how to configure and use Assistant Engine’s UI—no developer background required.

App overview


Configure Your Multi-Model Assistant

In Settings → Models (left sidebar), Assistant Engine lets you run several specialized models together. Click each role to adjust its settings, test, and (optionally) save.

Sidebar – Models

Model roles at a glance

Role What it does Typical size When to choose bigger Notes
Assistant Primary brain: answers your questions and decides which tools to use. Large Complex reasoning, longer answers, mixed tasks. Usually your biggest model.
Embedding Converts documents into vectors for fast semantic search. Small–Medium Very large libraries or multilingual content. A “vectorizer” model; used during ingestion & retrieval.
Descriptor Summarizes your files/tables before embedding (“explain then store”). Small–Medium Highly technical data needing richer summaries. Think “auto-comment your data for better search.”
Correction Offers alternative phrasings/ideas when the Assistant gets stuck. Small Rarely needed for simple tasks. Optional; diversity booster.
Text-to-SQL Turns plain English into SQL for supported databases. Medium Complex schemas, tricky joins. Pairs well with Describe Database (below).
Mini Task Lightweight tasks (e.g., auto-naming chat titles). Tiny Almost never. Background helper only.

Try before you save

Tune a role, run a quick test, then save if you like the results. You can always revert.


Model settings (what the sliders mean)

You’ll see these controls on each role:

  • Model – Which local model file/variant to use.
  • Temperature – Creativity knob. Lower = precise & consistent. Higher = varied & imaginative.
  • Top-p – Keeps only the most likely words whose combined probability is p. Lower = safer, higher = freer wording.
  • Top-k – Limits choices to the top k next-word options. Lower = more conservative; higher = more exploratory.
  • Max output tokens – Caps the length of the answer. If replies get cut off, raise this.
  • Presence penalty – Encourages new topics vs. repeating earlier words/ideas.
  • Frequency penalty – Reduces repeated words/phrases within the same reply.

Practical presets

  • General chat → Temperature 0.6–0.8, Top-p 0.9, Top-k 40–100
  • Focused / code answers → Temperature 0.2–0.4, Top-p 0.8–0.9, Top-k 20–50
  • Long articles → Increase Max output tokens (and consider a larger Assistant model)

Context (How the agent behaves)

In Model Options → Context (left sidebar), you define persistent instructions that shape the agent’s tone and priorities.

Sidebar – Context

  • System instructions: What the agent should always do (e.g., “be concise,” “use steps”).
  • Examples: Short examples of the style you like (Q\&A snippets).
  • Tooling preferences: Tell the agent when to search, use SQL, or stick to local files.
  • Safety/limits: Define topics to avoid or when to ask for confirmation.

Good context examples

  • Prefer accurate, short answers; use bullet points.
  • If a database is connected, check it before guessing.
  • Cite the source (file or DB table) when answering from my data.

File Access

In Model Options → RAG / Data Ingestion, click Add Ingestion Path to give your Assistant Model access to files and folders.
Assistant Engine will chunk and vectorize these files, making their content searchable and usable during conversations.

Sidebar – File Access

Options explained

  • Path – The folder or file path to ingest (e.g., C:\Projects\Docs or /Users/alex/notes).
  • File Extensions – Limit ingestion to certain file types (e.g., .cs, .md, .pdf).
    Useful when you only want specific content.
  • Explore Subfolders – Toggle whether Assistant Engine should look inside subdirectories automatically.
    Disable if you only want the top-level folder.

Best practice

Start small. Add a single folder first, test retrieval, then expand with more sources once you’re confident.


Database Access

In Model Options → Databases, click Add Database to give your Assistant Model structured access to your data.

Sidebar – Database Access

Options explained

  • Name – A friendly label (e.g., “Sales Warehouse”).
  • Connection String – Defines how Assistant Engine connects (server, database, credentials).
    Your database admin can supply this.
  • Dialect – The database type (e.g., SQL Server, PostgreSQL, MySQL).
  • Describe Database – If enabled, Assistant Engine uses the Descriptor Model to create summaries of your schema before ingestion.
    This improves search and SQL generation accuracy.

Database Considerations

  • Permissions – Ensure the user account in your connection string has read access.
  • Schema Size – Large schemas may take longer to ingest.
  • Updates – Re-ingest whenever your schema changes.
  • Security – Store credentials securely; avoid sharing them in plaintext.

Best practice

  1. Add Database to confirm credentials and connectivity.
  2. Run Ingest to build the vector store.
  3. Re-ingest if your schema changes.

Adding New Models (via Ollama)

In Settings → Models, you can install new models directly from the UI.

Model Download Modal

  1. Open the model dropdown (to the right of Add Chat).
  2. Select Download new model.
  3. Pick a model from the list and start the download.
  4. Wait until the status changes to Ready.
  5. (Optional) Assign the model to one of your roles in the Chat Options sidebar, then click Save.

Local & Private

All models are pulled from your own Ollama server. Keep in mind: larger models take up more disk space and require more RAM/VRAM.


Global Settings

At the very bottom of the left sidebar, click Advanced Settings to open the global configuration panel.

Global Settings Modal

Here you can adjust application-wide options:

  • Model files folder – Controls where Assistant Engine saves and loads multi-model configuration files.
  • Vector store database – Defines where embeddings are stored.
    • Currently this is shared across all configurations.
    • Future versions will allow separate vector stores per configuration.
  • Import / Export JSON – Back up or restore your global settings in one step.

Restart Required

Changes in Advanced Settings only take effect after restarting Assistant Engine.


Choosing Appropriate Models

Pick lighter models if you’re on modest hardware or want faster replies — e.g. a Medium Assistant with Small Embedding/Descriptor models. On stronger machines or for complex work, step up to a Large Assistant, with Medium–Large Text-to-SQL and higher-quality Embedding/Descriptor models. If your workflow leans heavily on search and retrieval, invest in better Descriptor and Embedding quality; if it’s mainly database Q&A, enable Describe Database and choose a stronger Text-to-SQL model.

Quantized vs full-precision

Quantized models run faster with less memory, at some accuracy cost. Start quantized and only move up if you need deeper reasoning.

Having issues ingesting or accessing data?

Open Settings → Data and use Delete Vector Stores, then re-ingest. This fixes most schema/permission changes and stale index issues.