IntakeBot — Conversational Form Intake: Requirements and Design
Conversational form intake engine. No dynamic forms are built or shown to the user. The bot collects information purely through conversation (chat or voice). Forms are defined in JSON and stored in a form repository; the Agent runs in intake mode when started with --FORMS. Data is written to an intake folder, timestamped and prefixed by form name, with partial data persisted as the LLM collects it.
1. Overview
1.1 Purpose
IntakeBot is a conversational form intake engine. The bot:
- Talks the user through a form: introduces the form, asks for each field in order, and collects answers via conversation only (chat or voice).
- Does not render or display any form UI — no on-screen fields, cards, or dynamic forms. The user never sees a “form”; they only converse with the bot.
- When multiple forms exist, helps the user choose which form to fill using intent instructions and explicit options.
- Persists data into an intake folder: one timestamped file per intake session, updated once per user turn (as fields are collected) so partial data is always on disk.
1.2 Design Principles
- Conversation-only: No forms are built or shown. All collection happens via dialogue.
- Form-depo + intake mode: Form definitions live in Agent/form-depo/. The Agent is started with --FORMS to run in intake mode (form intake only, no domain RAG/Q&A).
- Form-agnostic: Any form that fits the JSON schema works. The schema includes intent/selection instructions so the LLM can decide which form the user wants when there are many.
- Partial data always: Partial data is written once per user turn (after all record_field and any submit_form for that turn). We always have up-to-date partial data on disk, not only on completion.
1.3 Out of Scope (First Version)
- Dynamic form UI, cards, or progress bars. Purely conversational.
- Anonymous intake without a defined identity (e.g. session id) for the intake file.
- Conditional fields, pattern validation, min/max. Deferred.
2. Form Repository (form-depo) and Intake Mode
2.1 Form-depo Folder
- Location: When FORM_DEPO_PATH is unset or empty, use <project_root>/Agent/form-depo (domain-chatbot project root). Set the FORM_DEPO_PATH env var to override.
- Content: One JSON file per form definition. File names can match form id (e.g. contact_form.json) or be arbitrary; the form’s id inside the JSON is the canonical identifier.
- Loading: The API (service_formintake) loads forms from the resolved path: it scans *.json and validates each file as a form schema. These are the only forms available for intake.
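The loading step above could look roughly like the following. This is a minimal sketch, not the actual service_formintake code: load_forms and the REQUIRED_KEYS check are hypothetical names, and the real schema validation may be stricter.

```python
import json
from pathlib import Path

# Hypothetical minimal check; the real schema validation may require more keys.
REQUIRED_KEYS = ("id", "title", "fields")


def load_forms(form_depo: Path) -> dict:
    """Scan *.json in form_depo and return {form id: form definition}."""
    forms = {}
    for path in sorted(form_depo.glob("*.json")):
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed file: not available for intake
        if not isinstance(data, dict) or any(k not in data for k in REQUIRED_KEYS):
            continue  # does not look like a form schema
        # The id inside the JSON is canonical, not the file name.
        forms[data["id"]] = data
    return forms
```

Files that fail to parse are skipped here rather than raising; whether the service should instead log or fail on startup is left open.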
2.2 Intake Mode and --FORMS
- Trigger: The Agent’s main script is run with --FORMS (e.g. python run_agent.py --FORMS or python -m Agent.main --FORMS). The main() in Agent/main.py reads this flag.
- Behaviour when --FORMS is set:
  - The Agent operates in intake mode.
  - Forms are loaded by the API from the form-depo path: when FORM_DEPO_PATH is unset, <project_root>/Agent/form-depo; set FORM_DEPO_PATH to override.
  - No domain RAG, no file_search, no domain Q&A. The bot’s only job is form intake: form selection (if multiple) and then field collection for the chosen form.
- When --FORMS is not set: The Agent runs in the existing, normal mode (domain RAG, Pipecat, etc.). Intake and form-depo are not used.
2.3 Single Form vs Multiple Forms
- Single form in form-depo: Bot can go straight to the form’s introduction and first field (optionally still mention which form it is).
- Multiple forms in form-depo: Bot must first disambiguate:
  - Use each form’s intent/selection instructions (and optionally keywords) so the LLM can interpret what the user wants.
  - If the user’s intent is unclear or several forms could match, the bot presents options (e.g. using title or short_label per form) and asks the user to choose. Once chosen, load that form and start intake.
3. Form Schema (JSON)
3.1 Goals
- Form identity, title, introduction, and fields.
- Intent/selection instructions for the LLM to decide when this form is the one the user wants (and for disambiguation when there are multiple forms).
- Multi-language for all user-facing strings.
- Field types, required/optional, options for select.
3.2 Proposed Structure
{
  "id": "contact_form",
  "title": { "en": "Contact Form", "fa": "فرم تماس" },
  "short_label": { "en": "Contact", "fa": "تماس" },
  "intent_instructions": {
    "en": "Choose this form when the user wants to: send a message, get in touch, leave contact info, submit feedback, or ask a question.",
    "fa": "این فرم وقتی انتخاب شود که کاربر بخواهد: پیام بفرستد، تماس بگیرد، اطلاعات تماس بگذارد، بازخورد بدهد یا سؤال بپرسد."
  },
  "introduction": {
    "en": "I'll help you with that. I'll ask a few questions.",
    "fa": "کمک میکنم. چند سؤال میپرسم."
  },
  "fields": [
    {
      "id": "full_name",
      "type": "text",
      "required": true,
      "label": { "en": "Full name", "fa": "نام کامل" },
      "placeholder": { "en": "e.g. John Doe", "fa": "مثال: علی محمدی" }
    },
    {
      "id": "email",
      "type": "email",
      "required": true,
      "label": { "en": "Email", "fa": "ایمیل" }
    },
    {
      "id": "message",
      "type": "textarea",
      "required": false,
      "label": { "en": "Message", "fa": "پیام" }
    }
  ],
  "submit_label": { "en": "Submit", "fa": "ارسال" },
  "metadata": {
    "version": "1.0",
    "supported_languages": ["en", "fa"]
  }
}
3.3 Intent and Form Selection
- intent_instructions (object, one key per language): Instructions for the LLM to decide if the user’s message matches this form. Used when:
  - There is a single form (to confirm it’s the right one), or
  - There are multiple forms (to shortlist and, if needed, to present options).
- short_label (object, per language): Short name for the “Which form do you want?” list. If missing, fall back to title or id.
- When multiple forms match or intent is ambiguous: the bot lists forms using short_label (or title) and asks the user to pick. The LLM then continues with the chosen form’s schema.
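The selection rules can be sketched as a small decision function. The names selection_step and option_label, the action values, and the assumption that matched_ids comes from the LLM's reading of intent_instructions are all illustrative, not part of the design.

```python
def option_label(form: dict, language: str) -> str:
    """short_label, else title, else id; each localized with fallback to "en"."""
    for key in ("short_label", "title"):
        texts = form.get(key) or {}
        text = texts.get(language) or texts.get("en")
        if text:
            return text
    return form["id"]


def selection_step(forms: dict, matched_ids: list, language: str = "en") -> dict:
    """Start a form directly, ask the user to choose, or ask what they want."""
    if len(forms) == 1:
        candidates = list(forms)  # single form: no disambiguation needed
    else:
        candidates = [form_id for form_id in matched_ids if form_id in forms]
    if len(candidates) == 1:
        return {"action": "start", "form_id": candidates[0]}
    if not candidates:
        return {"action": "clarify"}  # nothing matched: ask what the user wants
    options = [{"form_id": f, "label": option_label(forms[f], language)}
               for f in candidates]
    return {"action": "choose", "options": options}
```

Keeping the label fallback in one helper means the option list and any later confirmation message would name a form the same way.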
3.4 Field Types
| Type | Description | Validation (v1) | Example in stored data |
|---|---|---|---|
| text | Short text | Non-empty if required | "John" |
| email | Email | Basic *@*.* if required | "a@b.com" |
| phone | Phone | Optional | "+1234567890" |
| number | Numeric | Optional | 42 |
| date | Date | Optional (e.g. ISO) | "2025-01-25" |
| select | Single choice | Value in options | "opt_a" |
| textarea | Longer text | Non-empty if required | "Hello..." |
| boolean | Yes/no | Normalize to bool | true |
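One possible reading of the v1 validation column, as a sketch. validate_value, the yes/no vocabulary, and the error messages are assumptions; this version checks the email pattern whenever a value is given and parses numbers and dates even though the table marks those checks as optional.

```python
import re
from datetime import date

# Deliberately loose, mirroring the "*@*.*" rule in the table.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
TRUE_WORDS = {"yes", "y", "true", "1"}   # assumed vocabulary
FALSE_WORDS = {"no", "n", "false", "0"}  # assumed vocabulary


def validate_value(field: dict, value):
    """Return (True, normalized value) or (False, error message) for a non-null value."""
    kind = field["type"]
    if kind == "boolean":
        if isinstance(value, bool):
            return True, value
        word = str(value).strip().lower()
        if word in TRUE_WORDS:
            return True, True
        if word in FALSE_WORDS:
            return True, False
        return False, "expected yes or no"
    if kind == "number":
        if isinstance(value, bool):
            return False, "expected a number"
        if isinstance(value, (int, float)):
            return True, value
        for parse in (int, float):
            try:
                return True, parse(str(value).strip())
            except ValueError:
                continue
        return False, "expected a number"
    text = str(value).strip()
    if kind in ("text", "textarea"):
        if field.get("required") and not text:
            return False, "value must not be empty"
    elif kind == "email":
        if not EMAIL_RE.match(text):
            return False, "expected an email address"
    elif kind == "date":
        try:
            text = date.fromisoformat(text).isoformat()
        except ValueError:
            return False, "expected a date such as 2025-01-25"
    elif kind == "select":
        if text not in {opt["value"] for opt in field.get("options", [])}:
            return False, "value is not one of the options"
    elif kind != "phone":  # phone: no validation in v1
        return False, f"unknown field type: {kind}"
    return True, text
```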
select example:
{
  "id": "country",
  "type": "select",
  "required": true,
  "label": { "en": "Country", "fa": "کشور" },
  "options": [
    { "value": "ca", "label": { "en": "Canada", "fa": "کانادا" } },
    { "value": "us", "label": { "en": "United States", "fa": "ایالات متحده" } }
  ]
}
3.5 Field Order and Required vs Optional
- Order: fields array order = asking order. Bot asks the next required empty field first; optional fields follow or can be skipped.
- Required: If required: true, the form is not considered complete until that field is in form_state with a non-null value.
- Optional: If required: false, user may skip; we store form_state[field_id] = null (explicit skip). Downstream can distinguish "skipped" vs "never asked".
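The ordering and completeness rules can be expressed as two small helpers. The function names are hypothetical; the logic follows the three bullets above, treating a key present with null as an explicit skip that is not asked again.

```python
def missing_required(fields: list, form_state: dict) -> list:
    """Ids of required fields that are absent from form_state or null."""
    return [f["id"] for f in fields
            if f.get("required") and form_state.get(f["id"]) is None]


def next_field(fields: list, form_state: dict):
    """Id of the next field to ask, or None when nothing is left to ask."""
    missing = missing_required(fields, form_state)
    if missing:
        return missing[0]  # required fields first, in schema order
    for f in fields:
        # Optional: a key present with null means "skipped", so do not re-ask.
        if not f.get("required") and f["id"] not in form_state:
            return f["id"]
    return None
```

The form counts as complete for submit_form when missing_required returns an empty list.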
3.6 Localization
- For title, short_label, intent_instructions, introduction, label, placeholder, submit_label, options[].label: use the user’s language with fallback to "en".
4. Intake Storage (intake folder, partial data)
4.1 Intake Folder
- Location: When INTAKE_PATH is unset or empty, use <project_root>/Agent/intake. Set the INTAKE_PATH env var to override.
- Role: All intake data is written here. No use of the /submissions API in intake mode; the intake folder is the persistence layer.
4.2 File Naming and Lifecycle
- Filename: {form_name}_{session_start_timestamp}.json
  - form_name = form’s id (e.g. contact_form).
  - session_start_timestamp = set as soon as a form is selected (before the first record_field). Format: YYYYMMDD_HHmmss (e.g. 20250125_143022). Stable for the whole session.
- One file per intake session. The file is created on form selection with an initial write (form_state: {}). The same file is then overwritten once per user turn (after all record_field and any submit_form for that turn) so that partial data is always on disk.
4.3 File Contents (Partial and Final)
Each write includes at least:
- form_id: form’s id
- started_at: ISO or same timestamp as in the filename
- updated_at: time of this write
- form_state: { "field_id": value }; partial while collecting, complete when submitted. Values can be null for skipped optional fields.
- completed_at: set only when the form is submitted; omit otherwise.
Example (initial, on form selection):
{
  "form_id": "contact_form",
  "started_at": "2025-01-25T14:30:22",
  "updated_at": "2025-01-25T14:30:22",
  "form_state": {}
}
Example (partial, after two fields):
{
  "form_id": "contact_form",
  "started_at": "2025-01-25T14:30:22",
  "updated_at": "2025-01-25T14:30:45",
  "form_state": {
    "full_name": "John Doe",
    "email": "j@example.com"
  }
}
Example (final, after submit_form):
{
  "form_id": "contact_form",
  "started_at": "2025-01-25T14:30:22",
  "updated_at": "2025-01-25T14:31:10",
  "completed_at": "2025-01-25T14:31:10",
  "form_state": {
    "full_name": "John Doe",
    "email": "j@example.com",
    "message": "Hello, I have a question."
  }
}
4.4 When We Write
- On form selection: As soon as the user has chosen a form (or the only form is chosen), create the intake file, set session_start_timestamp (format YYYYMMDD_HHmmss), and write the initial object: form_id, started_at, updated_at, form_state: {}, no completed_at. The response for that turn must include session_start_timestamp so the client can send it on /query/intake/continue.
- On record_field: The handler only validates and updates form_state in memory; it does not write. The orchestrator in service_formintake does one write at end of the user turn (after all record_field and any submit_form for that turn). The file already exists from the form-selection write. If submit_form runs in that turn, its handler performs the final write (with completed_at) and no separate end-of-turn write is needed.
- On submit_form: Same file; add completed_at and do a final write with the complete form_state.
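All three write points can share one function that overwrites the session file. This is a sketch under assumptions: write_intake is a hypothetical name, and the write-to-temp-then-rename step is an added safeguard against truncated files, not something the design requires.

```python
import json
import os
import tempfile
from datetime import datetime
from pathlib import Path


def write_intake(path: Path, form_id: str, started_at: str, form_state: dict,
                 completed: bool = False, now: datetime = None) -> dict:
    """Overwrite the session's intake file with the current state."""
    stamp = (now or datetime.now()).isoformat(timespec="seconds")
    record = {"form_id": form_id, "started_at": started_at,
              "updated_at": stamp, "form_state": form_state}
    if completed:
        record["completed_at"] = stamp  # only present after submit_form
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: write to a temp file in the same folder, then rename,
    # so a crash mid-write cannot leave a half-written intake file.
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(record, handle, ensure_ascii=False, indent=2)
    os.replace(tmp, path)
    return record
```

The initial write passes form_state={}, the end-of-turn write passes the current state, and submit_form passes completed=True.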
5. User Stories and Requirements
5.1 Happy Path (Single Form)
| # | Actor | Action | Expected outcome |
|---|---|---|---|
| U1 | User | Starts a conversation (chat or voice) and wants to fill the only form in form-depo | Bot introduces the form with introduction[language] and asks for the first required field. |
| U2 | User | Answers a field (e.g. "John Doe" for full_name) | Bot records it, writes partial data to {form_id}_{timestamp}.json, confirms briefly, asks for the next field. |
| U3 | User | Provides all required (and any optional) fields | Bot calls submit_form, writes the final state with completed_at to the same file, and confirms. |
5.2 Form Selection (Multiple Forms)
| # | Actor | Action | Expected outcome |
|---|---|---|---|
| U4 | User | Says something ambiguous (e.g. "I need to send something") and form-depo has several forms | Bot uses intent_instructions to shortlist; if still ambiguous, presents options using short_label (or title) and asks the user to choose. |
| U5 | User | Picks a form (e.g. "Contact" or "Option 1") | Bot loads that form’s schema, creates the intake file {form_id}_{session_start_timestamp}.json with form_state: {}, includes session_start_timestamp in the response, says the form’s introduction, and asks for the first required field. |
| U6 | User | Says something that clearly matches one form’s intent_instructions | Bot chooses that form, creates the intake file, includes session_start_timestamp in the response, gives the introduction, and asks for the first field. |
5.3 Multi-Turn and Continue
| # | Scenario | Requirement |
|---|---|---|
| U7 | User pauses mid-form | Client stores form_id, form_state, and session_start_timestamp; on resume, sends all three so the bot continues from the next empty required field. The intake file for that session remains; we keep overwriting with partial writes when more fields are recorded. |
| U8 | User corrects a previous answer | Bot overwrites via record_field with the same field_id and new value; on the next write, the intake file reflects the correction. |
| U9 | User asks "what did I say for email?" | Bot reads from form_state and answers. |
5.4 Multi-Lingual and Voice
| # | Scenario | Requirement |
|---|---|---|
| U10 | User speaks or types in Farsi | Bot responds in Farsi; uses label["fa"], introduction["fa"], short_label["fa"], intent_instructions["fa"]. |
| U11 | User uses voice (Pipecat/Agent) | In intake mode, the same Agent/Pipecat stack is used; STT/TTS and language selection stay as today. No form UI—conversation only. |
5.5 Validation, Skip, and Errors
| # | Scenario | Requirement |
|---|---|---|
| U12 | Invalid value (e.g. email) | record_field validates; on failure, returns an error and re-asks; no write to the intake file for that value. |
| U13 | User says "skip" for optional field | Bot calls record_field(field_id, null); handler sets form_state[field_id] = null. Next write includes that key with null. |
| U14 | User says "skip" for required field | Bot explains it is required and re-asks. |
| U15 | submit_form but required fields missing | Tool returns { ok: false, missing: [...] }; LLM asks for those; no completed_at write. |
5.6 Abandonment and Edge Cases
| # | Scenario | Requirement |
|---|---|---|
| U16 | User says "cancel" or "never mind" | Bot acknowledges (natural language; no tool). Client clears form_id, form_state, and session_start_timestamp. The intake file stays as last partial write (no completed_at). |
| U17 | User sends chitchat mid-form | Bot answers briefly and returns to the pending field. |
| U18 | User gives multiple fields in one message | LLM may call record_field multiple times; each valid call updates form_state; one write at end of the turn (orchestrator). |
6. Bot Behaviour (System Prompt / Instructions)
6.1 Role and Constraints
- Role: Conversational form intake assistant. Use only the form schema(s) from form-depo. Do not use file_search, RAG, or domain knowledge. In intake mode the only tools are record_field, submit_form, and (if we add it) select_form or equivalent for form selection.
- No form UI: You never describe or render a form. You only ask questions in natural language and record answers.
6.2 Form and State
- Form: id, title, short_label, intent_instructions, introduction, fields (order = asking order). form_state = { field_id: value }. Instructions must let the model derive: missing required fields, next field to ask. Do not inject progress (e.g. "field 3 of 8"); the model may infer it from form_state and fields if it wishes.
6.3 Form Selection (Multiple Forms)
- Use each form’s intent_instructions[language] to decide if the user’s message matches that form.
- If exactly one matches: select it, create the intake file, say introduction, ask the first required field.
- If several match or unclear: list forms by short_label (or title), ask the user to choose. On choice, load that form and proceed as above.
- If none match: ask what the user wants to do and, if possible, map to a form or say we don’t have a matching form.
6.4 First Turn for a Chosen Form (Empty form_state)
- Say the form’s introduction in the user’s language.
- Ask for the first required field using label[language] (and placeholder or description if present).
6.5 When the User Provides a Value
- Unambiguous: Call record_field with field_id and extracted value. The handler updates form_state only; the orchestrator does one write at end of the turn to the intake file. Then ask the next required empty field (or optional, then submit_form when done).
- Ambiguous: Ask a short clarification before calling record_field.
- Multiple fields in one message: Call record_field once per field; multiple record_field per message are allowed; the orchestrator does one write at end of the turn.
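The one-write-per-turn rule can be illustrated with a small turn loop. run_turn, the (name, args) tool-call shape, and the write_partial callback are hypothetical; the real orchestrator in service_formintake may be structured differently.

```python
def run_turn(tool_calls: list, handlers: dict, write_partial) -> list:
    """Run every tool call for one user turn, then write at most once.

    tool_calls: list of (tool name, keyword arguments) pairs from the model.
    handlers: maps a tool name to a callable returning a result dict.
    write_partial: callable that overwrites the intake file with form_state.
    """
    results = []
    submitted = False
    for name, args in tool_calls:
        result = handlers[name](**args)
        results.append(result)
        if name == "submit_form" and result.get("ok"):
            submitted = True  # submit_form already performed the final write
    if not submitted:
        write_partial()  # single end-of-turn write, however many fields changed
    return results
```

In this sketch a turn whose record_field calls all fail still rewrites the file; the rejected values never reach form_state, so the rewrite carries unchanged state.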
6.6 When All Required (and Desired Optional) Are Filled
- Call submit_form. The handler writes the final state with completed_at to the intake file. Model can then give a short confirmation.
6.7 Correction, Skip, Cancel, and Chitchat
- Correction: record_field with same field_id and new value (overwrite). Next write to the intake file has the corrected form_state.
- Skip (optional): Call record_field(field_id, null); handler sets form_state[field_id] = null. Move on to next field.
- Skip (required): Explain and re-ask.
- Cancel: User says "cancel" or "never mind". Acknowledge in natural language; no cancel_form tool. Client clears form_id, form_state, and session_start_timestamp. Intake file remains as last partial write (no completed_at).
- Chitchat: Answer briefly and return to the pending field.
7. Tools: record_field and submit_form
7.1 record_field
- Purpose: Record one field: validate and update form_state. Used for first-time recording, corrections (same field_id, new value → overwrite), and optional skip (value: null → form_state[field_id] = null). Does not write to disk; the orchestrator in service_formintake does one write at end of the user turn (after all tool calls for that turn).
- When: When the user has clearly given a value for a known field (first time or correction), or explicitly skips an optional field (value: null).
- Parameters: field_id, value (value may be null only for optional fields, i.e. a skip).
- Handler:
  - If value is null: only allowed for optional fields. If the field is required: return { ok: false, error: "validation_failed", message: "Cannot skip required field" }. If optional: form_state[field_id] = null; return { ok: true, recorded: field_id }.
  - If value is not null: validate field_id in form_schema.fields and value (e.g. email format). If invalid: return { ok: false, error: "validation_failed", message: "..." }; do not update form_state or write.
  - If valid: form_state[field_id] = value. Return { ok: true, recorded: field_id }. Do not write to the intake file; the service/orchestrator performs one write per turn after all record_field (and any submit_form) for that turn.
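A sketch of the handler logic described above. The function signature, the optional validate callback, and the message for an unknown field are assumptions; the return shapes follow the text.

```python
def record_field(form_schema: dict, form_state: dict, field_id: str, value,
                 validate=None) -> dict:
    """Validate one value and update form_state in memory; never writes to disk."""
    fields = {f["id"]: f for f in form_schema["fields"]}
    field = fields.get(field_id)
    if field is None:
        return {"ok": False, "error": "validation_failed",
                "message": f"Unknown field: {field_id}"}
    if value is None:
        # null means "skip" and is only allowed for optional fields.
        if field.get("required"):
            return {"ok": False, "error": "validation_failed",
                    "message": "Cannot skip required field"}
        form_state[field_id] = None
        return {"ok": True, "recorded": field_id}
    if validate is not None:
        # validate is a hypothetical hook returning (ok, normalized value or message).
        ok, result = validate(field, value)
        if not ok:
            return {"ok": False, "error": "validation_failed", "message": result}
        value = result
    form_state[field_id] = value  # first-time recording and corrections alike
    return {"ok": True, "recorded": field_id}
```

Because corrections are plain overwrites, no separate update tool is needed; calling the handler again with the same field_id replaces the stored value.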
7.2 submit_form
- Purpose: Mark the form complete and write the final state to the intake file.
- When: Only when form_state has every required field.
- Parameters: None.
- Handler:
  - Check all required fields are present and non-empty. If not: return { ok: false, error: "missing_fields", missing: [...] }; do not write completed_at.
  - If valid: set completed_at; write to the same {form_id}_{session_start_timestamp}.json the object with form_id, started_at, updated_at, completed_at, form_state. Return { ok: true, form_submitted: true, message: "<localized success>" }.
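The submit check can be sketched in the same style. The write_final callback and the success_message parameter are hypothetical; they stand in for the final intake write and the localized confirmation text.

```python
def submit_form(form_schema: dict, form_state: dict, write_final,
                success_message: str = "Form submitted.") -> dict:
    """Refuse if required fields are missing; otherwise trigger the final write."""
    missing = [f["id"] for f in form_schema["fields"]
               if f.get("required") and form_state.get(f["id"]) in (None, "")]
    if missing:
        # No completed_at is written; the model should ask for these fields.
        return {"ok": False, "error": "missing_fields", "missing": missing}
    write_final(form_state)  # expected to add completed_at and overwrite the file
    return {"ok": True, "form_submitted": True, "message": success_message}
```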
7.3 Where These Tools Live
- In LLM_full/query/service_formintake.py only. This module is the main intake service: it implements form loading, instructions, record_field, submit_form, and intake-folder writes. Not in the global domain-tools registry. service_vector.py is unchanged and is not used for intake.
8. Intake Mode: Agent and Startup
8.1 Entrypoint
- Script: run_agent.py (or python -m Agent.main). Both should support --FORMS.
- main() in Agent/main.py: On startup, check for --FORMS. If present:
  - Set intake mode (e.g. INTAKE_MODE=True or equivalent).
  - Start the rest of the Agent (WebSocket, Pipecat, etc.) configured so that in intake mode only the intake pipeline runs (no domain /query with RAG). The API (service_formintake) resolves form-depo and intake paths from settings: when FORM_DEPO_PATH or INTAKE_PATH is unset, it defaults to <project_root>/Agent/form-depo and <project_root>/Agent/intake; set the env vars to override. The design is that in intake mode the bot only does form selection and field collection; the API loads forms and writes to the intake folder.
8.2 When --FORMS Is Not Set
- Agent runs as today: domain RAG, Pipecat, no form-depo, no intake folder. No change to existing behaviour.
9. Request and Response (Intake Mode)
In intake mode the Agent typically handles conversations via WebSocket or Pipecat. The exact message shape can match the existing chat/voice protocol; the following is the logical contract for intake.
9.1 Form Selection Phase
- Request: User message (text or STT).
- Response: Either (a) a list of form options asking the user to choose, or (b) direct introduction to the chosen form and the first question. No form_schema needs to be sent by the client; forms come from form-depo.
9.2 Field Collection Phase
- Request: User message. The server must know form_id, form_state, and session_start_timestamp (from the form-selection response or from the client on continue).
- Response: answer, form_id, form_state; when the intake file was created on form selection, also session_start_timestamp (so the client can send it on every continue). When the form is submitted: form_submitted: true, form_data: form_state. The intake file is created on form selection and then overwritten once per user turn (after all record_field calls, or by submit_form when it runs) as described in §4.
9.3 Session and Continuation
- The Agent (or client) must track per session: form_id, form_state, session_start_timestamp (required so we can read/write the same intake file). The client must send session_start_timestamp on every /query/intake/continue (it is returned in the response when the file is first created on form selection).
10. Folder Layout (Summary)
domain-chatbot/
  Agent/
    form-depo/                 # Form definitions (JSON), one file per form
    intake/                    # Intake data: {form_id}_{YYYYMMDD_HHmmss}.json
    main.py                    # Reads --FORMS, switches to intake mode
  LLM_full/
    query/
      service_formintake.py    # Main intake service (record_field, submit_form, intake writes)
      service_vector.py        # Unchanged; handles /query, /query/continue (non-intake)
      models_intake.py         # Intake request/response models
      router.py                # Adds /query/intake, /query/intake/continue
  ...
  run_agent.py                 # Entrypoint; --FORMS in sys.argv passed to Agent.main
- form-depo: input (form schemas).
- intake: output (partial and final intake data, timestamped, prefixed by form name).
- service_formintake.py: main service for intake; does not replace or modify service_vector.py.
11. Detail Design: Files to Add and Modify
This section lists every file to add or modify. The goal is to minimize changes and to avoid breaking existing behaviour: service_vector.py is not modified and continues to handle the existing /query and /query/continue flows. The main service for form intake is service_formintake.py (new): it is used only by the new intake routes and does not replace or import from service_vector.py.
11.1 New Files
| File | Purpose |
|---|---|
| LLM_full/query/service_formintake.py | Main form-intake service. Implements: load forms from FORM_DEPO_PATH; build form-filling and form-selection instructions; on form selection: create intake file with form_state: {}, set session_start_timestamp, include it in the response; handle record_field and submit_form (validation and intake-folder writes); process_intake_query and process_intake_continue. Uses INTAKE_PATH for {form_id}_{session_start_timestamp}.json. Does not import from service_vector.py. No RAG, no file_search. |
| LLM_full/query/models_intake.py | Intake-specific Pydantic models: IntakeQueryRequest, IntakeContinueRequest, and any response helpers. IntakeContinueRequest must include session_start_timestamp (returned when the file is created on form selection). Response shape includes session_start_timestamp when an intake file is created. Keeps LLM_full/query/models.py unchanged. |
| Agent/form-depo/.gitkeep | Ensures Agent/form-depo/ exists in the repo. |
| Agent/form-depo/contact_form.json (optional) | Example form definition for development; can be omitted if not needed. |
| Agent/intake/.gitkeep | Ensures Agent/intake/ exists; actual intake files are created at runtime. |
11.2 Modified Files (Minimal, Non-Breaking)
11.2.1 LLM_full/query/router.py
- Add only; do not change any existing route or handler.
- Add import: from .service_formintake import process_intake_query, process_intake_continue
- Add import: from .models_intake import IntakeQueryRequest, IntakeContinueRequest (or the exact model names used).
- Add two new routes:
  - POST /intake → process_intake_query (with IntakeQueryRequest, Request, Depends(get_current_user)). Path under the existing query_router becomes /query/intake.
  - POST /intake/continue → process_intake_continue (with IntakeContinueRequest, Request, Depends(get_current_user)). Path becomes /query/intake/continue.
- Do not change POST /, POST /stream, POST /continue, POST /continue/stream, or any import from service_vector.
11.2.2 LLM_full/settings.py
- Add two optional settings only; do not change any existing key or required flag.
  - FORM_DEPO_PATH: env_manager.get_str("FORM_DEPO_PATH", required=False, default="")
  - INTAKE_PATH: env_manager.get_str("INTAKE_PATH", required=False, default="")
- Add corresponding @property or accessors. When the env value is unset or empty, the accessor returns the default: <project_root>/Agent/form-depo (resp. <project_root>/Agent/intake), where project_root is the domain-chatbot root (e.g. derived from __file__ or cwd). When the env is set, use that value. Set FORM_DEPO_PATH or INTAKE_PATH only to override the default.
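The accessor behaviour amounts to a single fallback rule. A minimal sketch; resolve_dir is a hypothetical helper, and how project_root is derived is left to the settings module.

```python
from pathlib import Path


def resolve_dir(env_value: str, project_root: Path, default_relative: str) -> Path:
    """Return the override when set, else <project_root>/<default_relative>."""
    value = (env_value or "").strip()
    return Path(value) if value else project_root / default_relative
```

For example, resolve_dir("", root, "Agent/form-depo") yields the default form-depo folder, while a non-empty FORM_DEPO_PATH value is used as given.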
11.2.3 Agent/main.py
- At startup, before create_websocket_server() and create_webrtc_runner(): set INTAKE_MODE = "--FORMS" in sys.argv (or equivalent).
- Pass intake_mode=INTAKE_MODE into create_websocket_server(...) and create_webrtc_runner(...) (or the relevant runner factory). If those functions do not yet accept intake_mode, add an optional parameter intake_mode=False so existing callers remain valid.
- Do not change any other startup or runner logic.
11.2.4 Agent/server/websocket.py
- Add optional parameter intake_mode: bool = False to create_websocket_server(...) (or to the factory that creates the WebSocket app). If the function does not take parameters, introduce an optional intake_mode=False and pass it through to the connection handler that creates the session/bot.
- In the connection handler that creates ChatSessionLite (or the object that calls the backend): pass intake_mode=intake_mode into that constructor.
- Do not change the rest of the WebSocket protocol or existing non-intake behaviour.
11.2.5 Agent/core/session.py
- Add optional parameter intake_mode: bool = False to ChatSessionLite.__init__(...). Pass it through to AventoraChatbot(..., intake_mode=intake_mode).
- Do not change process_message or other method signatures for non-intake callers; only add an optional argument with default False.
11.2.6 LLM/aventora_chatbot_d.py
- Add optional parameter intake_mode: bool = False to AventoraChatbot.__init__(...). Store it on the instance.
- Where the bot calls the API (query_info, query_info_stream, continue/continue_stream, or equivalent): branch on intake_mode. If True, call the intake API (e.g. /query/intake and /query/intake/continue) instead of /query and /query/continue. Use the same auth and session pattern as today; only the URL path (and possibly request body shape for intake) changes.
- Do not change behaviour when intake_mode is False.
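The branch on intake_mode reduces to choosing one of four paths. A sketch only; backend_path is a hypothetical helper and the real client may build URLs differently.

```python
def backend_path(intake_mode: bool, continuing: bool) -> str:
    """Pick the backend route for a first message or a continuation."""
    base = "/query/intake" if intake_mode else "/query"
    return f"{base}/continue" if continuing else base
```

With intake_mode False the paths are exactly the existing /query and /query/continue, which keeps non-intake behaviour untouched.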
11.2.7 LLM/api_client_openai_d.py
- Add new methods (do not change existing query_info, query_info_stream, continue, etc.):
  - query_intake_info(...): POST {BACKEND_URL}/query/intake with intake-specific payload (e.g. name, email, question, language, form_id?, form_state?, session? as needed by models_intake).
  - continue_intake(...): POST {BACKEND_URL}/query/intake/continue with IntakeContinueRequest-style payload.
- Optionally add streaming variants if needed later; the design can assume non-stream intake first.
- AventoraChatbot when intake_mode=True calls query_intake_info and continue_intake instead of query_info and continue.
11.3 Files Explicitly Not Modified
| File | Reason |
|---|---|
| LLM_full/query/service_vector.py | Remains the handler for POST /query and POST /query/continue (and their stream variants). Intake uses service_formintake.py only. No imports from service_vector in service_formintake; no changes to service_vector. |
| LLM_full/query/models.py | QueryRequest and ContinueRequest stay unchanged. Intake models live in models_intake.py. |
| LLM_full/main.py | The new intake routes are on the existing query_router (/query/intake, /query/intake/continue). No new router or include_router change. |
| run_agent.py | sys.argv already contains --FORMS when run as python run_agent.py --FORMS; Agent.main.main() reads it. No change. |
11.4 Optional or Follow-On Wiring
- Pipecat / WebRTC: If the Pipecat pipeline (or WebRTC runner) uses a different code path to call the backend (e.g. Agent/audio/processors.py, Agent/server/webrtc.py), that path must also use /query/intake and /query/intake/continue when intake_mode is True. The same intake_mode flag should be passed from main() into that runner/processor. The exact file(s) to modify depend on where the HTTP call is made; the design assumes it is either shared with ChatSessionLite/AventoraChatbot or receives intake_mode from the same place.
- Streaming for intake: service_formintake can implement process_intake_query_stream and process_intake_continue_stream later; the router can add POST /intake/stream and POST /intake/continue/stream when needed. Not required for the first version.
11.5 Summary Table
| Action | File |
|---|---|
| Add | LLM_full/query/service_formintake.py |
| Add | LLM_full/query/models_intake.py |
| Add | Agent/form-depo/.gitkeep (and optionally contact_form.json) |
| Add | Agent/intake/.gitkeep |
| Modify | LLM_full/query/router.py (new routes and imports only) |
| Modify | LLM_full/settings.py (add FORM_DEPO_PATH, INTAKE_PATH) |
| Modify | Agent/main.py (--FORMS, intake_mode, pass to runners) |
| Modify | Agent/server/websocket.py (intake_mode param, pass to session) |
| Modify | Agent/core/session.py (intake_mode param, pass to bot) |
| Modify | LLM/aventora_chatbot_d.py (intake_mode, use intake API when True) |
| Modify | LLM/api_client_openai_d.py (new query_intake_info, continue_intake) |
| Do not modify | LLM_full/query/service_vector.py, LLM_full/query/models.py, LLM_full/main.py, run_agent.py |
12. Resolved Open Points (All Decided)
| Topic | Question / decision |
|---|---|
| Form-depo path | Decided: When FORM_DEPO_PATH unset, use <project_root>/Agent/form-depo; set env to override. |
| Intake path | Decided: When INTAKE_PATH unset, use <project_root>/Agent/intake; set env to override. |
| Intake file creation | Decided: Create the file and set session_start_timestamp on form selection (before the first record_field). Response includes session_start_timestamp; client must send it on every continue. |
| Write frequency | Decided: One write per user turn, after all record_field and any submit_form for that turn. record_field only updates form_state; the orchestrator writes once at end of turn. submit_form does its own final write when it runs. |
| Multiple record_field per message | Decided: Multiple record_field per message allowed; one write at end of the turn (orchestrator). |
| Corrections | Decided: Overwrite via record_field only. No update_field tool. For corrections, call record_field with the same field_id and the new value; the handler overwrites form_state[field_id]. |
| Cancel | Decided: Natural language + client. No cancel_form tool. LLM recognizes "cancel"/"never mind" and acknowledges. Client clears form_id, form_state, and session_start_timestamp. Intake file stays as last partial write (no completed_at). |
| Skipped optional fields | Decided: Store null in form_state. When the user skips an optional field, set form_state[field_id] = null. Use record_field(field_id, null); handler accepts null for optional fields only. Downstream can distinguish "skipped" vs "never asked"; consumers must handle null. |
| Progress reporting | Decided: Leave to the model. Do not inject "field 3 of 8" or similar. Provide form_schema, form_state, and instructions to ask the next required empty field; the model may infer and mention progress from form_state and fields if it chooses. |
| Agent-to-API wiring | Decided: Agent calls domain-chatbot /query/intake and /query/intake/continue. In --FORMS mode the Agent (WebSocket, Pipecat, etc.) calls these instead of /query and /query/continue. service_formintake in domain-chatbot runs the LLM, tools, and intake writes. No local/simplified LLM loop in the Agent for intake. |
13. Example End-to-End (Multiple Forms, Partial Data)
1) User starts, two forms in form-depo
- User: “I want to send a message.”
- Bot uses intent_instructions; both “Contact” and “Feedback” match. Bot: “Do you want the Contact form or the Feedback form?” (using short_label).
- User: “Contact.”
- Bot loads contact_form, creates Agent/intake/contact_form_20250125_143022.json with form_id, started_at, updated_at, form_state: {}. The response includes session_start_timestamp (e.g. 20250125_143022) so the client can send it on continue. Says introduction and: “What is your full name?”
2) User gives name
- User: “John Doe.”
- Model calls record_field("full_name", "John Doe"). Handler updates form_state = { "full_name": "John Doe" } (does not write). Orchestrator writes once at end of turn to contact_form_20250125_143022.json (partial, no completed_at).
- Bot: “Thanks. What is your email?”
3) User gives email and message
- User: “j@example.com. My message is: I have a question.”
- Model calls record_field("email", "j@example.com") then record_field("message", "I have a question."). Handler updates form_state for each (does not write); form_state now holds all three fields. Model calls submit_form; its handler adds completed_at and writes the final object to contact_form_20250125_143022.json (the write for this turn; no separate orchestrator write when submit_form runs).
- Bot: “Form submitted. Thank you.”
From form selection onward, Agent/intake/contact_form_20250125_143022.json exists on disk; after every user turn it contains up-to-date partial (or finally complete) data.
Add clarifications and decisions in the document as you refine.