6 Commits

herzogflorian
f4fdaab732 Add Open WebUI integration and enhance Streamlit app
- Add Open WebUI scripts (06-09) for server-hosted ChatGPT-like interface
  connected to the vLLM backend on port 7081
- Add context window management to chat (auto-trim, token counter, progress bar)
- Add terminal output panel to file editor for running Python/LaTeX files
- Update README with Open WebUI setup, architecture diagram, and troubleshooting
- Update STUDENT_GUIDE with step-by-step Open WebUI login instructions

Made-with: Cursor
2026-03-02 18:48:51 +01:00
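The context-window management this commit describes (auto-trim plus a token counter) could look roughly like the sketch below. This is a hypothetical reconstruction, not the repo's actual code: the ~4-characters-per-token estimate and the drop-oldest-first policy are assumptions; the real app may use a proper tokenizer.

```python
# Hypothetical sketch of the auto-trim step: keep system messages,
# drop the oldest chat turns until the history fits a token budget.

def estimate_tokens(text: str) -> int:
    """Rough token count: ~4 characters per token (an assumption)."""
    return max(1, len(text) // 4)

def trim_history(messages: list[dict], budget: int) -> list[dict]:
    """Drop oldest non-system messages until the history fits the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    def total(msgs: list[dict]) -> int:
        return sum(estimate_tokens(m["content"]) for m in msgs)

    while rest and total(system + rest) > budget:
        rest.pop(0)  # discard the oldest turn first
    return system + rest
```

The same `estimate_tokens` helper would also drive the token counter and progress bar mentioned in the commit.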
herzogflorian
d59285fe69 Update student guide with full app.py documentation
Add clone/venv setup instructions, feature descriptions for both tabs,
sidebar parameter table, and clarify that files stay local.

Made-with: Cursor
2026-03-02 16:43:21 +01:00
herzogflorian
deee5038d1 Update README to reflect current project state
Add Streamlit app section with setup, usage, and sidebar controls.
Document nightly Docker image requirement, scp workflow for server
sync, and practical troubleshooting tips from setup experience.

Made-with: Cursor
2026-03-02 16:42:33 +01:00
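The scp workflow this commit documents would be along these lines; the host, user, and remote path are placeholders, not values from the repo.

```shell
# Hypothetical sync step: copy the app files to the GPU server.
# Replace user, host, and target directory with your own.
scp app.py requirements.txt user@gpu-server:~/llm-app/
```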
herzogflorian
12f9e3ac9b Add LLM parameter controls to sidebar
Thinking mode toggle, temperature, max tokens, top_p, and presence
penalty sliders in the Streamlit sidebar. Parameters apply to both
chat and file editor generation.

Made-with: Cursor
2026-03-02 16:41:05 +01:00
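The sidebar parameters listed above have to be mapped onto the generation request somewhere. A hypothetical version of that mapping is sketched below; the function name and the thinking-mode mechanism (a Qwen-style `enable_thinking` chat-template flag passed via `extra_body`) are assumptions, not confirmed repo details.

```python
# Hypothetical mapping of sidebar values onto chat-completion parameters,
# shared by the chat tab and the file editor as the commit describes.

def build_request_params(thinking: bool, temperature: float,
                         max_tokens: int, top_p: float,
                         presence_penalty: float) -> dict:
    """Turn sidebar slider/toggle values into request kwargs."""
    params = {
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
        "presence_penalty": presence_penalty,
    }
    if not thinking:
        # Assumption: Qwen-style models accept a chat-template flag to
        # disable the thinking phase; the exact mechanism may differ.
        params["extra_body"] = {
            "chat_template_kwargs": {"enable_thinking": False}
        }
    return params
```

In Streamlit, each value would come from a widget such as `st.sidebar.slider("Temperature", 0.0, 2.0, 0.7)`.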
herzogflorian
9e1e0c0751 Add Streamlit chat app, update container to vLLM nightly
- Add app.py: Streamlit UI with chat and file editor tabs
- Add requirements.txt: streamlit + openai dependencies
- Update vllm_qwen.def: use nightly image for Qwen3.5 support
- Update README.md: reflect 35B-A3B model, correct script names
- Update STUDENT_GUIDE.md: add app usage and thinking mode docs
- Update .gitignore: exclude .venv/ and workspace/

Made-with: Cursor
2026-03-02 16:30:04 +01:00
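The app presumably talks to vLLM's OpenAI-compatible API (the repo uses the `openai` package per requirements.txt). As a dependency-free illustration of what such a request looks like, here is a stdlib-only sketch; the URL, model name wire format, and helper names are assumptions.

```python
# Stdlib sketch of a chat request against vLLM's OpenAI-compatible API.
# Port 7080 is taken from the inference-setup commit below.
import json
import urllib.request

VLLM_URL = "http://localhost:7080/v1/chat/completions"

def build_chat_payload(model: str, messages: list[dict], **sampling) -> bytes:
    """Assemble an OpenAI-compatible chat request body."""
    body = {"model": model, "messages": messages, **sampling}
    return json.dumps(body).encode("utf-8")

def chat(messages: list[dict], model: str = "Qwen3.5-35B-A3B",
         **sampling) -> str:
    """POST to the vLLM server and return the assistant reply.
    Requires the server from the setup scripts to be running."""
    req = urllib.request.Request(
        VLLM_URL,
        data=build_chat_payload(model, messages, **sampling),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

The real app would instead construct an `openai.OpenAI` client with `base_url` pointing at the same endpoint.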
herzogflorian
076001b07f Add vLLM inference setup for Qwen3.5-35B-A3B on Apptainer
Scripts to build container, download model, and serve Qwen3.5-35B-A3B
via vLLM with OpenAI-compatible API on port 7080. Configured for 2x
NVIDIA L40S GPUs with tensor parallelism, supporting ~15 concurrent
students.

Made-with: Cursor
2026-03-02 14:43:39 +01:00
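The serving setup this commit describes (two L40S GPUs, tensor parallelism, OpenAI-compatible API on port 7080) corresponds to a vLLM invocation roughly like the one below. This is one plausible command line, not the repo's actual script; the model path and additional flags may differ.

```shell
# Hypothetical vLLM launch matching the commit's description:
# 2-way tensor parallelism across the L40S GPUs, API on port 7080.
vllm serve Qwen/Qwen3.5-35B-A3B \
    --tensor-parallel-size 2 \
    --port 7080
```

Inside Apptainer, this would run within the container built from `vllm_qwen.def`.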