3 Commits

Author SHA1 Message Date
herzogflorian
12f9e3ac9b Add LLM parameter controls to sidebar
Thinking mode toggle, temperature, max tokens, top_p, and presence
penalty sliders in the Streamlit sidebar. Parameters apply to both
chat and file editor generation.

Made-with: Cursor
2026-03-02 16:41:05 +01:00
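
For illustration, the sidebar parameters named in this commit could be gathered into one set of request arguments shared by chat and file editor generation. This is a minimal sketch, not the code in app.py: the function name `build_generation_kwargs` is hypothetical, and passing the thinking toggle through `chat_template_kwargs` with `enable_thinking` is an assumption based on how Qwen3-family models are commonly served with vLLM.

```python
from typing import Any, Dict


def build_generation_kwargs(
    thinking: bool,
    temperature: float,
    max_tokens: int,
    top_p: float,
    presence_penalty: float,
) -> Dict[str, Any]:
    """Collect sidebar values into keyword arguments for a chat completion call.

    Hypothetical helper; the real app.py may structure this differently.
    """
    return {
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
        "presence_penalty": presence_penalty,
        # Assumption: the thinking toggle is forwarded to the chat template
        # via vLLM's extra request field `chat_template_kwargs`.
        "extra_body": {"chat_template_kwargs": {"enable_thinking": thinking}},
    }


# Example values standing in for the slider and toggle state.
kwargs = build_generation_kwargs(
    thinking=False,
    temperature=0.7,
    max_tokens=1024,
    top_p=0.8,
    presence_penalty=1.5,
)
```

The resulting dictionary could then be unpacked into a call such as `client.chat.completions.create(model=..., messages=..., **kwargs)` from the `openai` package, so both tabs reuse the same settings.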
herzogflorian
9e1e0c0751 Add Streamlit chat app, update container to vLLM nightly
- Add app.py: Streamlit UI with chat and file editor tabs
- Add requirements.txt: streamlit + openai dependencies
- Update vllm_qwen.def: use nightly image for Qwen3.5 support
- Update README.md: reflect 35B-A3B model, correct script names
- Update STUDENT_GUIDE.md: add app usage and thinking mode docs
- Update .gitignore: exclude .venv/ and workspace/

Made-with: Cursor
2026-03-02 16:30:04 +01:00
herzogflorian
076001b07f Add vLLM inference setup for Qwen3.5-35B-A3B on Apptainer
Scripts to build container, download model, and serve Qwen3.5-35B-A3B
via vLLM with OpenAI-compatible API on port 7080. Configured for 2x
NVIDIA L40S GPUs with tensor parallelism, supporting ~15 concurrent
students.

Made-with: Cursor
2026-03-02 14:43:39 +01:00
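
As a rough sketch of the serving setup this commit describes, the launch command for vLLM on port 7080 with tensor parallelism across two GPUs could be assembled as below. The helper `build_serve_command` and the model path placeholder are hypothetical, and the Apptainer wrapping used by the actual scripts is not shown; only the `vllm serve` flags `--port` and `--tensor-parallel-size` are taken from vLLM's CLI.

```python
from typing import List


def build_serve_command(
    model: str,
    port: int = 7080,
    tensor_parallel_size: int = 2,
) -> List[str]:
    """Build an argv list for launching vLLM's OpenAI-compatible server.

    Hypothetical helper; the repository's own scripts may differ.
    """
    return [
        "vllm",
        "serve",
        model,
        "--port",
        str(port),
        # One shard per GPU; 2 matches the two L40S cards in the commit message.
        "--tensor-parallel-size",
        str(tensor_parallel_size),
    ]


# "/models/Qwen3.5-35B-A3B" is a placeholder path, not taken from the repository.
command = build_serve_command("/models/Qwen3.5-35B-A3B")
print(" ".join(command))
```

Building the command as a list rather than a single string keeps it safe to pass to `subprocess.run` without shell quoting issues.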