LLM_Inferenz_Server_1/vllm_qwen.def
herzogflorian 9e1e0c0751 Add Streamlit chat app, update container to vLLM nightly
- Add app.py: Streamlit UI with chat and file editor tabs
- Add requirements.txt: streamlit + openai dependencies
- Update vllm_qwen.def: use nightly image for Qwen3.5 support
- Update README.md: reflect 35B-A3B model, correct script names
- Update STUDENT_GUIDE.md: add app usage and thinking mode docs
- Update .gitignore: exclude .venv/ and workspace/

Made-with: Cursor
2026-03-02 16:30:04 +01:00


Bootstrap: docker
From: vllm/vllm-openai:nightly

%labels
    Author herzogflorian
    Description vLLM nightly inference server for Qwen3.5-35B-A3B
    Version 3.0

%environment
    export HF_HOME=/tmp/hf_cache
    export VLLM_USAGE_SOURCE=production

%post
    apt-get update && apt-get install -y --no-install-recommends git && rm -rf /var/lib/apt/lists/*
    pip install --no-cache-dir "transformers @ git+https://github.com/huggingface/transformers.git@main"
    pip install --no-cache-dir "huggingface_hub[cli]"

%runscript
    exec python3 -m vllm.entrypoints.openai.api_server "$@"

%help
    Apptainer container for serving Qwen3.5-35B-A3B via vLLM (nightly).
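Since `%runscript` execs vLLM's OpenAI-compatible API server, any OpenAI-style client can talk to the running container. A minimal stdlib-only sketch is below; the base URL assumes vLLM's default port 8000, and the model name is taken from the commit message (both are assumptions — match them to the `--model`/`--port` flags you pass at launch):

```python
import json
from urllib import request

# Assumption: vLLM's OpenAI-compatible server listening on its default port.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(model: str, prompt: str, temperature: float = 0.7) -> dict:
    """Assemble a /v1/chat/completions payload in the OpenAI chat format."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def chat(model: str, prompt: str) -> str:
    """POST a chat request to the running container and return the reply text."""
    payload = json.dumps(build_chat_request(model, prompt)).encode()
    req = request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (requires the container to be running):
#   chat("Qwen3.5-35B-A3B", "Say hello in one sentence.")
```

With the image built (`apptainer build vllm_qwen.sif vllm_qwen.def`) and started via `apptainer run --nv vllm_qwen.sif --model <model-path>`, calling `chat(...)` returns the assistant's reply; the Streamlit app in `app.py` talks to the same endpoint through the `openai` package.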