herzogflorian 076001b07f Add vLLM inference setup for Qwen3.5-35B-A3B on Apptainer
Scripts to build container, download model, and serve Qwen3.5-35B-A3B
via vLLM with OpenAI-compatible API on port 7080. Configured for 2x
NVIDIA L40S GPUs with tensor parallelism, supporting ~15 concurrent
students.

Made-with: Cursor
2026-03-02 14:43:39 +01:00
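
The workflow the commit describes can be sketched roughly as below; this is an ops/config sketch, not the repo's actual scripts. The port (7080) and GPU count (2) come from the commit message, while the container image tag, the Hugging Face repo id, and the local paths are assumptions:

```shell
# Build the Apptainer image from the official vLLM OpenAI-server
# Docker image (tag is an assumption).
apptainer build vllm.sif docker://vllm/vllm-openai:latest

# Download the model weights into models/ (repo id is an assumption
# based on the model name in the commit subject).
huggingface-cli download Qwen/Qwen3.5-35B-A3B \
    --local-dir models/Qwen3.5-35B-A3B

# Serve with an OpenAI-compatible API on port 7080, sharding the
# model across the two L40S GPUs via tensor parallelism.
apptainer exec --nv vllm.sif \
    vllm serve models/Qwen3.5-35B-A3B \
    --tensor-parallel-size 2 \
    --port 7080
```

Once the server is up, clients can use the standard OpenAI endpoints, e.g. `curl http://localhost:7080/v1/models` to list the served model.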

.gitignore

# Apptainer container image (large binary)
*.sif
# Logs
logs/
# Model weights (downloaded separately)
models/
# HuggingFace cache
.cache/
# macOS
.DS_Store