palmetto-apptainer-libcuda-fix
Community
Repair CUDA driver visibility in Palmetto jobs.
Author: KwongFuk
Version: 1.0.0
Installs: 0
System Documentation
What problem does it solve?
This skill repairs container-side CUDA driver visibility for Palmetto Slurm jobs in which Apptainer or vLLM workers fail with a missing libcuda error even though the host provides /lib64/libcuda.so.1. It applies the repair to the job launcher and adds a smoke validation step to run before the job is resubmitted.
Core Features & Use Cases
- Patch the sbatch launcher to create a scratch-local CUDA compatibility directory and symlink the host libcuda into the container path used by the worker.
- Bind the compatibility path into the container and adjust LD_LIBRARY_PATH for the preflight and real runs.
- Add a lightweight preflight and a short smoke test to verify that libcuda.so and libcuda.so.1 can be loaded by a Python process inside the container before resubmitting the real job.
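The first two steps above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual patch: the function names, the `cuda-compat` directory name, and the container path passed in are all hypothetical, and the sketch assumes the symlink target is also reachable from inside the container, which the description does not spell out. The only Apptainer feature relied on is the `--bind src:dest` option.

```python
from pathlib import Path


def build_compat_dir(host_lib: str, scratch_dir: str) -> Path:
    """Create a scratch-local directory holding libcuda symlinks."""
    compat = Path(scratch_dir) / "cuda-compat"  # hypothetical directory name
    compat.mkdir(parents=True, exist_ok=True)
    for name in ("libcuda.so.1", "libcuda.so"):
        link = compat / name
        # Replace stale links left over from an earlier job.
        if link.is_symlink() or link.exists():
            link.unlink()
        link.symlink_to(host_lib)
    return compat


def container_settings(compat: Path, container_path: str, current_ld: str = ""):
    """Return the bind arguments and LD_LIBRARY_PATH value for the container."""
    # Apptainer's --bind takes "source:destination".
    bind_args = ["--bind", f"{compat}:{container_path}"]
    # Put the compatibility path first so it wins over other entries.
    ld_path = container_path if not current_ld else f"{container_path}:{current_ld}"
    return bind_args, ld_path
```

A launcher could call `build_compat_dir("/lib64/libcuda.so.1", scratch)` once per job and reuse the returned settings for both the preflight and the real run, so the two invocations see the same library path.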
Quick Start
Apply the patch to the sbatch launcher and run the included smoke test before resubmitting the original job.
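For orientation, a smoke check of this kind might look like the sketch below. It is not the test bundled with the skill; the function names are hypothetical. It only uses `ctypes.CDLL`, which raises `OSError` when the dynamic loader cannot find or open a library.

```python
import ctypes


def try_load(names=("libcuda.so", "libcuda.so.1")):
    """Try to load each library; map name -> error message (None on success)."""
    results = {}
    for name in names:
        try:
            ctypes.CDLL(name)
            results[name] = None
        except OSError as exc:
            results[name] = str(exc)
    return results


def smoke_test(names=("libcuda.so", "libcuda.so.1")) -> int:
    """Print one line per library and return a shell-style exit code."""
    results = try_load(names)
    for name, error in results.items():
        print(f"{name}: {'ok' if error is None else 'FAILED: ' + error}")
    return 0 if all(e is None for e in results.values()) else 1
```

Run inside the container with the same bind and LD_LIBRARY_PATH settings as the real job, for example by ending the script with `raise SystemExit(smoke_test())`, so that a nonzero exit status stops the launcher before the real job is resubmitted.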
Dependency Matrix
Required Modules
None required
Components
Standard package
Claude Code Installation
Recommended: Let Claude install it automatically. Copy and paste the text below into Claude Code.
Please help me install this Skill:
Name: palmetto-apptainer-libcuda-fix
Download link: https://github.com/KwongFuk/codex-skills/archive/main.zip#palmetto-apptainer-libcuda-fix
Please download this .zip file, extract it, and install it in the .claude/skills/ directory.