Package the Agent, Not Just the Model: Native Skills Support in KitOps
Introduction
AI agents have improved significantly on autonomous coding tasks over the past year, and as a result, tools like Claude Code, Cursor, and OpenCode have become central to how software teams work. These tools extend the capabilities of AI models and connect them to external resources through agent skills, MCP servers, plugins, and more. The problem is that an agent's behavior is now a combination of model weights, prompts, skill files, and MCP configs, all of which need to be versioned and tracked together if you want reproducible behavior and any hope of debugging failures in production.
KitOps has long enabled AI/ML teams to package model weights, code, and datasets as versioned, OCI-compliant artifacts called ModelKits. The new KitOps v1.12.0 release extends that native support to agentic components (skills, configs, etc.), which are now tracked as their own layers in your ModelKit. In this tutorial, we'll walk through packaging a PDF Agent Skill, an OpenCode config, and a model GGUF as a versioned ModelKit, pushing it to Jozu Hub, and using it locally with a Jozu Rapid Inference Container (RIC).
Why Package Agents in a Model Registry?
AI/ML projects are made up of many interdependent assets: model weights, datasets, agent skills, MCP configs, prompts, licenses, and docs. All of these evolve independently across experiments and fine-tuning runs. When something breaks in production (and with agents, something always eventually breaks), your team needs to be able to roll back and answer three questions: what exact combination of assets was deployed, what changed between the version that worked and the one that didn't, and who made the change.
Jozu Hub is a model registry built around that problem. Unlike general-purpose container registries, it's designed for the realities of AI projects: large binary weights, paired datasets, prompts, code, and now agent skills and MCP configs that all change on different cadences and need to be tracked together.
At the heart of Jozu Hub is the ModelKit, an OCI-compliant packaging format that bundles every asset in your AI/ML project into a single versioned package. Think of it like a Git commit for your entire ML project, but one that container registries already understand natively, so it slots into the same storage, access control, and CI/CD infrastructure your team already runs.
Key benefits
- Selective unpacking: Pull only what you need (just the weights, just the dataset, just the skills). Faster pipelines, less compute overhead, less data movement.
- No duplication for shared assets: Common datasets or configs can be reused across ModelKits without bloating storage.
- Version control across the whole project: All artifacts are versioned and bundled together, with familiar registry-native tags (:latest, :staging, :prod).
- Standards-based and portable: OCI-compliant, so ModelKits work with any container registry. No vendor lock-in.
Governance and security
For organizations in regulated industries, Jozu Hub builds in:
- Chain of Custody: Full audit trail for every ModelKit, traceable from creation through production deployment. Critical for incident response and compliance with regulations and frameworks like the EU AI Act, GDPR, and SOC 2.
- Security Scanning and Policy Gates: Automatic scans for vulnerabilities, license issues, and policy violations, with configurable rules to block deployments, require manual approval, or enforce specific policies.
- Tamper-Evident Packaging: ModelKits can be cryptographically signed with Cosign. Any unauthorized change to weights, datasets, or any other artifact is detected immediately through signature verification.
Jozu Rapid Inference Containers (RICs)
Once your LLM ModelKit is in Jozu Hub, you don't need to write custom inference API code to serve the model. Jozu generates Rapid Inference Containers (RICs) automatically. RICs are pre-configured, optimized inference containers built directly from your ModelKit and ready to serve in production.
- Zero configuration: No Dockerfiles, no server config, no inference API setup. Jozu handles it based on your model format and metadata. GGUF gets a llama.cpp RIC; Safetensors gets a vLLM RIC.
- Optimized for inference: Tuned for inference workloads, no dependency bloat.
- Kubernetes-ready: Ships with deployment YAML for dropping into existing clusters.
- Faster spin-up: Pre-built and cached, so launching a new model server is significantly faster than building a custom inference image from scratch. Internal Jozu benchmarks measure 7x faster than typical custom-built inference containers.

Testing Our Local Agent's PDF Form-Filling Skill
Setup and Installation Guide
The hardware requirements, tooling, and the process for importing the Qwen 3.5 GGUF model from HuggingFace, along with installation of the KitOps CLI, Python dependencies, and OpenCode, have already been covered in a previous tutorial (Running a Local Coding Agent with OpenCode and Jozu Rapid Inference Container (RICs)).
Once you've completed that setup, are authenticated with Jozu Hub, and have downloaded the model GGUF, return here to continue with the next steps.
Packaging and Publishing as a ModelKit to Jozu Hub
In this tutorial, we'll use the PDF agent skill published by Anthropic. You can download the ZIP file here: mcpservers.org/agent-skills/anthropic/pdf. After downloading, extract the contents and move them to your project directory.
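If you'd rather script the extraction step, here is a minimal Python sketch. The archive name pdf.zip is a stand-in for whatever filename your browser saved, and the helper is purely illustrative:

```python
import zipfile
from pathlib import Path

def extract_skill(archive: str, dest: str = ".") -> list[str]:
    """Extract a downloaded skill ZIP into the project directory
    and return the top-level names it contained."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)
        # Collect the top-level directory/file names inside the archive
        return sorted({name.split("/")[0] for name in zf.namelist()})

# Example (assumes the archive was saved as pdf.zip in the project root):
# extract_skill("pdf.zip")
```

After extraction you should see a pdf/ directory containing the skill's SKILL.md alongside its supporting files.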
Next, we need an opencode.json config file in the project directory to point OpenCode at the Qwen 3.5 GGUF model we'll be running locally. Create the file and paste in the following:
```json
{
  "$schema": "https://opencode.ai/config.json",
  "provider": {
    "llama_server": {
      "npm": "@ai-sdk/openai-compatible",
      "name": "llama-server (local)",
      "options": {
        "baseURL": "http://localhost:8000/v1"
      },
      "models": {
        "qwen3.5-9B-q4_0.gguf": {
          "name": "qwen3.5-9b-q4_0-gguf (local model)",
          "limit": {
            "context": 128000,
            "output": 65536
          }
        }
      }
    }
  }
}
```
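A typo in this file silently breaks model discovery in OpenCode, so a quick sanity check that the config parses and points at the expected local endpoint can save debugging time. This helper is purely illustrative, not part of OpenCode:

```python
import json

def check_opencode_config(path: str = "opencode.json") -> str:
    """Parse the OpenCode config and return the local provider's baseURL,
    raising if the expected keys are missing."""
    with open(path) as f:
        cfg = json.load(f)
    provider = cfg["provider"]["llama_server"]
    base_url = provider["options"]["baseURL"]
    # At least one model must be declared for the provider to be usable
    assert provider["models"], "no models declared for llama_server"
    return base_url

# Example: check_opencode_config() should return "http://localhost:8000/v1"
```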
At this point your working directory should contain the Qwen 3.5 GGUF (from the previous tutorial), the PDF agent skill, and the OpenCode config. Now we're ready to package everything as a ModelKit.
KitOps v1.12.0 adds native agent skill detection to kit init: when a directory contains a SKILL.md file, KitOps automatically generates a Kitfile that packages that directory as a prompt layer inside the ModelKit, with the skill's name and description pulled directly from the SKILL.md frontmatter.
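To get a feel for what that detection reads, here is a rough, illustrative re-implementation (not the KitOps source) of pulling the name and description out of a SKILL.md frontmatter block:

```python
def parse_skill_frontmatter(text: str) -> dict:
    """Extract key/value pairs from the YAML-style frontmatter block
    delimited by '---' lines at the top of a SKILL.md file.
    Simplified sketch: real frontmatter values may nest or span lines."""
    lines = text.splitlines()
    assert lines and lines[0].strip() == "---", "missing frontmatter"
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":   # closing delimiter ends the block
            break
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta

sample = """---
name: pdf
description: Read, create, merge, and fill PDF files.
---
# PDF skill instructions follow...
"""
# parse_skill_frontmatter(sample)["name"] == "pdf"
```

The name and description ending up in the generated Kitfile come straight from this block, so it's worth keeping them accurate when you author your own skills.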
To see this in action, run kit init . in your working directory and KitOps will generate and display the Kitfile automatically:

For this tutorial, though, we'll use the KitOps Python package instead, which gives us more control over how assets are organized and labeled inside the ModelKit.
```python
from kitops.modelkit.kitfile import Kitfile

# Create a new Kitfile
kitfile = Kitfile()

# Set basic metadata
kitfile.manifestVersion = "1.0"
kitfile.package = {
    "name": "Qwen3.5-9B-Q4_0-GGUF",
    "version": "1.0",
    "description": "Kitfile for Qwen3.5-9B-Q4_0-GGUF"
}

# Configure model information
kitfile.model = {
    "name": "Qwen3.5-9B-Q4_0-GGUF",
    "path": "Qwen3.5-9B-Q4/Qwen3.5-9B-Q4_0.gguf",  # path to the GGUF file
    "version": "1.0",
    "license": "Apache 2.0",
    "description": "Q4_0 GGUF file"
}

# The agent skill is packaged as a prompt layer
kitfile.prompts = [
    {
        "path": "pdf",
        "description": "Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill."
    }
]

kitfile.code = [
    {
        "path": "opencode.json",
        "description": "OpenCode Config file"
    }
]

kitfile.docs = [
    {
        "path": "registration_form.pdf",
        "description": "Sample Registration form to test form filling PDF skill"
    }
]

# You can also add other information like below:
# kitfile.datasets = [
#     {
#         "name": "dataset",
#         "path": "data/sample.csv",
#         "description": "full dataset",
#         "license": "Apache 2.0"
#     }
# ]
# For more information on what you can add, see https://kitops.org/docs/pykitops/how-to-guides/

# Save the Kitfile locally (the Kitfile specifies how the ModelKit will be bundled)
kitfile.save("Kitfile")
```
Then we publish the ModelKit to Jozu Hub:
```python
from kitops.modelkit.manager import ModelKitManager, UserCredentials

# Configure the ModelKit manager
# Note: the email prefix of jack123@gmail.com is just jack123
modelkit_tag = "jozu.ml/email-prefix-here/qwen3.5-9b-q4_0-gguf:latest"  # all lowercase

manager = ModelKitManager(
    working_directory=".",
    modelkit_tag=modelkit_tag,
    user_credentials=UserCredentials("full-email-here", "password-here", namespace="qwen3.5-9b-q4_0-gguf")
)

# Assign your Kitfile
manager.kitfile = kitfile

# Pack and push to Jozu Hub
manager.pack_and_push_modelkit(save_kitfile=True)
```
Go to your Jozu Hub Repository to confirm the push:

Now that the ModelKit is on Jozu Hub, we can pull it from any machine or environment, and we don't have to pull the whole thing. To simulate this, create a new folder and unpack only the prompts, code, and docs:
```shell
mkdir new_folder
cd new_folder
kit pull jozu.ml/email-prefix-here/qwen3.5-9b-q4_0-gguf:latest
kit unpack jozu.ml/email-prefix-here/qwen3.5-9b-q4_0-gguf:latest --filter=prompts,code,docs
```
Note: kit pull only fetches the ModelKit manifest, not the actual contents. The optional --filter flag on kit unpack tells KitOps which layers to download and extract, so you only pull what you actually need.
Your new folder should now contain only the pdf/ skill directory, opencode.json, and registration_form.pdf. The model weights are absent because we filtered them out:

Launching the LLM on a Jozu Rapid Inference Container
Run the command below to start the container:
```shell
docker run -it --rm -p 8000:8000 --gpus all "jozu.ml/email-prefix-here/qwen3.5-9b-q4_0-gguf/llama-cpp-cuda:latest"
```
Notes:
- The email prefix of jack123@gmail.com is just jack123.
- You don't need any extra configuration for Jozu RICs to use your GPU. The container automatically detects and utilizes all available GPU nodes (as shown in the screenshot below).
You should see output similar to the screenshots below:


Connecting to OpenCode and Filling a Registration PDF Form
First, we need to put the PDF agent skill somewhere OpenCode can find it. The exact location depends on the agent framework, but for OpenCode the convention is a .opencode/skills/ directory in your project root. Create that folder structure and move the PDF skill inside it so the final path is .opencode/skills/pdf.
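The move itself is plain filesystem plumbing; a small, hypothetical helper (not part of OpenCode or KitOps) could do it:

```python
import shutil
from pathlib import Path

def install_skill(skill_dir: str, project_root: str = ".") -> str:
    """Move an unpacked skill directory into OpenCode's
    .opencode/skills/ folder and return the final path."""
    dest = Path(project_root) / ".opencode" / "skills" / Path(skill_dir).name
    dest.parent.mkdir(parents=True, exist_ok=True)  # create .opencode/skills/
    shutil.move(skill_dir, dest)
    return str(dest)

# Example: install_skill("pdf") moves the skill to .opencode/skills/pdf
```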

Now start opencode, type /models in the input, move down (↓), and select qwen3.5-9b-q4_0-gguf (local model).


Finally, as shown in the demo below, we can ask the agent to fill out registration_form.pdf and it will use the PDF skill to do that.
Conclusion
Recap of What Was Built
In this tutorial, we walked through packaging and versioning a PDF agent skill, an OpenCode config, and a model GGUF as a single ModelKit, pushing it to Jozu Hub, and spinning it up locally with a Jozu Rapid Inference Container. We then used it in OpenCode to fill a sample registration PDF form.
Additional Resources / What's Next
If you want to go deeper into local LLM deployments, AI coding agents, and production-grade AI/ML workflows with Jozu Hub, the resources below are a great place to continue:
- Jozu Blog (best practices, architecture deep dives, and production guidance): https://jozu.com/blog
- KitOps Documentation (detailed guides on ModelKits, versioning, and ML artifact management): https://kitops.org/docs
- Jozu Hub Docs (RICs, audits, governance, deployments, etc.): https://jozu.ml/docs
- OpenCode Documentation (providers, configuration, and advanced agent workflows): https://opencode.ai/docs
Head over to jozu.com to explore the public model catalog and start running your own local AI workflows today.
Note: This blog has an accompanying GitHub repository that contains the code, PDF agent skill, OpenCode config, and the sample registration PDF form used for this tutorial. Check it out at https://github.com/Studio1HQ/agents_with_kitops/.