
What’s Wrong with Your Kserve Setup (and How to Fix It)

Brad Micklea


TL;DR: You're storing ML models in S3 and deploying them with Kserve. That's fine until someone asks: "Who deployed this model? Is it secure? Can we rollback?" Then you realize you have no answers. Jozu fixes this by adding the security and governance layer enterprises need:

  • Kserve model versioning
  • Kserve security scanning
  • Kserve rollback strategies
  • Kubernetes ML governance

The Setup Everyone Uses

Here's how 90% of teams deploy models to Kserve:

  1. Train model
  2. Upload to S3
  3. Point Kserve at S3 path
  4. Deploy

It's simple and it works. Until it doesn't.

When Things Go Wrong

Below are just two real-world examples of what happens when you treat models like static files instead of critical software.

Kserve Rollback Problems

Last week, a customer told me their production model started returning biased predictions. They needed to rollback.

Problem: They had 47 model files in S3. No versioning scheme, just filenames. No deployment history. They guessed wrong twice before finding the right version.

S3 Rollback Process:

  1. Query S3 objects by prefix
  2. Parse metadata from each object (because you can't necessarily trust the filename)
  3. Find version with desired accuracy metrics
  4. Update InferenceService manifest
  5. Apply changes via Kserve
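Steps 1-3 of that hunt can be sketched in Python. The object listing below is a made-up stand-in for what `boto3`'s `list_objects_v2` and `head_object` calls would return; the keys, dates, and metadata are all hypothetical:

```python
# Hypothetical sketch of the manual S3 rollback hunt.
# `objects` stands in for real list_objects_v2 + head_object results.
objects = [
    {"Key": "models/fraud-detector/model_final.pkl",
     "LastModified": "2025-01-10", "Metadata": {"accuracy": "0.91"}},
    {"Key": "models/fraud-detector/model_v2.pkl",
     "LastModified": "2025-02-03", "Metadata": {"accuracy": "0.87"}},
    {"Key": "models/fraud-detector/model_v2_final.pkl",
     "LastModified": "2025-03-15", "Metadata": {"accuracy": "0.94"}},
]

def find_rollback_candidate(objects, min_accuracy):
    """Filter by object metadata (never trust the filename), newest first."""
    good = [o for o in objects
            if float(o["Metadata"].get("accuracy", 0)) >= min_accuracy]
    good.sort(key=lambda o: o["LastModified"], reverse=True)
    return good[0]["Key"] if good else None

print(find_rollback_candidate(objects, 0.90))
```

Even this toy version shows the core problem: the answer depends on metadata someone remembered to attach, not on an immutable deployment record.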

The Process You Want:

  1. Select last good version in Jozu Hub by tag or SHA
  2. Apply changes via Kserve

Pro Tip
If you're still stuck with S3, enforce a strict naming convention with semantic versions and commit SHAs (e.g., fraud-detector_v1.2.3_sha256abcd.pkl). It won't fix governance or mutability, but it makes emergency rollbacks faster than relying only on file timestamps or guesswork.
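A minimal sketch of that convention in Python, assuming the `<name>_v<semver>_sha<digest>.pkl` pattern from the tip above (the example keys are invented):

```python
import re

# Parse the suggested <name>_v<semver>_sha<digest>.pkl naming convention
# so an emergency rollback can pick versions deterministically.
PATTERN = re.compile(
    r"^(?P<name>.+)_v(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"_sha(?P<sha>[0-9a-f]+)\.pkl$")

def parse_key(key):
    m = PATTERN.match(key)
    if not m:
        return None
    return {"name": m["name"],
            "version": (int(m["major"]), int(m["minor"]), int(m["patch"])),
            "sha": m["sha"]}

def previous_version(keys, current):
    """Return the newest parsed entry strictly older than `current`."""
    parsed = [p for p in map(parse_key, keys) if p]
    older = [p for p in parsed if p["version"] < parse_key(current)["version"]]
    return max(older, key=lambda p: p["version"]) if older else None

keys = [
    "fraud-detector_v1.2.3_sha256abcd.pkl",
    "fraud-detector_v1.2.2_sha1199ffee.pkl",
    "fraud-detector_v1.1.9_shaaa00bb11.pkl",
]
print(previous_version(keys, "fraud-detector_v1.2.3_sha256abcd.pkl")["version"])
```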

Security Risks in Kserve Deployments

Another team discovered their model contained a dependency with a known CVE. It had been running in production for three months. They had no way to know which other models had the same vulnerability.

S3 Investigation Process:

  1. Hope you find the vulnerability before it goes to production...
  2. Extract dependencies from S3 file metadata
  3. Query external CVE database
  4. Scan S3 bucket for impacted models
  5. Add impacted models, CVE info, and S3 URL to a spreadsheet
  6. Cross-reference with deployment history to see which went to production
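The cross-referencing in steps 2-4 amounts to joining per-model dependency lists against a vulnerability feed. A toy sketch, with all model URLs, packages, and version bounds invented for illustration:

```python
# Cross-reference per-model dependency lists (however you extracted them)
# against packages flagged by a CVE database. Everything here is made up.
model_deps = {
    "s3://models/fraud-detector/model_v2.pkl": {"numpy": "1.21.0", "pyyaml": "5.3.1"},
    "s3://models/churn/model.pkl": {"numpy": "1.26.4"},
}

# e.g. "pyyaml below 5.4 is affected" as reported by a vulnerability feed.
# String comparison is a stand-in; use a real version parser in practice.
vulnerable = {"pyyaml": lambda v: v < "5.4"}

def impacted_models(model_deps, vulnerable):
    rows = []
    for url, deps in model_deps.items():
        for pkg, version in deps.items():
            if pkg in vulnerable and vulnerable[pkg](version):
                rows.append({"model": url, "package": pkg, "version": version})
    return rows

for row in impacted_models(model_deps, vulnerable):
    print(row)
```

That spreadsheet in step 5 is exactly this output, assembled by hand, with no guarantee the inputs were complete.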

The Process You Want:

  1. Do nothing special because models should be automatically scanned

Vulnerable models shouldn't get to production, but if they do, you want scan results, approvals, and audit logging so you can quickly rollback and prevent future issues.

What You Need for Enterprise AI Workloads

We designed Jozu after struggling with production operations for our AI/ML projects on Kubernetes:

Comparison: S3 + Kserve versus Jozu + Kserve

Capability | S3 + Kserve (Today) | Jozu + Kserve (Enterprise-Ready)
Versioning | Filenames (model_v2_final.pkl), mutable and inconsistent | Immutable versions with semantic tags + SHA digests
Security Scanning | None; manual CVE checks at best | Automatic scans for CVEs, code injection, and LLM-specific risks
Governance | Anyone with Kserve access can deploy anything | Policy enforcement, approvals, environment restrictions
Auditability | No history, no lineage | Full audit trails: who deployed what, when, and why
Rollback | Error-prone manual file hunting in S3 | Deterministic rollback to any previous version
Compliance | No controls, no reports | HIPAA, GDPR, SOC2, NIST-ready with attested artifacts
Integration | Manual | Works with all MLOps and DevOps tools

There are a lot of big MLOps claims right now, so let's be specific about what's missing:

1. Security Scanning

You need automatic scanning for:

  • CVEs in ML model dependencies
  • Code injection vulnerabilities
  • Common LLM risks (e.g., the OWASP Top 10)

Without scanning, you're deploying blind.

Pro Tip
Even without a dedicated model security layer, you can integrate dependency scanning into your CI/CD pipeline (e.g., pip-audit, safety, or trivy). It's manual and brittle when models are pulled directly from S3, but it's better than deploying blind.
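A sketch of the CI gate half of that tip: parse the scanner's JSON report and fail the step if anything is flagged. The JSON shape below roughly follows pip-audit's `--format json` output in recent versions, but check your version's docs before relying on it; the package names and vulnerability ID are invented:

```python
import json

# Fail a CI step when the dependency scanner reports vulnerabilities.
# Report shape and the "EXAMPLE-0001" ID are illustrative, not real output.
report = json.loads("""
{"dependencies": [
  {"name": "requests", "version": "2.31.0", "vulns": []},
  {"name": "pyyaml", "version": "5.3.1",
   "vulns": [{"id": "EXAMPLE-0001", "fix_versions": ["5.4"]}]}
]}
""")

def failing_packages(report):
    """Return (package, [vuln ids]) for every dependency with findings."""
    return [(d["name"], [v["id"] for v in d["vulns"]])
            for d in report["dependencies"] if d["vulns"]]

bad = failing_packages(report)
if bad:
    print(f"CI gate: blocking deploy, vulnerable packages: {bad}")
```

In a pipeline you'd exit non-zero when `bad` is non-empty; the brittle part is making sure the scanned environment actually matches what the model pulled from S3 will run with.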

2. Real Versioning

Not model_v2_final_REALLY.pkl in S3.

You need:

  • Immutable versions with semantic tags and SHA digests
  • Cryptographic signatures so you know who / what signed off
  • Tamper-proof attestations for security scan results and human / pipeline approvals
  • Clear lineage from training to production

S3 objects are mutable. If someone changes a model, accidentally or intentionally, you may not know until it's too late.
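One way to catch silent mutation is to pin a content digest at deploy time and re-check it before serving. A minimal sketch with stdlib `hashlib` (the byte strings stand in for real model files):

```python
import hashlib

# Detect silent mutation of a stored model by pinning its content digest
# at deploy time and re-checking before serving.
def digest(model_bytes: bytes) -> str:
    return "sha256:" + hashlib.sha256(model_bytes).hexdigest()

deployed = b"model-weights-v2"          # bytes fetched at deploy time
pinned = digest(deployed)               # record this in your deploy manifest

tampered = b"model-weights-v2-PATCHED"  # someone overwrote the object
assert digest(deployed) == pinned       # unchanged content verifies
assert digest(tampered) != pinned       # mutation detected
print("digest check passed")
```

This is the primitive behind immutable, digest-addressed versions: the address is the hash, so a changed artifact is a different artifact by construction.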

3. Deployment Control

Right now, anyone with Kserve access can deploy any model to production. No checks. No approval. No policies.

You need:

  • Security gates (only passed models should go through)
  • Approval workflows
  • Environment restrictions for PII, HIPAA, NIST and other regulated workloads
  • Audit trails
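Those four requirements reduce to a gate that runs before every deploy. A toy sketch; the policy fields and model record are illustrative, not any particular product's API:

```python
# Toy deployment policy gate. Field names are invented for illustration.
PROD_POLICY = {"require_scan_pass": True,
               "require_approval": True,
               "blocked_tags": {"pii"}}

def can_deploy(model, policy):
    """Return (allowed, reason) for a model record against a policy."""
    if policy["require_scan_pass"] and model["scan"] != "pass":
        return False, "security scan has not passed"
    if policy["require_approval"] and not model["approvals"]:
        return False, "no approval recorded"
    if policy["blocked_tags"] & set(model["tags"]):
        return False, "model is restricted from this environment"
    return True, "ok"

model = {"scan": "pass", "approvals": ["alice"], "tags": ["fraud"]}
print(can_deploy(model, PROD_POLICY))
```

The point of the sketch is where it runs: in the platform, on every deploy, with the decision written to the audit trail, rather than as a checklist people are trusted to follow.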

4. Debugging Capability

When production fails, you need answers fast:

  • What version is deployed?
  • What was the last good version we can rollback to?
  • What changed from the previous version?
  • Who approved deployment?
  • What were the scan results?
  • What's the full lineage?

S3 gives you none of this.
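With a deployment audit log, most of the questions above become one-line lookups. A sketch with invented log entries:

```python
# Once every deployment is recorded, debugging questions become lookups.
# The log entries here are invented for illustration.
audit_log = [
    {"version": "v2.1.2", "deployed_at": "2025-03-01", "approved_by": "alice",
     "scan": "pass", "status": "healthy"},
    {"version": "v2.1.3", "deployed_at": "2025-03-15", "approved_by": "bob",
     "scan": "pass", "status": "degraded"},
]

def current_version(log):
    return log[-1]["version"]

def last_good_version(log):
    healthy = [e for e in log if e["status"] == "healthy"]
    return healthy[-1]["version"] if healthy else None

print(current_version(audit_log), "-> rollback to", last_good_version(audit_log))
```

"Who approved it" and "what were the scan results" are just the other fields on the same record; without the record, every one of these is an investigation.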

Pro Tip
Store inference request/response samples alongside each model version. It's far from perfect, but future you will appreciate it when debugging production drift.
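A minimal sketch of that tip, writing numbered JSON samples into a per-version directory (the layout and field names are one possible choice, not a standard):

```python
import json
import tempfile
from pathlib import Path

# Persist a few request/response samples next to each model version so
# production drift can be compared against known-good behaviour later.
def save_sample(root: Path, version: str, request: dict, response: dict) -> Path:
    d = root / version
    d.mkdir(parents=True, exist_ok=True)
    n = len(list(d.glob("sample_*.json")))          # next sample index
    path = d / f"sample_{n:04d}.json"
    path.write_text(json.dumps({"request": request, "response": response}))
    return path

root = Path(tempfile.mkdtemp())
p = save_sample(root, "v2.1.3", {"amount": 120.0}, {"fraud_score": 0.03})
print(p.name)
```

In production you'd sample a small fraction of traffic rather than everything, and ship the files to the same store that holds the model version they belong to.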

5. Native Kserve Integration

All of this needs to work through Kserve's existing storage initializer mechanism. No changes to your InferenceService specs beyond updating the storageUri.

Security and governance need to happen transparently in the background.

If it relies on manual effort or non-standard integrations, it will be forgotten or broken.

How Jozu Solves This

Jozu is purpose-built to provide exactly these capabilities. It's not just a model registry - it's a security and governance layer designed specifically for production ML deployments.

Jozu has helped organizations in finance, healthcare, telecommunications, government, and logistics to reduce their AI project delivery time by >41% while adding the missing security and governance layers they needed.

Here's how Jozu transforms your Kserve infrastructure:

Industry standard ModelKits - Jozu uses KitOps ModelKits which are built on a CNCF standard and leverage OCI Artifacts behind the scenes. ModelKits are immutable packages containing model, code, datasets, and configurations. Each one is versioned, signed, and scanned.

Security happens automatically - Every ModelKit pushed to Jozu is scanned for vulnerabilities. Failed scans block deployment. No manual process needed.

Governance becomes enforceable - Set policies: "Production requires security scan pass + manager approval." Jozu enforces it. Non-compliant deployments fail.

Debugging becomes possible - Full audit trail. Version diffs. Deployment history. When something breaks, you have data, not guesses.

The Technical Details

Integration is straightforward. Kserve already supports custom storage initializers. You just add Jozu's:

apiVersion: serving.kserve.io/v1alpha1
kind: ClusterStorageContainer
metadata:
  name: jozu-storage
spec:
  container:
    name: storage-initializer
    image: ghcr.io/kitops-ml/kitops-kserve:latest
    # Point to the latest container or pin to a SHA digest
    env:
    - name: JOZU_REGISTRY_URL
      value: "https://your-jozu-instance.com"
  supportedUriFormats:
  - prefix: jozu://

Now your InferenceService specs change slightly:

Before (S3):

spec:
  predictor:
    model:
      storageUri: "s3://models/fraud-detector/model.pkl"
      # No versioning, no scanning, no governance

After (Jozu):

spec:
  predictor:
    model:
      storageUri: "jozu://fraud-detector:v2.1.3"
      # Versioned, scanned, cached, governed

That's it. Same Kserve. Better security.

What Happens During Deployment

  1. Kserve creates pod with init container
  2. Init container authenticates with Jozu
  3. Verifies signatures and security status
  4. Unpacks ModelKit
  5. Model serves as normal
  6. Jozu workflows can automatically update the tags for production & rollback

The security and governance happen transparently. Your data scientists don't change their workflow. Your deployments get faster and smoother. Your security team stops worrying.

The Payoff

Faster incident response: Model acting weird? Check the diff between current and previous version. See exactly what changed and rollback if needed.

Compliance ready: Auditor wants deployment history? Here's every model, who deployed it, when, and what security checks passed. All in a downloadable (and readable) audit log.

Reliable rollbacks: One command to rollback to any previous version. No guessing. No breaking production repeatedly.

Security visibility: Dashboard shows which models have vulnerabilities. Fix them before hackers find them.

Getting Started

Two paths: one for production, one for hobbyists.

Production Setup

Deploy self-hosted Jozu:

  • Full security scanning
  • Deployment policies
  • Audit logging
  • Model comparisons
  • SLA support


Quick Test for Hobbyists

Try it with the free tier at jozu.ml (note that it has no security scanning, policies, or audit logging - self-hosted Jozu is required for those features).

One Last Question

Your models make critical decisions. They handle sensitive data. They affect revenue.

Why are you treating them with less security than your regular code?

Kserve handles serving beautifully. But beautifully serving insecure, unversioned, ungoverned models doesn't win you a pat on the back.

Jozu fixes the security and governance. Kserve handles the serving. Together, you get enterprise-ready ML.

Book a Jozu demo to see how it integrates with your Kserve setup.

