Hugging Face on JFrog Artifactory: An Enterprise Guide (and What Changes in June 2026)

Published May 8, 2026

A practical guide for IT, security, and ML platform teams running Hugging Face models behind a JFrog Artifactory proxy — covering the legacy → Machine Learning repository layout migration before June 2026, why proxy environments hit HTTP 429 rate limits, and when Hugging Face Enterprise Plus and Model Gateway are the right answer.

TL;DR — JFrog Artifactory can proxy Hugging Face Hub for caching, scanning, and governance, but it inherits the rate limits of whatever Hub identity you configure on the remote repository, and its Xet protocol implementation is surface-level and misses Xet's deduplication benefits — in practice it nearly doubles your storage footprint. Before June 2026, every legacy "Hugging Face" repository in Artifactory needs to be migrated to the new "Machine Learning" repository layout. For enterprises with serious AI workloads, Hugging Face Enterprise Plus provides higher rate limits, organizational SSO/SCIM identity, audit logs, and Model Gateway — a Hugging Face–native internal model registry that solves the gated-model permission problem (Llama, Gemma, Mistral) at the org level and delivers true content-addressed storage. The most resilient architecture pairs Artifactory as the universal artifact perimeter with a Hugging Face Enterprise Plus organization providing identity, governance, and the model-distribution layer.

Why this guide exists
How Artifactory supports Hugging Face today: package types and repositories
Configuring huggingface_hub and HF_ENDPOINT for Artifactory
Why proxy environments hit HTTP 429 rate limits on Hugging Face
Why personal access tokens are the wrong answer for Artifactory
How Xet protocol works through Artifactory (and where the limits are)
The June 2026 forced migration to the Machine Learning repository layout
Why Hugging Face Enterprise Plus is the right tier for proxy environments
Model Gateway: the Hugging Face–native internal model registry
When to use what: Artifactory vs. Hugging Face Enterprise Plus
Practical checklist for IT and security teams
FAQ

Why this guide exists

If you're a large enterprise, your developers want to use open models from Hugging Face. Your security team wants every artifact entering the company to pass through a controlled, scanned, audited proxy. For many organizations, the answer to both is the same tool: JFrog Artifactory, the universal artifact repository they already use for Docker images, Python packages, npm, Maven, and the rest of the software supply chain.

JFrog has invested meaningfully in Hugging Face support. You can proxy public Hugging Face models, cache them locally, scan them with JFrog Xray, and bundle them into release artifacts alongside the rest of your software. For many use cases, this is genuinely useful.

But there are real limits to what an artifact-manager-as-proxy can do for ML, and those limits are getting more visible as model sizes grow, AI adoption accelerates, and the Hugging Face platform itself evolves. This guide walks through how to use JFrog Artifactory with Hugging Face today, what changes in June 2026 with the forced migration to the new Machine Learning repository layout, and where Hugging Face Enterprise Plus — including the new Model Gateway — fits in the architecture.

Part 1 — How JFrog Artifactory supports Hugging Face today

The two repository package types you need to understand

When a JFrog admin creates a repository for Hugging Face content in Artifactory, they pick from two package-type families:

The legacy "Hugging Face" package type. This is the original integration, available since Artifactory 7.77.x (late 2023). It mirrors the Hugging Face Hub structure (revisions, branches, tags) and works with the standard huggingface_hub Python library and CLI.

The new "Machine Learning" package type. Introduced in Artifactory 7.111.1, this is a format-agnostic ML repository that stores Hugging Face content alongside other ML formats (PyTorch, ONNX, .pkl, .joblib, .pth, .cbm). It is paired with the FrogML SDK and is the only layout that supports Xet protocol, virtual Hugging Face repositories, and the dataset workflow at full fidelity. As of Artifactory 7.111.1, all newly-created Hugging Face repositories use this new Machine Learning repository layout by default.

For each, you can create three kinds of repositories:

Local — for models your team builds or fine-tunes and wants to host privately.
Remote — a cache that proxies https://huggingface.co (the only allowed upstream URL).
Virtual — a single endpoint that aggregates multiple local and remote repos. Virtual Hugging Face repositories require all members to use the new Machine Learning layout, and they resolve only via the snapshot_download API — not via library calls like from_pretrained().

Configuring `huggingface_hub` and `HF_ENDPOINT` to point at Artifactory

Once a Hugging Face repository exists in Artifactory, your data scientists configure the standard huggingface_hub client to point at Artifactory instead of huggingface.co:

export HF_ENDPOINT=https://<your-artifactory>/artifactory/api/huggingfaceml/<repo-name>
export HF_TOKEN=<artifactory-token>

After that, code like AutoModel.from_pretrained("meta-llama/Llama-3.1-8B") resolves through Artifactory rather than directly from huggingface.co. JFrog also ships a jf hf CLI that wraps huggingface_hub.snapshot_download and HfApi.upload_folder with build-info collection — useful for ML pipeline traceability. JFrog openly documents that only download and upload operations are supported in jf hf; native hf commands like auth, cache, repo, spaces, collections, endpoints, jobs, models, papers, and repo-files are not wrapped.
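The same configuration can be applied from Python, which is convenient in notebooks and CI jobs. The sketch below is a minimal illustration rather than a reference setup from JFrog or Hugging Face. The host `artifactory.example.com`, the repository name `hf-remote`, the `ARTIFACTORY_TOKEN` variable, and the helper names are all hypothetical. It assumes `huggingface_hub` reads `HF_ENDPOINT` when it is first imported, so the variables are set before that import happens.

```python
import os


def artifactory_hf_endpoint(base_url: str, repo_name: str) -> str:
    """Build the HF_ENDPOINT value for an Artifactory Hugging Face repository.

    Follows the URL pattern shown above:
    https://<your-artifactory>/artifactory/api/huggingfaceml/<repo-name>
    """
    return f"{base_url.rstrip('/')}/artifactory/api/huggingfaceml/{repo_name}"


def configure_hub_client(base_url: str, repo_name: str, artifactory_token: str) -> str:
    """Point huggingface_hub at Artifactory for the current process."""
    endpoint = artifactory_hf_endpoint(base_url, repo_name)
    os.environ["HF_ENDPOINT"] = endpoint
    os.environ["HF_TOKEN"] = artifactory_token  # an Artifactory token, not a Hub token
    return endpoint


def download_snapshot(repo_id: str, revision: str = "main") -> str:
    """Download a model snapshot through the configured endpoint."""
    # Imported lazily so the environment variables above are already in place.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, revision=revision)


if __name__ == "__main__":
    # Hypothetical host and repository name; the token comes from the environment.
    configure_hub_client(
        "https://artifactory.example.com",
        "hf-remote",
        os.environ.get("ARTIFACTORY_TOKEN", ""),
    )
    print(os.environ["HF_ENDPOINT"])
```

Calling `configure_hub_client(...)` once at process start, before any other Hugging Face import, keeps the endpoint choice in one place instead of relying on each developer's shell profile.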

Where this works well

Before getting to the limits, here's where JFrog Artifactory works well as a Hugging Face proxy:

Universal artifact management. If your organization already has Artifactory for Docker, npm, PyPI, and Maven, adding Hugging Face models to the same plane is operationally clean.
Caching and supply-chain protection. Caching public models protects you from upstream deletion or modification.
Security scanning. JFrog Xray can scan models for malicious content (the JFrog Security Research team regularly publishes findings on malicious models on the public Hub).
Curation policies. JFrog Curation blocks models by license, CVE, or policy.
Release bundling. Models become part of immutable Release Bundles alongside the rest of your software.

For organizations whose ML usage is modest and whose primary concern is supply-chain security, this can be sufficient. The interesting questions arise as ML usage scales — and as we'll see in the next sections, the deeper you look at Artifactory's Hugging Face integration, the more it becomes clear that several of the headline features (Xet protocol, gated-model handling, organizational identity) are surface-level rather than architectural.

Why proxy environments hit HTTP 429 rate limits on Hugging Face

This is the issue that most often surprises enterprise teams running Hugging Face through Artifactory. Symptoms range from mysterious failures during training runs to error logs full of 429 Client Error: Too Many Requests. The Hugging Face Forums are full of these reports.

The Hugging Face Hub enforces rate limits across three buckets:

API requests — model search, repo creation, user management, and other Hub API endpoints.
Resolve requests — every URL containing /resolve/, which is what the open-source libraries (transformers, datasets, vLLM, llama.cpp, LM Studio, ollama, etc.) use to download model and dataset files.
Specific user actions — repo creation rates, dataset upload rates, etc.

Limits are calculated over rolling 5-minute windows and tied to the authenticated identity making the request. A standard developer hitting huggingface.co directly almost never sees these limits — their pattern is bursty and small.

A proxy environment is fundamentally different. When 500 developers, every CI job, every training run, every Kubernetes pod cold-start, and every cached-but-revalidating snapshot all flow through a single Artifactory remote repository, they share a single Hugging Face identity at the upstream — the token configured on the remote repo. From Hugging Face's perspective, that identity is making thousands of requests where a normal user would make a handful. The bucket fills up. Resolve calls start returning HTTP 429 Too Many Requests. Your developers see Artifactory failing to fetch a model they "know works on hf.co."
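To make the shared-identity effect concrete, here is a small simulation of a rolling 5-minute window. It is a sketch only: the quota of 1000 requests is a hypothetical number chosen for illustration, not a published Hub limit, and the class and function names are invented for this example.

```python
from collections import deque


class RollingWindowCounter:
    """Count requests seen in the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float = 300.0) -> None:
        self.window_seconds = window_seconds
        self._timestamps = deque()  # request times, oldest first

    def record(self, now: float) -> int:
        """Record one request at time `now` and return the in-window count."""
        self._timestamps.append(now)
        # Drop requests that have aged out of the rolling window.
        while self._timestamps and self._timestamps[0] <= now - self.window_seconds:
            self._timestamps.popleft()
        return len(self._timestamps)


def first_throttled_request(request_times, quota, window_seconds=300.0):
    """Return the index of the first request that would exceed `quota`, or None."""
    counter = RollingWindowCounter(window_seconds)
    for index, now in enumerate(request_times):
        if counter.record(now) > quota:
            return index
    return None


if __name__ == "__main__":
    QUOTA = 1000  # hypothetical per-identity budget, NOT a published Hub limit
    # One developer: 20 file requests spread over a minute.
    solo = [i * 3.0 for i in range(20)]
    # Proxy identity: 500 clients each pulling 20 files within the same minute.
    shared = sorted(i * 3.0 + c * 0.001 for c in range(500) for i in range(20))
    print(first_throttled_request(solo, QUOTA))    # None
    print(first_throttled_request(shared, QUOTA))  # 1000
```

The single developer never approaches the budget, while the same per-client behavior multiplied by 500 exhausts it within the first minute.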

Three implications:

Per-developer Hugging Face PRO upgrades don't help the proxy. Even if every developer has their own PRO account, the proxy identity is a different account and only its tier matters.
The choice of identity for the proxy matters enormously. A free-tier personal account is the worst case. A Team or Enterprise organization-issued token is much better. An Enterprise Plus organization with IP allowlisting is best.
Workload patterns matter too. Cache-warming jobs, frequent metadata revalidation, and large multi-shard model pulls (think 70B-parameter models split across many .safetensors shards) all consume Resolve budget faster than a human would. Rolling out a new model to many training pods simultaneously can momentarily look indistinguishable from abuse.

You can monitor your rate-limit posture on the Hugging Face billing page: three live gauges show your current usage in each bucket. If any bar is regularly red, the proxy identity is undersized for the workload.
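On the client side, bursts can be smoothed by retrying 429 responses with backoff. This does not raise the proxy identity's budget; it only keeps many jobs from retrying in lockstep. The sketch below is generic: `RateLimited` is a hypothetical stand-in for whatever HTTP error your download call raises on a 429, and the delay values are arbitrary defaults.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 error raised by a download call."""

    def __init__(self, retry_after=None):
        super().__init__("429 Too Many Requests")
        self.retry_after = retry_after  # seconds, if the server supplied a hint


def fetch_with_backoff(fetch, max_attempts=5, base_delay=2.0, max_delay=120.0,
                       sleep=time.sleep):
    """Call `fetch()` and retry on RateLimited with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RateLimited as err:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the 429 to the caller
            if err.retry_after is not None:
                delay = err.retry_after
            else:
                # Exponential backoff with jitter so pods do not retry in lockstep.
                delay = min(max_delay, base_delay * 2 ** attempt)
                delay *= random.uniform(0.5, 1.0)
            sleep(delay)
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without actually waiting.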

Why personal access tokens are the wrong answer

It's tempting for an admin to spin up a personal Hugging Face account, generate a user access token, paste it into the Artifactory remote repository config, and call it done. We see this constantly. Don't do this.

Single point of failure. If the admin leaves the company, the token is theirs. Rotating it requires another individual to volunteer their account.
Audit trail is owned by an individual. All organization-level traffic appears, on the Hugging Face side, as the actions of one person. There is no organizational paper trail for SOC2, ISO 27001, or internal investigations.
No fine-grained scope. Personal tokens have either read or write access to everything that user can see; they cannot be scoped to "only the models we approve."
No approval workflow. Free-tier accounts cannot enforce approval policies on tokens. Anyone with an email address can create a fresh account, generate a token, and start pulling models — bypassing whatever process you thought was in place.
Lower rate limits. A personal free-tier identity has the lowest Hub rate limits available, which is exactly the wrong choice for proxy traffic.

The right answer is a token issued by a Hugging Face organization with an enterprise plan, ideally fine-grained and subject to admin approval policies before it can be used in production.
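One way to check which identity currently sits behind a proxy token is to ask the Hub directly. The sketch below wraps `HfApi.whoami` from `huggingface_hub`. The response field names it reads (`type`, `name`, `orgs`, `auth.accessToken.role`) are assumptions about the shape of that response and should be verified against your installed library version, and the "personal token" heuristic is illustrative only.

```python
def describe_token_identity(whoami: dict) -> dict:
    """Summarize who a Hub token belongs to, from a whoami-style response."""
    # Assumed response shape; missing fields simply come back as None or [].
    access_token = whoami.get("auth", {}).get("accessToken", {})
    return {
        "account": whoami.get("name"),
        "account_type": whoami.get("type"),
        "orgs": [org.get("name") for org in whoami.get("orgs", [])],
        "token_name": access_token.get("displayName"),
        "token_role": access_token.get("role"),
    }


def looks_like_personal_token(summary: dict) -> bool:
    """Heuristic: a user account with no org membership, or a broad read/write token."""
    return summary["account_type"] == "user" and (
        not summary["orgs"] or summary["token_role"] in ("read", "write")
    )


def fetch_identity(token: str) -> dict:
    """Query the Hub for the identity behind `token` (requires network access)."""
    from huggingface_hub import HfApi  # lazy import: only needed for the live call

    # Target huggingface.co explicitly in case HF_ENDPOINT points at Artifactory.
    api = HfApi(endpoint="https://huggingface.co")
    return describe_token_identity(api.whoami(token=token))
```

Running `fetch_identity` against the token stored on each Artifactory remote gives a quick first pass at the inventory; anything flagged by `looks_like_personal_token` deserves a manual look.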

How Xet protocol works through Artifactory (and why it doesn't deliver Xet's benefits)

Xet is Hugging Face's content-addressable storage backend for large files. The architectural point of Xet is global deduplication: files are split into content-defined chunks, chunks are grouped into xorbs (immutable, hash-addressed objects), and a chunk that appears in 100 different models — say, a tokenizer config that's identical across a model family — is physically stored once. Combined with the hf_xet Python client, this is dramatically faster and dramatically cheaper than Git LFS for the multi-gigabyte files modern models require.

JFrog Artifactory has added Xet "support," but the implementation does not deliver Xet's actual benefits. We'll explain what we found below — IT and platform teams evaluating Artifactory's Xet support should understand exactly what they're getting before sizing storage.

The documented constraints

JFrog's own documentation lists three constraints on Xet:

Xet is supported only on the new Machine Learning repository layout. The legacy Hugging Face layout cannot use Xet.
Xet is supported only for remote repositories. Local Artifactory Hugging Face repositories — the ones you use to host private fine-tunes — do not benefit from Xet at all.
Xet does not work with download redirection. If you've configured your Artifactory remote to redirect downloads (a common pattern for offloading bandwidth), Xet is disabled for that repo.

These are the published limits. The deeper issue is what happens inside the Xet "support" that is shipped.

What we observed: Xet without deduplication

We took a close look at how Artifactory actually stores Xet content under the hood. Three findings:

1. No deduplication. Xorbs are stored under paths like models/{repo}/{rev}/model/.xet/{file-path}/xorb_<hash>_<range>. The xorb hash is in the filename, but it is not used as the cache index. The cache is keyed by the file path that originated the request. The same xorb that should appear once and be shared across two files — or two repositories, or two revisions of the same model — is instead stored separately for each requesting file. The whole architectural point of Xet (one chunk stored once, served everywhere) is silently discarded.

2. Range-keyed, not xorb-keyed. The on-disk filename is something like xorb_<hash>_0-62796437. That's not a complete xorb; it's a byte slice of the xorb sized to whatever range one client happened to ask for. A different client that requests a different byte range of the same logical xorb produces a different filename and a fresh cache entry. The hash in the filename is decorative — the cache cannot answer the question "do I already have this xorb?", only "do I already have this exact range request from a previous client?".

3. Double storage. Files are cached both as flat files in model/ (the Git/LFS-style cache) and as range-sliced pseudo-xorbs in model/.xet/<filename>/. For a 125 MB .tflite file, we measured roughly 125 MB + 115 MB of xorb-named slices = 240 MB of storage to serve 125 MB of model. The Xet path is also no faster on warm reads: repeated identical requests took ~5 s each, which suggests every call re-validates upstream rather than serving from a true content-addressed local store.

The verdict: Artifactory's Xet implementation mimics the API surface (/reconstructions, xet-read-token, xorbs/) and mimics the on-disk naming (xorb_<hash>_<range>), but underneath it's a per-file proxy where each chunk is an isolated blob tagged with the path of the originating file. You get none of Xet's actual benefits — no global deduplication, no chunk reuse across files or repos, no warm-cache speedup — and you pay roughly double the storage cost on top.
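The difference between the two keying strategies can be shown with a toy model. This is a simplified sketch of the behavior described above, not Artifactory's or Hugging Face's actual code: SHA-256 stands in for the real Xet hash, and the class names, file paths, and sizes are invented for illustration.

```python
import hashlib


def xorb_hash(xorb: bytes) -> str:
    """Content hash identifying a xorb (SHA-256 here, purely for illustration)."""
    return hashlib.sha256(xorb).hexdigest()


class ContentAddressedCache:
    """Keyed by content hash: one xorb is stored once, whoever asks for it."""

    def __init__(self):
        self.blobs = {}

    def put(self, file_path: str, xorb: bytes, byte_range: tuple) -> None:
        # File path and requested range do not affect the key.
        self.blobs[xorb_hash(xorb)] = xorb

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.blobs.values())


class PathAndRangeKeyedCache:
    """Keyed by requesting file path plus byte range, as in the findings above."""

    def __init__(self):
        self.blobs = {}

    def put(self, file_path: str, xorb: bytes, byte_range: tuple) -> None:
        start, end = byte_range
        # The hash appears in the key but cannot deduplicate across paths or ranges.
        self.blobs[(file_path, xorb_hash(xorb), byte_range)] = xorb[start:end]

    def stored_bytes(self) -> int:
        return sum(len(blob) for blob in self.blobs.values())


if __name__ == "__main__":
    shared_xorb = bytes(1000)  # one xorb referenced by two different model files
    requests = [
        ("model-a/weights.safetensors", shared_xorb, (0, 1000)),
        ("model-b/weights.safetensors", shared_xorb, (0, 1000)),
        ("model-b/weights.safetensors", shared_xorb, (0, 500)),
    ]
    cas, per_path = ContentAddressedCache(), PathAndRangeKeyedCache()
    for path, xorb, byte_range in requests:
        cas.put(path, xorb, byte_range)
        per_path.put(path, xorb, byte_range)
    print(cas.stored_bytes())       # 1000
    print(per_path.stored_bytes())  # 2500
```

Three requests for the same underlying xorb cost 1000 bytes in the content-addressed store and 2500 bytes in the path-and-range-keyed one, which is the pattern behind the storage inflation measured above.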

A subtler point: where the chunks actually live

Even when the implementation is well-behaved, there's an architectural detail enterprise architects should understand. In the Xet protocol, the chunk data lives on Hugging Face's CAS service, not in Artifactory. When hf_xet resolves a Xet-enabled file, the client requests a short-lived Xet access token from the Hub, then talks directly to cas-server.xethub.hf.co for the actual chunk data. Artifactory proxies the auth and metadata flows; the heavy data flow goes Hugging Face → client.

This is fine when it works. It is not a path you fully control end-to-end, and it depends on the Hub identity configured on the Artifactory remote having sufficient rate-limit budget, which loops back to the 429 problem above. We've also seen Xet client compatibility issues: Artifactory had to ship a fix for the /v1/ CAS API path prefix introduced in hf_xet 1.3.0. Expect this kind of compatibility maintenance to be ongoing.

Practical recommendation for current Artifactory users

If you're running Artifactory with Hugging Face today and storage cost matters, the practical recommendation is to disable Xet on the client side:

export HF_HUB_DISABLE_XET=1

This forces huggingface_hub to fall back to the standard LFS-style download path, which Artifactory caches efficiently as a single flat file. You give up nothing meaningful — Artifactory's Xet path doesn't deliver Xet's benefits anyway — and you cut your storage footprint roughly in half. For on-prem deployments where storage is a real cost line, this matters.

If you want true Xet semantics — global deduplication across your entire model fleet, content-addressed chunks served from a local store at LAN speed — that requires a Hugging Face–native infrastructure layer. That's Model Gateway, discussed in Part 2.

The June 2026 forced migration to the Machine Learning repository layout

If you have any Hugging Face repository in Artifactory that was created before Artifactory 7.111.1, it must be migrated by June 2026 to the new Machine Learning repository layout. After that date, the legacy Hugging Face layout is deprecated and full functionality is no longer guaranteed.

What changes

The repository's internal storage layout converts from the legacy structure (mirroring Hugging Face Hub) to the new ML structure (format-agnostic, FrogML-aware, with model-manifest.json per model).
Xet protocol becomes available on remote repositories that have migrated.
Virtual Hugging Face repositories become possible — they require all backing repos to be on the new layout.
The new layout supports a richer set of ML package types beyond just Hugging Face.

What you should know before you migrate

The migration is one-way for practical purposes. JFrog provides a restore_layout REST API for emergencies, but the documentation explicitly warns: "if you migrate to the new layout and then add packages to the repository, if you choose to restore the old layout the newly added packages will be deleted."
Federated and replicated repositories cannot mix layouts. Machine Learning repositories cannot federate with legacy Hugging Face repositories — the layouts are not compatible. For multi-site enterprises, this means coordinating a synchronous cutover: every topology member must be on Artifactory 7.111.x or later, replication and federation must be paused with empty queues, and every instance must be migrated before federation resumes.
The migration window will see HTTP traffic spikes. Cache-warming, re-indexing, and re-validation against huggingface.co will increase your Hub Resolve and API request volume during the transition. This is exactly when rate limits bite hardest.

What to do now

If you're a large enterprise running JFrog Artifactory with Hugging Face repositories, you should be planning the legacy → Machine Learning layout migration now, with:

An inventory of every Hugging Face remote, local, and federated repo.
A clear identity strategy for the Hub-side traffic that will spike during cutover (this is the moment to upgrade to Hugging Face Enterprise or Enterprise Plus if you haven't).
A coordinated cutover plan for any federated/replicated topology.
A rollback plan that accepts the constraint that post-migration writes will be lost on rollback.
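The inventory step in the list above can be partly automated. The sketch below lists repositories through Artifactory's `GET /artifactory/api/repositories` REST endpoint and groups the Hugging Face ones by kind. The exact `packageType` string Artifactory reports for Hugging Face repositories is an assumption here (the filter matches any value containing "huggingface"), and the listing does not by itself tell you which layout a repository uses, so that still has to be confirmed in the Artifactory UI.

```python
import json
import urllib.request


def list_repositories(base_url: str, token: str) -> list:
    """Fetch the repository list from Artifactory's REST API (requires network)."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/artifactory/api/repositories",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def hugging_face_inventory(repositories: list) -> dict:
    """Group Hugging Face repositories by kind (local, remote, virtual, federated)."""
    inventory = {}
    for repo in repositories:
        package_type = str(repo.get("packageType", "")).lower().replace(" ", "")
        # Assumption: Hugging Face package types contain "huggingface" in their name.
        if "huggingface" not in package_type:
            continue
        kind = str(repo.get("type", "unknown")).lower()
        inventory.setdefault(kind, []).append(repo.get("key"))
    return inventory
```

For example, `hugging_face_inventory(list_repositories("https://artifactory.example.com", token))` (hypothetical host) returns a dictionary such as `{"remote": [...], "local": [...]}` that can seed the cutover plan.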

Part 2 — Why Hugging Face Enterprise Plus is the right tier for proxy environments

Artifactory is excellent at being an artifact manager. It is not Hugging Face. The features that matter most for an enterprise running a sustained Hugging Face workload — high rate limits, organizational identity, audit logs, governed model access, gated-model permission management — live on the Hugging Face side, and most of them require a paid Hugging Face plan.

Higher rate limits, by design

Hugging Face's published rate limits scale by plan. Free and PRO accounts get the lowest tiers. Team and Enterprise organizations get progressively higher buckets across all three categories (API, Resolve, specific user actions). According to the Hugging Face Network Security documentation:

"Enterprise Plus automatically gives your users the highest rate limits possible for every action."

If your Artifactory proxy hits 429 errors on a regular basis (and most enterprises with serious AI adoption do), this is the single most direct fix. Issue the Artifactory remote repository's upstream token from a Hugging Face Enterprise Plus organization, and the proxy identity moves to the highest rate-limit tier Hugging Face offers.

There's an additional Enterprise Plus feature worth knowing about: Higher Hub Rate Limits with IP allowlisting. If your organization defines its corporate outbound IP ranges in Hugging Face's Network Security settings, you unlock the highest HTTP rate limits available. This is exactly the pattern an Artifactory deployment with controlled egress fits into.

Organizational identity that an enterprise can actually manage

Beyond rate limits, Hugging Face Enterprise and Enterprise Plus replace the personal-token failure mode with proper organizational identity:

Single Sign-On (SSO) — Enterprise plans support SAML and OIDC. Enterprise Plus adds Managed SSO, where your IdP becomes the only authentication path; users cannot create personal Hugging Face accounts that bypass it.
SCIM provisioning — automatic user lifecycle management. Enterprise Plus's SCIM manages account creation, profile updates, and deactivation; when an employee leaves, their access disappears automatically.
Token approval policies — admins can require approval before any new fine-grained token can be used against the organization.
Audit Logs — every membership change, repository action, token approval, and resource group change is logged with attribution. Combined with Artifactory's own audit trail, this completes the compliance picture.
Resource Groups — fine-grained, per-team scoping of which repos a token or service account can reach.
Storage Regions — for EU data residency under GDPR.

For an Artifactory proxy specifically, the practical pattern is: create a service account inside an Enterprise organization, give it a fine-grained token scoped to "read public Hub plus your private org repos," subject the token to admin approval, and use that token as the upstream identity on Artifactory's Hugging Face remote repository. Every gap from Part 1 (tokens tied to a departing admin, no audit trail, no scope, low rate limits) closes at once.

Model Gateway: the Hugging Face–native way to manage internal Hugging Face models

This is the feature most directly relevant to organizations that today are using Artifactory primarily to cache and govern Hugging Face models — and it's where Hugging Face Enterprise Plus has something Artifactory cannot replicate.

Model Gateway is a feature in preview for select Enterprise Plus organizations. It is, in essence, Hugging Face's own internal model registry, designed for exactly the use case enterprises currently bend Artifactory to fit:

Model caching in your local company storage, with true Xet semantics. Once a model is fetched through Model Gateway, it lives in your registry and is no longer dependent on the upstream remaining available. If a model author deletes a repo on huggingface.co, you still have it. Crucially — and unlike Artifactory's Xet implementation described in Part 1 — Model Gateway is a real content-addressed store. Chunks are deduplicated globally across your entire model fleet, not stored separately per file or per request. A 70 GB Llama variant that shares 95% of its weights with another fine-tune costs you roughly 5% additional storage, not another 70 GB. This is the point of Xet, delivered in the only place architecturally suited to deliver it: a Hugging Face–native infrastructure layer.
Versioning preserved. Specific model revisions are pinned and addressable, the same way they are on the Hub.
Gated-model permission management at the organization level. This is the killer feature. Models like Meta's Llama, Google's Gemma, and Mistral's models are gated — each user must individually accept terms before downloading. With Model Gateway, permission is requested once at the organization level, and the model becomes available to every employee through the local registry. No more 50 engineers individually accepting the Llama license. No more "I can't pull Gemma, can someone help me get permission?"
Fast downloads from local storage. Once cached, models serve from your infrastructure at LAN speeds — and because the storage is content-addressed, common chunks are already warm from the first download.
Audit logs attributing model downloads to specific employees. The compliance story your security team has been asking for.
Direct expert support. Enterprise Plus members get a private Slack channel staffed by Hugging Face Solutions Engineering.
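The storage arithmetic in the first bullet above is easy to reproduce. The sketch below is a back-of-the-envelope model, assuming every fine-tune is the same size as its base model and shares the stated fraction of its content with data already in the store. The fleet of 10 fine-tunes is a hypothetical example, not a measurement.

```python
def additional_storage_gb(model_size_gb: float, shared_fraction: float) -> float:
    """Extra storage a new model needs when `shared_fraction` is already stored."""
    if not 0.0 <= shared_fraction <= 1.0:
        raise ValueError("shared_fraction must be between 0 and 1")
    return model_size_gb * (1.0 - shared_fraction)


def fleet_storage_gb(base_size_gb: float, variant_count: int, shared_fraction: float):
    """Return (deduplicated, per-copy) totals for a base model plus its variants."""
    extra = additional_storage_gb(base_size_gb, shared_fraction)
    deduplicated = base_size_gb + variant_count * extra
    per_copy = base_size_gb * (1 + variant_count)
    return deduplicated, per_copy


if __name__ == "__main__":
    # The example from the text: a 70 GB variant sharing 95% of its weights.
    print(f"{additional_storage_gb(70, 0.95):.1f} GB")  # 3.5 GB
    # Hypothetical fleet: one 70 GB base model plus 10 such fine-tunes.
    dedup, naive = fleet_storage_gb(70, 10, 0.95)
    print(f"{dedup:.0f} GB vs {naive:.0f} GB")  # 105 GB vs 770 GB
```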

Artifactory's Hugging Face remote repository does some of these things imperfectly — caching (without dedup), partial governance via JFrog Curation, audit trail on the Artifactory side. But it cannot do organization-level gated-model permission propagation, because that's a property of the Hugging Face identity layer. It cannot deliver true Xet deduplication, because that requires being a Hugging Face–native CAS service rather than a thin proxy with Xet-shaped naming. And the audit story is only complete when the Hub-side actions are also logged — which requires an Enterprise Audit Logs subscription.

Network Security on Enterprise Plus also lets you precisely control which models, datasets, organizations, and applications on the public Hub are accessible to your employees — useful for enforcing model governance rules (e.g., "no models from organization X," "no datasets above license tier Y") at the network layer rather than relying solely on Artifactory Curation.

When to use what: JFrog Artifactory vs. Hugging Face Enterprise Plus

The honest framing for IT and security architects is this:

| Need | Right primitive |
| --- | --- |
| Cache Hugging Face models alongside Docker, npm, Maven, etc. in one universal artifact manager | JFrog Artifactory remote repo |
| Scan models for malicious content as part of broader software supply chain | JFrog Xray + Curation |
| Bundle models into immutable release artifacts with the rest of your software | JFrog Release Bundles |
| Survive Hugging Face Hub rate limits at enterprise scale | HF Enterprise / Enterprise Plus |
| Manage gated-model access (Llama, Gemma, Mistral) for hundreds of employees | HF Enterprise Plus + Model Gateway |
| SSO, SCIM, fine-grained tokens, audit logs on the Hub side | HF Enterprise / Enterprise Plus |
| EU data residency for HF-hosted artifacts | HF Enterprise Storage Regions |
| Internal model registry with true Xet content-addressed deduplication | HF Enterprise Plus + Model Gateway |
| Internal model registry that's native to the Hugging Face ecosystem | HF Enterprise Plus + Model Gateway |

These are complements, not substitutes. The most robust architecture for a regulated enterprise is JFrog Artifactory as the universal artifact perimeter, with a Hugging Face Enterprise Plus organization providing the identity and governance layer for the upstream traffic — and Model Gateway providing the internal Hugging Face–native registry that today's Artifactory remote repository imperfectly approximates.

Practical checklist for IT and security teams

If you're an IT, security, or ML platform owner reading this and thinking about your own setup, here's what to do this quarter:

Inventory every Hugging Face remote repository in your Artifactory deployment and confirm the upstream token's identity. If it's a personal free-tier account, that's your highest-priority fix.
Plan the legacy → Machine Learning repository layout migration before June 2026. For multi-site federated deployments, pencil in a coordinated cutover window now, well before the deadline.
Audit your rate-limit posture. Look at your Hugging Face billing page rate-limit gauges (visible to organization admins). If you regularly see red bars, you're already feeling the constraint.
Talk to Hugging Face about Enterprise or Enterprise Plus. If you're a Fortune 1000 company running serious AI workloads, you almost certainly want at least Enterprise. If you have gated-model usage, regulatory requirements, or both, Enterprise Plus and Model Gateway are the right fit.
Document your Hub-side identity strategy alongside your Artifactory operations runbook. The two are coupled in ways that aren't obvious until something breaks at 2 a.m.

JFrog Artifactory is a powerful piece of infrastructure, and for many enterprises it's the right place to land Hugging Face content. But the most resilient architecture — and the one that scales as your AI program grows — pairs it with a properly-tiered Hugging Face organization. Get the identity, rate limits, and governance right at both layers, and your developers will simply stop noticing the proxy.

FAQ

How do I configure huggingface_hub to use JFrog Artifactory as a proxy? Set HF_ENDPOINT to your Artifactory Hugging Face repository URL (https://<artifactory>/artifactory/api/huggingfaceml/<repo-name>) and HF_TOKEN to a valid Artifactory token. The standard huggingface_hub, transformers, datasets, and diffusers libraries will then resolve through Artifactory.

Why am I getting HTTP 429 errors when downloading Hugging Face models through Artifactory? Because the Hugging Face identity configured on your Artifactory remote repository has hit its rate limit. All your developers and CI jobs share that single identity at the Hub upstream. The fix is to issue the upstream token from a Hugging Face Enterprise or Enterprise Plus organization — Enterprise Plus provides the highest rate limits available and supports IP allowlisting for further uplift.

What happens to my legacy Hugging Face repositories in Artifactory after June 2026? The legacy Hugging Face repository layout is deprecated as of June 2026, and full functionality is no longer guaranteed for repositories that have not been migrated to the new Machine Learning repository layout. Migration is initiated from the Artifactory Administration UI ("Upgrade Repository Layout") and is one-way for practical purposes. Federated and replicated topologies require coordinated migration with replication paused.

What's the difference between the legacy Hugging Face package type and the new Machine Learning package type in Artifactory? The legacy package type mirrors Hugging Face Hub structure and works only with Hugging Face content. The new Machine Learning package type is format-agnostic — it stores Hugging Face content alongside PyTorch, ONNX, .pkl, and other ML formats — and is the only layout that supports Xet protocol, virtual repositories, and the FrogML SDK.

Does Xet protocol work with JFrog Artifactory? Technically yes, but with a critical caveat. Xet is supported only on the new Machine Learning repository layout, only on remote repositories, and not when download redirection is enabled. More importantly, Artifactory's Xet implementation is a thin per-file proxy that mimics the Xet API surface and on-disk naming but does not provide global deduplication — chunks are stored separately per file path and per requested byte range, leading to roughly 2× the storage you'd expect for a content-addressed system. For most on-prem Artifactory deployments, the practical recommendation is to disable Xet on the client (HF_HUB_DISABLE_XET=1) and use the standard LFS download path. True content-addressed deduplication requires Hugging Face Model Gateway.

My Artifactory storage is growing faster than I expected for Hugging Face models. Why? If you have Xet enabled on Artifactory, you're likely seeing the double-storage effect: each model file is cached as both a flat file and as a series of pseudo-xorb byte-range slices, neither of which deduplicates with the other or with chunks from other models. Setting HF_HUB_DISABLE_XET=1 on your clients forces the standard download path and roughly halves the cache footprint. For true deduplication across your model fleet, see Model Gateway.

Can I use a personal Hugging Face access token on my Artifactory remote repository? Technically yes; in practice, this is a common anti-pattern. Personal tokens have low rate limits, no audit trail, no fine-grained scope, and create a single point of failure tied to one employee. Use a token issued from a Hugging Face organization on Team, Enterprise, or Enterprise Plus instead.

What is Hugging Face Model Gateway, and how does it relate to Artifactory? Model Gateway is a Hugging Face Enterprise Plus feature (in preview) that provides a Hugging Face–native internal model registry. It caches models in your company's local storage, manages gated-model permissions at the organization level (one acceptance covers all employees), and provides audit logs attributing downloads to individual employees. It complements Artifactory rather than replacing it: Artifactory remains the universal artifact perimeter; Model Gateway provides a deeper Hugging Face–native experience for ML teams.

How do I get started with Hugging Face Enterprise or Enterprise Plus? Visit huggingface.co/enterprise or contact api-enterprise@huggingface.co. Hugging Face works async by email — there's no sales call required.

Learn more about Enterprise and Enterprise Plus at huggingface.co/enterprise.
