
Xference products

Private AI Inference Platform: Powering Intelligent Solutions with Security and Performance




Xference's advanced inference engine transcends traditional sector boundaries, delivering precision-optimized AI solutions tailored to the unique data landscapes of each industry.

Become one of our early adopters by selecting the right plan for your needs and pre-registering for early access to our platform. You will then join a waiting list; once approved, you will receive an email with a login link. Happy Xferencing!

BETA

Standard

VIRTUAL PRIVATE INFRASTRUCTURE

Ideal for those who want to start using generative AI while keeping their privacy protected. Inference runs securely on your own server, and all your data remains stored within your private server.

Included in this plan:

  • 2 MB max file size
  • 100 MB file upload
  • 1M tokens / month
  • Unlimited prompts
  • 1 user
COMING SOON
BETA

Pro

VIRTUAL PRIVATE INFRASTRUCTURE

Designed for users who already rely on generative AI platforms such as GPT and now demand full data protection, whether for work or personal use.

Included in this plan:

  • 10 MB max file size
  • 1 GB file upload
  • 3M tokens / month
  • Unlimited prompts
  • 1 user
  • Plugs in to your mail, messaging, storage and database servers
COMING SOON
BETA

Team

VIRTUAL PRIVATE INFRASTRUCTURE

Built for professionals and enterprises that run large-scale inference workloads and demand uncompromising data protection.

Included in this plan:

  • 20 MB max file size
  • 3 GB file upload
  • 10M tokens / month
  • Unlimited prompts
  • Up to 3 users included, then pay per user
  • Plugs in to your mail, messaging, storage and database servers
COMING SOON

Enterprise

HOUSING OR ON-PREMISES

For large organizations that need full data sovereignty: choose between hosting your server in your own data center (on-premises) or deploying a dedicated physical server in Xference's data center (housing), ensuring your data stays fully isolated and under your control.

Included in this plan:

  • Unlimited file upload
  • Unlimited tokens / month
  • Unlimited prompts
  • Unlimited users
  • Unlimited API connections
BETA

API

DEDICATED SERVER SOLUTIONS

For developers and teams who rely on AI APIs but demand absolute data privacy, Xference's Custom API delivers the power of generative AI without the risks. Built for experts, our API enables seamless integration with your existing workflows, while ensuring all prompts, responses, and data are processed entirely within your private infrastructure. No data leaves your environment. No third-party exposure. Just high-performance, secure inference, tailored to your exact needs.

Included in this plan:

  • API-first design: Easy integration with existing systems
  • No model selection needed: Xference optimizes the best LLM for your use case
  • Enterprise-grade security: End-to-end encryption and full control
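As an illustration of the API-first design described above, a single inference call might be assembled as follows. Xference has not published its API schema, so the endpoint path, header names, and payload fields below are assumptions chosen to mirror common inference APIs; only the absence of a model field reflects the plan description (Xference selects the LLM server-side).

```python
import json

# Assumed endpoint on your private infrastructure (hypothetical URL).
API_URL = "https://your-private-server.example/v1/inference"


def build_inference_request(prompt: str, api_key: str, max_tokens: int = 256):
    """Assemble the URL, headers, and JSON body for one inference call.

    Note there is deliberately no 'model' field: per the plan description,
    Xference optimizes the best LLM for your use case server-side.
    """
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens})
    return API_URL, headers, body


url, headers, body = build_inference_request("Summarize this contract.", "sk-demo")
```

The request would then be sent with any HTTP client to your private server, so prompts and responses never leave your environment.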

Not sure which plan to choose? The comparison chart below can help you decide.

Feature          | Standard | Pro   | Team                                | Enterprise
Max file size    | 2 MB     | 10 MB | 20 MB                               | Unlimited
Max file upload  | 100 MB   | 1 GB  | 3 GB                                | Unlimited
Tokens / month   | 1M       | 10M   | 15M                                 | Unlimited
Prompt limit     | Unlimited | Unlimited | Unlimited                      | Unlimited
# of users       | 1        | 1     | Up to 3 included, then pay per user | Unlimited
API connections  | None     | Mail, messaging and database server | Mail, messaging and database server | Unlimited
Customer support | None     | Email | Email                               | Custom
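For a rough sense of where your workload lands, the chart's monthly token quotas can be checked programmatically. This is an illustrative sketch, not an official sizing tool: the quota figures are taken from the comparison chart above, and the prompt-size estimate in the example is invented.

```python
# Monthly token quotas per plan, from the comparison chart.
# float("inf") stands in for "Unlimited".
PLAN_QUOTAS = {
    "Standard": 1_000_000,
    "Pro": 10_000_000,
    "Team": 15_000_000,
    "Enterprise": float("inf"),
}


def pick_plan(tokens_per_month: int) -> str:
    """Return the first listed plan whose monthly quota covers the estimate."""
    for plan, quota in PLAN_QUOTAS.items():
        if tokens_per_month <= quota:
            return plan
    return "Enterprise"


# Example: ~2,000 prompts per month averaging 900 tokens each
# (prompt plus response) comes to 1.8M tokens.
estimate = 2_000 * 900
print(pick_plan(estimate))  # prints "Pro"
```

Other limits (file size, users, API connections) matter too, so treat the token quota as only one input to the decision.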

Other common questions

What are tokens?

Tokens are the small pieces that text is split into when it is processed by an AI model. Think of them like words or pieces of words. For example, the sentence "Hello, how are you?" might be split into 5 tokens: "Hello", ",", "how", "are", "you?".

AI systems count tokens to measure how much text you send and how much they process. The more tokens, the more computing power and time a request takes.

In Xference, all token usage is tracked to your account, and your data stays private: it is never shared or stored outside your server.
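The idea of counting tokens can be sketched as follows. This is an illustrative word-and-punctuation splitter, not Xference's actual tokenizer; real LLM tokenizers use learned subword vocabularies (e.g. byte-pair encoding), so their counts will differ from this rule's.

```python
import re

def rough_token_count(text: str) -> int:
    """Count word and punctuation chunks as stand-in 'tokens'.

    Illustrative only: a subword tokenizer would split text differently
    (for instance, long or rare words become several tokens).
    """
    return len(re.findall(r"\w+|[^\w\s]", text))


print(rough_token_count("Hello, how are you?"))  # prints 6 with this rule
```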

Which models does Xference use?

Xference supports multiple open-source LLMs. The best models are tested and selected based on performance, speed, accuracy, and resource efficiency. You don't need to choose: Xference selects the optimal model for your use case.

Do I need to configure the hardware myself?

No. Xference's infrastructure is already optimized with the best GPUs for inference. You don't need to worry about hardware configuration; everything is handled for you.

Can Xference be installed on my own servers?

Yes. Xference can currently be installed on dedicated servers chosen together with the client, either in an on-premises setup or in a hosting facility.

Can I share my account with others?

No. All chats and token usage are linked to your account. Sharing access is not supported, to ensure data privacy and security.

Why does the Enterprise plan have no usage limits?

Because the only limit is the hardware configuration, which is defined together with the client to meet all their requirements.

Still not sure? Don't hesitate to reach out and talk to a human.

GET IN TOUCH