Kafka Integration Guide
Xtremepush supports Kafka as a high-throughput data ingestion channel, enabling event streaming and profile synchronisation at scale. This guide covers everything your technical team needs to establish a successful integration.
Overview
Kafka integration allows your backend systems to stream event data and synchronise user profiles with Xtremepush at high volume and low latency. Rather than making individual API calls per event or user update, your infrastructure publishes messages to Kafka topics that Xtremepush consumes continuously.
This guide covers the two integration approaches available, the required topic formats and payload schemas, and guidance on handling high-volume data scenarios.
Before you begin
Before starting your Kafka integration, make sure you have the following in place:
- Kafka credentials — each project is provisioned with a unique username, password, and topic names. Contact Xtremepush support if you have not yet received these.
- Egress IP addresses — you will need to provide the IP addresses your infrastructure will send events from so Xtremepush can allowlist them.
- SCRAM-SHA-512 support — all connections use SCRAM-SHA-512 authentication. Confirm your infrastructure supports this before starting.
- Volume estimates — approximate event volumes for both test and production environments, so Xtremepush can scale workers appropriately.
Architecture & key concepts
Before starting your integration, you'll need to understand how Xtremepush interacts with your Kafka infrastructure.
Topics
Data is organised into topics. Each project has two corresponding topics: events and users. Unique credentials are provisioned per project.
Partitions & scaling
Topics can be separated into partitions, allowing Xtremepush to scale consumers horizontally to handle increased throughput. Messages with the same key are routed to the same partition, ensuring ordering is maintained per user.
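The same-key, same-partition property can be illustrated with a small sketch. Note that Kafka's default partitioner uses a murmur2 hash of the key; the CRC32 hash below is an illustrative stand-in used only to show that any deterministic hash routes a given key to a stable partition.

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    # Illustrative only: Kafka's default partitioner uses murmur2, not CRC32.
    # Any deterministic hash demonstrates the same-key -> same-partition property.
    return zlib.crc32(key) % num_partitions

# Every message keyed by this user lands on the same partition,
# so per-user ordering is preserved as consumers scale out.
p1 = partition_for(b"user-123", 6)
p2 = partition_for(b"user-123", 6)
```

Because all of a user's messages share a partition, Xtremepush can add consumers for throughput without risking out-of-order profile updates for any single user.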
Offsets
Each consumer maintains its own record of progress (an offset). If a connection is interrupted, Xtremepush resumes from the last acknowledged offset to ensure no data loss.
Integration approaches
There are two primary routes for establishing a Kafka connection. The chosen route determines the level of configuration required from your technical team.
Route 1: Client push to Xtremepush standard format
This is the most efficient route and requires configuration on the Xtremepush side only. Your infrastructure produces (sends) data directly to the topics hosted on our brokers in our required format.
Each project is provisioned with two topics (events and users) and a unique username and password, provided by Xtremepush support. Note that profile attributes can also be set via an event payload using the user_attributes object, in the same way as the Xtremepush API.
Authentication
Connections are established using SCRAM-SHA-512. Credentials are unique per project and provided by Xtremepush support.
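For teams using the confluent-kafka Python client, the connection settings typically look like the sketch below. The broker address and credential values are placeholders; use the values provided by Xtremepush support.

```python
# Placeholder values throughout: substitute the broker address, username,
# and password provisioned for your project by Xtremepush support.
producer_conf = {
    "bootstrap.servers": "broker.example.com:9093",  # placeholder address
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "SCRAM-SHA-512",
    "sasl.username": "your-project-username",  # provided per project
    "sasl.password": "your-project-password",  # provided per project
}

# With the confluent-kafka package installed, a producer is built from
# this config (commented out here since it requires a live broker):
# from confluent_kafka import Producer
# producer = Producer(producer_conf)
```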
What we need from you
To complete the setup, we require the following from your technical team:
- The egress IP addresses your infrastructure will be sending events from.
- Confirmation that you are able to connect via SCRAM-SHA-512 authentication.
- Expected event volumes — this determines how we scale internal workers. Please provide estimates for both test and production environments.
- Details of any bulk loading requirements — sending large volumes of messages in a short period can cause processing delays. If bulk loading is part of your use case, we need to discuss this upfront. See the note in the Users topic section for more detail.
- Where you operate multiple projects (e.g., different brands or regions), confirmation that you can produce each project's data to a separate set of topics on our side.
Route 2: Partner ecosystem (standard integrations)
If you use one of our core platform partners, the Kafka integration is largely pre-configured.
| Partner | Configuration method |
|---|---|
| GiG | Initial configuration via the Xtremepush Marketplace; additional topics managed by Technical Services |
| OpenBet | Initial configuration via the Xtremepush Marketplace; additional topics managed by Technical Services |
| Bede | Managed via our middleware — provide the necessary partner credentials to begin ingestion |
| EveryMatrix | Managed via our middleware — provide the necessary partner credentials to begin ingestion |
Topic formats & payload requirements
All payloads must be valid JSON. Each project has two topics, each serving a distinct purpose.
Events topic
Use this topic to record events and actions a user performs (e.g. bets, deposits, logins).
- Topic name: Defined as part of the setup
- Message key: Not required, but can be set to a unique event UUID. Xtremepush will deduplicate events within a configurable time window based on this key. To enable deduplication for your project, contact Xtremepush support.
- Message headers: None
- Message body: JSON-encoded item
| Property | Type | Required | Description |
|---|---|---|---|
| event | String | Yes | The name of the event, e.g. bet, deposit. |
| user_id | String | At least one user identifier required | Unique user ID. A new user will be created automatically if none exists. |
| customer_id | String | At least one user identifier required | Additional user identifier if available. |
| profile_id | String | At least one user identifier required | Xtremepush profile identifier. |
| device_id | String | At least one user identifier required | Xtremepush device identifier. |
| user_attributes | Object | No | Additional user information. Attributes are added to the user profile automatically during event processing. |
| value | Object | No | Event properties specific to the event type. Can contain any number of key-value pairs, including nested arrays or objects. |
| timestamp | String | Yes | The original timestamp of the event. |
Schema

```json
{
  "event": "some_event",
  "user_id": "some_user",
  "user_attributes": {
    "any_attr": "any_value"
  },
  "value": {
    "any_key": "any_value"
  },
  "timestamp": "2025-01-01 12:00:00"
}
```

Example

```json
{
  "event": "bet",
  "user_id": "3e685367-07d5-4d48-93ae-f007ac336605",
  "customer_id": "12345",
  "user_attributes": {
    "customer_tier": "VIP"
  },
  "value": {
    "bet_id": "3e685367-07d5-4d48-93ae-f007ac336605",
    "odds": 12.2,
    "stake": 100.0
  },
  "timestamp": "2024-09-01 12:00:00.123"
}
```

Users topic
Use this topic to create new users or update existing user profiles.
Important — sequencing
Profile messages are processed sequentially and large imports may take significant time to complete. If events are sent on the events topic for users whose profiles have not yet been processed, those events will be dropped. Ensure your profile import is complete before sending events for those users, and avoid bulk loading large volumes of profile messages in a short period where possible.
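One way to respect this sequencing is to publish all profile messages, wait for broker delivery, and only then start producing events. The sketch below uses `send` and `flush` as stand-ins for your Kafka producer's produce/flush calls; the function name and signature are illustrative, not an Xtremepush API.

```python
import json

def import_profiles_then_events(users, events, send, flush):
    """Publish all profile messages, wait for delivery, then publish events.

    `send` and `flush` are hypothetical stand-ins for your producer's
    produce and flush operations.
    """
    for u in users:
        # Key by user_id so all updates for one user share a partition.
        send("users", key=u["user_id"], value=json.dumps(u))
    # Block until every profile message reaches the broker before any event
    # is produced. Note that broker delivery is not the same as Xtremepush
    # finishing the profile import, which runs sequentially and may lag;
    # coordinate large bulk imports with Xtremepush support.
    flush()
    for e in events:
        send("events", key=None, value=json.dumps(e))
```

For a large one-off import, waiting for confirmation from Xtremepush that the profile backlog has drained is safer than relying on `flush` alone.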
- Topic name: Defined as part of the setup
- Message key: Recommended. Should be the user identifier (e.g. user_id). Used to maintain ordering within the Kafka stream — all messages with the same key are routed to the same partition, ensuring profile updates are applied in the correct sequence.
- Message headers: None
- Message body: JSON-encoded item
| Property | Type | Required | Description |
|---|---|---|---|
| user_id | String | Yes | Unique user ID. A new user will be created automatically if none exists. |
| user_attributes | Object | Yes | Information about the user. Attributes are added to the user profile automatically during processing. |
| customer_id | String | No | Additional user identifier if available. |
| timestamp | String | No | The timestamp of the attribute change. Used to ensure the most recent value is saved. Defaults to the Kafka message timestamp if absent. |
Schema

```json
{
  "user_id": "some_user",
  "user_attributes": {
    "any_attr": "any_value"
  },
  "timestamp": "2025-01-01 12:00:00"
}
```

Example

```json
{
  "user_id": "3e685367-07d5-4d48-93ae-f007ac336605",
  "customer_id": "12345",
  "user_attributes": {
    "customer_tier": "VIP",
    "email": "[email protected]"
  },
  "timestamp": "2024-09-01 12:00:00.123"
}
```

High-volume domains — aggregation strategy
Where raw event volume is extremely high — such as spin-level gaming or rapid wallet deltas — we recommend aggregating data upstream before publishing to Kafka. This keeps throughput predictable, preserves business value, and helps manage event overage costs.
Session-based aggregates
Capped by time (e.g. ≤10 minutes) or count (e.g. ≤N spins). Each aggregate should include totals for stakes, wins, net, counts, game code, channel, and start/end timestamps.
Rolling window aggregates
Typically 2–5 minute summaries grouped by user or product.
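A rolling-window aggregation can be sketched as below. This is a simplified illustration that totals only spins, stakes, and wins per user per 5-minute bucket; a production aggregate would also carry net, game code, channel, and window start/end timestamps as described above.

```python
from collections import defaultdict
from datetime import datetime

def window_key(ts: str):
    # Bucket a "YYYY-MM-DD HH:MM:SS" timestamp into a 5-minute window
    # within the day. Window size is illustrative.
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
    return (dt.date().isoformat(), dt.hour, dt.minute // 5)

def aggregate_spins(spins):
    # Group spin-level records by (user, window) and total the counts,
    # stakes, and wins, so one summary event is published per bucket
    # instead of one event per spin.
    buckets = defaultdict(lambda: {"spins": 0, "stake": 0.0, "win": 0.0})
    for s in spins:
        b = buckets[(s["user_id"], window_key(s["timestamp"]))]
        b["spins"] += 1
        b["stake"] += s["stake"]
        b["win"] += s["win"]
    return buckets
```

Each bucket would then be published to the events topic as a single aggregate event at window close.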
Frequently asked questions
What happens if our connection to Kafka drops mid-stream?
Xtremepush consumers maintain an offset for each topic. If a connection is interrupted, processing resumes from the last acknowledged offset — no data is lost.
Can we use the same credentials across multiple projects?
No. Credentials are unique per project. If you operate multiple projects (e.g. different brands or regions), each project requires its own set of topics and credentials.
What timestamp format should we use?
Timestamps should follow the format YYYY-MM-DD HH:MM:SS or YYYY-MM-DD HH:MM:SS.mmm for millisecond precision, as shown in the schema examples above.
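In Python, both layouts can be produced from a `datetime` as follows (the helper name is illustrative):

```python
from datetime import datetime

def xp_timestamp(dt: datetime, millis: bool = False) -> str:
    # Format a datetime as "YYYY-MM-DD HH:MM:SS", optionally with
    # millisecond precision ("YYYY-MM-DD HH:MM:SS.mmm").
    base = dt.strftime("%Y-%m-%d %H:%M:%S")
    if millis:
        base += ".%03d" % (dt.microsecond // 1000)
    return base

ts = xp_timestamp(datetime(2025, 1, 1, 12, 0, 0))
ts_ms = xp_timestamp(datetime(2024, 9, 1, 12, 0, 0, 123000), millis=True)
```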
What happens if we send an event for a user who doesn't exist yet?
A new user profile will be created automatically when a user_id is encountered for the first time on the events topic. If you need profile attributes set before events are processed, send the user record via the users topic first.
We have very high event volumes (e.g. spin-level data). How should we handle this?
We recommend aggregating upstream before publishing to Kafka. See the High-volume domains — aggregation strategy section above, and discuss your specific use case with your Xtremepush account team before going live.