ReBAC Meets BSS: Why TMF 672 and OpenFGA Belong Together

Part 2 of 4: The gap nobody is designing for.

Second in a four-part series: ReBAC Meets BSS, A Practitioner’s Blueprint for AI-Native Role-Based Access in Telco

Last week (Part 1) I described three anti-patterns that I believe are taking shape quietly inside enterprise AI deployments: the over-permissioned agent, the under-permissioned agent, and the stale-permission agent. I argued that all three share a common structural cause: the gap between when a party’s role changes and when the authorization systems acting on behalf of that party reflect that change.

Before introducing the two technical primitives I believe belong at the foundation of a proper solution, I want to ground that abstract gap in something concrete, because the authorization gap is not a future problem waiting to emerge when AI agents mature. It is a present operational reality that AI agents will inherit and amplify.

The question worth sitting with is this: how many ways can the authorization state of a single party legitimately change in one day inside a Tier-1 telco operation?

The answer is more than most authorization architectures are designed to handle in real time.

The Many Faces of Authorization Change

Authorization state does not change through a single well-defined mechanism. It changes through a continuous stream of business events, each originating in a different system, each carrying different urgency, and each requiring a different set of downstream permission updates. Here are five of the most consequential.

The AI personal assistant whose mandate just ended

A consumer grants an AI personal assistant permission to manage their account: view bills, raise fault reports, modify add-on services. The consent is explicit, scoped, and recorded in a consent management system. The assistant begins operating on their behalf.

Three weeks later the customer withdraws consent. Perhaps they are uncomfortable with a recent interaction. Perhaps they simply changed their mind. The withdrawal is recorded.

How quickly does that withdrawal propagate to every system the assistant has been operating in? The billing system. The fault management platform. The product catalogue. The CRM. The network management layer. Each of these systems granted the assistant access when consent was established. Each needs to revoke that access when consent is withdrawn.

In most current implementations the answer is: not immediately. The withdrawal triggers a propagation process that touches each downstream system in sequence. The assistant may continue operating with full account permissions for hours or days after the customer believed they had withdrawn access. This is not a hypothetical future risk. Consent management APIs exist today. AI personal assistants are being deployed today. The propagation lag is a present architectural gap.

The credit limit that changed everything at midnight

A consumer hits their credit limit at 11:47pm. Their service is automatically restricted by the billing system. The restriction is a business event with immediate authorization implications: every system acting on behalf of this customer needs to operate within the restricted permission set from this moment forward.

The AI customer service agent handling their next interaction at 8am the following morning was initialized with the pre-restriction permission context. It offers the customer options they are no longer entitled to. Not because the AI reasoned incorrectly. Because the authorization context it was given reflects a financial state that no longer exists.
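
The difference between a permission context captured at initialization and one read at the moment of action can be shown in a few lines. This is a minimal sketch, not a description of any real agent framework: the entitlement names and both agent constructors are hypothetical, and the shared set stands in for whatever system holds the customer's current entitlements.

```python
from typing import Callable, FrozenSet, Set

# Live entitlement state for one customer (names are hypothetical).
entitlements: Set[str] = {"view_bill", "offer_upgrade"}


def snapshot_agent(initial: FrozenSet[str]) -> Callable[[str], bool]:
    """Agent that trusts the permission context captured at initialization."""
    return lambda action: action in initial


def live_agent(current: Callable[[], Set[str]]) -> Callable[[str], bool]:
    """Agent that re-reads the current entitlements before each action."""
    return lambda action: action in current()


stale = snapshot_agent(frozenset(entitlements))  # copy taken at session start
fresh = live_agent(lambda: entitlements)         # reads the live set each time

# 11:47pm: the billing system restricts the account.
entitlements.discard("offer_upgrade")

stale_answer = stale("offer_upgrade")  # still reflects the pre-restriction state
fresh_answer = fresh("offer_upgrade")  # reflects the restriction
```

The snapshot agent is not reasoning badly. It is answering correctly against a state that stopped being true at 11:47pm.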

The partner that was offboarded on Friday afternoon

A reseller partner relationship is terminated. The offboarding is processed in the partner management system on a Friday afternoon. Across the B2B2C landscape that partner has been operating in, they hold permissions across a billing platform, a service management portal, a product catalogue, a customer data system, and a network configuration tool.

Revoking those permissions requires coordinated action across five teams managing five systems. By Monday morning, three of the five have been updated. Two have not. The partner’s access to customer data persists through the weekend. An AI agent operating in the partner portal context continues to honour permissions that should no longer exist.

The customer who exercised their right to erasure

A customer submits a GDPR right to erasure request. The request is acknowledged and logged. The erasure process begins. It touches the CRM, the billing history, the fault management records, the marketing preferences system, and the analytics platform.

While the erasure process is running, an AI personalisation agent is still operating on that customer’s profile. It has not been told the profile is in the process of being erased. Its authorization context has not been updated to reflect the regulatory event that is actively changing the customer’s data status. It continues to make personalisation decisions based on data that is legally in the process of ceasing to exist.

The trouble ticket that never closed properly

A field engineer raises a trouble ticket to diagnose a connectivity fault at a customer premises. The ticket triggers a temporary permission elevation scoped to the customer’s network segment. Standard procedure. The fault is resolved. The ticket is closed. The permission revocation is a separate manual step in a separate system. In the volume of daily operations, the step is missed.

Three weeks later the engineer still holds access they no longer need. The authorization system reflects a reality that ceased to exist when the ticket closed. Nobody knows. Nothing alerts. The gap is invisible until an access review surfaces it, which may be months away.

These five scenarios share the same structural characteristic. A legitimate business event changes the authorization reality of a party. The authorization layer does not reflect that change in a timely or reliable way. The gap is filled, imperfectly, by manual processes that depend on human memory, correctly configured workflows, and downstream systems being updated in the right sequence.

For human-operated systems, this lag has always been a compliance risk managed through periodic access reviews. It is tolerable, just about, because human operators carry contextual awareness that partially compensates for the authorization layer’s staleness. A human support agent knows their temporary elevation is scoped to a specific incident. They apply judgment. They do not use elevated access for tasks outside that scope.

An AI agent operating with the same temporarily elevated permission carries no such contextual awareness unless it is explicitly encoded in the authorization layer. The agent does not know the permission is temporary. It does not know it is scoped to a specific incident. It will use whatever permissions it has been given for whatever tasks it is asked to perform, because that is precisely what it was designed to do.

This is not a failure of the AI model. It is an architectural gap. The permission context the agent receives does not carry the metadata that a human operator would naturally apply. Closing that gap requires the authorization layer to encode not just what is permitted, but the conditions and constraints under which it is permitted, and to keep that encoding current as the operational reality around it evolves.
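
To make "conditions and constraints" less abstract, here is a minimal sketch of a grant that carries its own scope and lifetime. It assumes the simplest possible shape: every field name, the ticket identifier, and the eight-hour cap are hypothetical, and a real authorization layer would evaluate these conditions itself rather than in application code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Set


@dataclass(frozen=True)
class ScopedGrant:
    """A permission plus the conditions under which it holds (hypothetical fields)."""
    subject: str          # who holds the grant, e.g. "engineer:anna"
    action: str           # what they may do, e.g. "diagnose"
    resource: str         # what it applies to, e.g. "segment:cust-42"
    ticket_id: str        # the incident that justifies the grant
    expires_at: datetime  # hard upper bound, even if the ticket stays open


def is_permitted(grant: ScopedGrant, subject: str, action: str, resource: str,
                 open_tickets: Set[str], now: datetime) -> bool:
    """Allow only while the scope matches, the ticket is open, and the grant is unexpired."""
    return (
        grant.subject == subject
        and grant.action == action
        and grant.resource == resource
        and grant.ticket_id in open_tickets
        and now < grant.expires_at
    )


issued = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)
grant = ScopedGrant("engineer:anna", "diagnose", "segment:cust-42",
                    ticket_id="TT-1001", expires_at=issued + timedelta(hours=8))

# While the ticket is open and the grant is fresh, the action is allowed.
during_incident = is_permitted(grant, "engineer:anna", "diagnose", "segment:cust-42",
                               {"TT-1001"}, issued + timedelta(hours=1))

# Once the ticket closes, the same grant stops authorizing anything.
after_close = is_permitted(grant, "engineer:anna", "diagnose", "segment:cust-42",
                           set(), issued + timedelta(hours=2))
```

The point of the design is that closing the ticket is itself the revocation. There is no separate manual step left to miss.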

A Brief and Entirely Plausible Horror Story

Consider, purely as a thought experiment, the following sequence of events.

A field engineer raises a trouble ticket to diagnose a connectivity fault at a customer premises. The ticket triggers a temporary permission elevation scoped to the customer’s network segment. Standard procedure. The ticket is resolved. The permission is not revoked. Also standard procedure, regrettably.

Three months later a firmware update is pushed to the customer’s estate. The update touches a smart building management controller. The controller, now operating with a service account that inherited a fragment of the unrevoked elevated permission through a poorly scoped role propagation, begins reporting environmental telemetry to a network management endpoint it has no business accessing.

The network management AI agent, doing exactly what it was designed to do, notices the telemetry, infers a device is requesting a configuration update, and helpfully provisions the controller with expanded network visibility.

The controller is now, functionally, a network administrator. It did not ask to be. Nobody authorized it to be. A trouble ticket from three months ago and a firmware update from last week conspired to promote it.

This is not science fiction. Every step in that sequence is a documented pattern in operational technology environments. The IoT device did not attack anything. It simply existed in an environment where the authorization layer had not kept pace with the operational reality around it. The AI agent did not malfunction. It acted entirely rationally on the permissions it was given.

The hair dryer that wins a weatherman’s bet does not need to understand meteorology. It just needs to be plugged in at the right moment.

A word on the choice of TMF 672. The telecommunications industry has spent decades curating a deceptively simple model for something genuinely complex: how a party acquires, holds, and loses roles and the permissions those roles carry. That model, expressed in TMF 672, is not just a telco artefact. Party, role, and permission are universal authorization concepts. TMF 672 gives them a clean industry-validated shape that makes the real problems in authorization easier to see, easier to name, and easier to design against. That is why it anchors this series.

TMF 672: The Authorization Event Stream Nobody Is Using as One

Most architects who work with TM Forum Open APIs treat TMF 672, the User Roles and Permissions API, as exactly what its name suggests: an API for managing the roles and permissions assigned to parties across a BSS landscape. Create a role. Update a role. Query a role. Delete a role. Standard operations on a standard resource.

I want to propose a different way of reading it.

Every TMF 672 role transition event is an authorization intent signal. When a consumer becomes a business account holder, that is not just a record update in a CRM. It is a statement that the downstream authorization state of every system acting on behalf of that party needs to change. When a reseller partner is downgraded, that is not just a tier change in a billing system. It is a cascading instruction to revoke, modify, and constrain permissions across every touchpoint that party interacts with. When an AI personal assistant’s consent is withdrawn, the consent event is an authorization revocation instruction directed at every system that assistant has been operating in.

TMF 672 is already in production in most Tier-1 telco operators. The authoritative source of party role truth exists. The event stream that signals when authorization state needs to change exists. What does not exist, in most implementations, is the intelligence to act on those signals in a timely, accurate, and auditable way.

Instead, what exists is the operational reality described above. Manual processes. Trouble tickets. Permission propagation that depends on human memory and correctly sequenced workflows across disconnected systems.

The gap is not a data gap. The data is there. It is a translation gap. Between the business event that TMF 672 records and the authorization state that downstream systems, and increasingly AI agents, act upon, there is a translation process that is currently slow, manual, and error-prone.

That translation process is what this series is ultimately about. But before introducing the translation layer, I want to introduce the authorization primitive that sits on the receiving end of it.

OpenFGA: Authorization as a Relationship Graph

Traditional Role-Based Access Control works well for straightforward permission models. A user has a role. The role has permissions. The mapping is static and relatively flat.

Telco authorization is not straightforward. It is not flat. And as the scenarios in this article illustrate, it is not static.

Consider what it means to express, in an authorization system, the following real-world relationship: a reseller partner is permitted to manage service requests on behalf of enterprise customers within their portfolio, but only for products within their contracted tier, only in markets where they hold an active agreement, and only for customers who have explicitly consented to partner-managed support.

A traditional RBAC model struggles with this. The relationship between the reseller, the enterprise customer, the product, the market, and the consent status is not a flat role assignment. It is a graph of relationships, each carrying its own conditions and constraints, each of which can change independently.

OpenFGA, an open-source authorization system inspired by Google’s Zanzibar, models exactly this kind of relationship complexity natively.

The core concept to understand is the relationship tuple. Think of a tuple as the simplest possible statement of a relationship between two things: “this entity has this relationship to that entity.” For example: reseller X manages customer Y. Customer Y owns product Z. Product Z is available in market M. Permissions are not assigned directly in OpenFGA. They are derived by traversing the chain of these relationship statements. A permission check asks: given everything I know about how these entities relate to each other, is this action permitted? OpenFGA traverses the graph and returns a precise, real-time answer. If any relationship in the chain changes, the answer to the permission check changes with it.
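
The idea of deriving a permission from a chain of tuples can be sketched in plain Python. This is an illustration of the concept only, not OpenFGA's API: in OpenFGA the derivation rule would be declared in an authorization model rather than hand-coded, and the relation names "manager" and "owner" and the can_service function are assumptions made for this example.

```python
from typing import NamedTuple, Set


class RelTuple(NamedTuple):
    """One relationship statement: <user> has <relation> to <object>."""
    user: str
    relation: str
    object: str


def can_service(tuples: Set[RelTuple], reseller: str, product: str) -> bool:
    """Derived permission: the reseller manages a customer who owns the product."""
    owners = {t.user for t in tuples
              if t.relation == "owner" and t.object == product}
    return any(RelTuple(reseller, "manager", owner) in tuples for owner in owners)


graph = {
    RelTuple("reseller:x", "manager", "customer:y"),
    RelTuple("customer:y", "owner", "product:z"),
}

allowed_before = can_service(graph, "reseller:x", "product:z")

# Remove one link in the chain and the derived answer changes with it.
graph.discard(RelTuple("reseller:x", "manager", "customer:y"))
allowed_after = can_service(graph, "reseller:x", "product:z")
```

Nothing was ever granted to reseller X on product Z directly. The permission existed only as long as the chain of relationships did.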

This matters for three specific reasons in the context of this series.

First, OpenFGA’s relationship model can express the multi-layered authorization complexity of B2C, B2B, and B2B2C telco contexts with a precision that RBAC cannot match. The reseller relationship, the consumer relationship, the consent relationship, and the conditions that govern all three can be encoded in the same graph.

Second, OpenFGA’s tuple store is writable via API. This means an external system, or an AI reasoning layer, can propose and commit authorization changes programmatically. The authorization graph is not a static configuration. It is a dynamic, updateable representation of the current authorization state.

Third, OpenFGA provides real-time relationship checking at low latency. An AI agent can query OpenFGA before taking any action and receive a precise, current answer about what it is permitted to do. Not what it was permitted to do when its session was initialized. What it is permitted to do right now.

These three characteristics (expressive relationship modelling, programmatic writability, and real-time query capability) make OpenFGA the right authorization primitive for the problem this series is addressing.
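
Programmatic writability is the easiest of the three to picture. The sketch below only builds a request body and sends nothing. It assumes the JSON shape of OpenFGA's HTTP Write endpoint as I understand it (a "writes" and a "deletes" section, each holding "tuple_keys"), which should be checked against the API reference for the version in use. The object types, the "delegate" relation, and the build_write_body helper are hypothetical.

```python
import json
from typing import Dict, List, Optional, Tuple

TupleKey = Tuple[str, str, str]  # (user, relation, object)


def _keys(tuples: List[TupleKey]) -> List[Dict[str, str]]:
    """Convert (user, relation, object) triples into tuple-key dictionaries."""
    return [{"user": u, "relation": r, "object": o} for u, r, o in tuples]


def build_write_body(writes: List[TupleKey], deletes: List[TupleKey],
                     model_id: Optional[str] = None) -> Dict[str, object]:
    """Build a write request body, omitting any section that would be empty."""
    body: Dict[str, object] = {}
    if writes:
        body["writes"] = {"tuple_keys": _keys(writes)}
    if deletes:
        body["deletes"] = {"tuple_keys": _keys(deletes)}
    if model_id:
        body["authorization_model_id"] = model_id
    return body


# Consent withdrawn: remove the assistant's delegation on the customer's account.
body = build_write_body(
    writes=[],
    deletes=[("assistant:a1", "delegate", "account:c42")],
)
payload = json.dumps(body)  # what an HTTP client would send
```

A revocation is then a single delete against one store, rather than a sequence of updates across five systems.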

The Natural Fit, and the Gap That Remains

TMF 672 and OpenFGA belong together because they are addressing the same problem from opposite ends.

TMF 672 knows when authorization state needs to change. It holds the authoritative record of party role transitions across the BSS landscape. Every time a consumer upgrades, a reseller is onboarded, a consent is withdrawn, a credit limit is breached, or a partner agreement expires, TMF 672 records that event.

OpenFGA knows how to represent authorization state with the precision and dynamism that complex telco relationships require. It can model the multi-layered permission structures of B2C, B2B, and B2B2C contexts. It can be updated programmatically. It can answer real-time authorization queries.

What sits between them is the translation problem. A TMF 672 role transition event needs to become a set of OpenFGA tuple mutations. The party that was a consumer needs to have their consumer relationship tuples revoked and their business account holder relationship tuples created. The reseller that was downgraded needs to have their premium product access tuples modified to reflect their new tier. The AI personal assistant whose consent was withdrawn needs to have every tuple that granted it access to the customer’s account systematically revoked.
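
The most trivial slice of that translation, a single role swap on a single account, can be sketched as follows. This is deliberately naive: the RoleTransition fields are hypothetical and are not the TMF 672 payload schema, and the assumption that a role name maps one-to-one onto a relation name is exactly the simplification that stops holding in a real estate.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

TupleKey = Tuple[str, str, str]  # (user, relation, object)


@dataclass(frozen=True)
class RoleTransition:
    """A role change for one party on one account (hypothetical event shape)."""
    party: str
    old_role: str
    new_role: str
    account: str


@dataclass
class TupleMutations:
    """The tuple deletes and writes implied by one event."""
    deletes: List[TupleKey] = field(default_factory=list)
    writes: List[TupleKey] = field(default_factory=list)


def translate(event: RoleTransition) -> TupleMutations:
    """Naive one-to-one mapping: the role name doubles as the relation name."""
    mutations = TupleMutations()
    if event.old_role == event.new_role:
        return mutations  # nothing changed, nothing to mutate
    mutations.deletes.append((event.party, event.old_role, event.account))
    mutations.writes.append((event.party, event.new_role, event.account))
    return mutations


event = RoleTransition("party:p7", "consumer", "business_account_holder", "account:a9")
result = translate(event)
```

Everything that makes the real problem hard (market, product scope, regulatory framework, transition history) is absent from this function, which is the point of showing it.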

This translation is not a simple data mapping exercise. The relationship between a TMF 672 role transition and the resulting OpenFGA tuple mutations is context-dependent. It varies by market, by product scope, by regulatory framework, and by the specific history of the party’s role transitions. It requires understanding not just what changed, but what that change means for every downstream relationship in the authorization graph. It is not the kind of translation that a deterministic rules engine handles reliably at the scale and complexity of a Tier-1 telco operation.

Here is where the gap currently sits:

[Figure: Authorization Translation, The Broken Arrow]

The broken arrow in that diagram is not a design choice. It is an accurate representation of how most enterprise authorization architectures currently handle the translation between business role events and authorization state. Manually. Slowly. With the operational risks this article has described.

Next week I will show what belongs in that gap, and why it needs to reason rather than simply route.

A Question for Practitioners

Before next week’s article I want to ask a direct question to anyone working on authorization architecture in enterprise or telco environments.

Which of the five authorization change triggers described here is causing the most operational friction in your current environment? And if you have designed or evaluated solutions to any of these patterns, I would genuinely like to understand what you found.

The comment section is open. So is my inbox.

Soumit Saha is a Digital Platform and Technical Architect with 25 years of experience in telco, cloud, and enterprise integration. He has led the adoption of TM Forum Open APIs across multiple markets and holds TOGAF 9, ODA Practitioner, and AWS Cloud Architect certifications. This series represents his personal architectural thinking and does not reflect the views or systems of any employer.

Next: Article 3, Why the Translation Layer Needs to Reason, Not Just Route: Claude, AWS Bedrock and the Security Boundary Question

Originally published at https://www.linkedin.com.

ReBAC Meets BSS: Why TMF 672 and OpenFGA Belong Together was originally published in System Weakness on Medium.