Can Alice edit this blog post? Can Bob comment on that document? Can product executives modify the draft quarterly report? How do applications answer these questions about who can do what?

Essentially, the question is: can user U take action A on resource R? For example, can Alice (user) edit (action) this blog post (resource)?

You could imagine keeping a grants table in your app with a row for every granted permission. Each grant is a tuple of (user, action, resource), so (alice, edit, this blog post) grants Alice permission to edit this post.
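Here's a rough sketch of that grants table, in Python rather than SQL so the lookup is easy to see. The names Grant, GRANTS, and is_allowed, and the resource identifiers, are made up for illustration:

```python
from typing import NamedTuple


class Grant(NamedTuple):
    user: str
    action: str
    resource: str


# One entry per granted permission, like rows in a grants table.
# The identifiers here are hypothetical.
GRANTS = {
    Grant("alice", "edit", "blog-post-1"),
    Grant("bob", "comment", "document-7"),
}


def is_allowed(user: str, action: str, resource: str) -> bool:
    """Answer "can user U take action A on resource R?" by exact lookup."""
    return Grant(user, action, resource) in GRANTS
```

With this shape, every permission is explicit, which is simple but grows quickly: the number of rows is bounded only by users × actions × resources.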

When there are lots of users, actions and resources, you might want to group them. Groups of actions might be called roles. Users might be grouped into teams, orgs, or divisions. Resources might be grouped into workspaces or accounts. With groups, you can grant access in bulk: the marketing team can edit all documents in workspace 1.

You might even want groups of groups: the DX team and the SRE team are both Infra teams, and Infra teams have admin access to account 123. Some teams and users have multiple parents: "the principal engineers are on both the Infra team and the DX team" or "Pete is on the Sales team but also reports directly to the CEO".
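To make "groups of groups" with multiple parents concrete, here's a small sketch that treats membership as a graph and collects every group a member belongs to, directly or indirectly. The PARENTS mapping, the group names (including "ceo-directs"), and the all_groups function are assumptions for illustration only:

```python
# Hypothetical membership edges: child -> set of direct parents.
PARENTS = {
    "dx": {"infra"},
    "sre": {"infra"},
    "principal-engineers": {"infra", "dx"},
    "pete": {"sales", "ceo-directs"},
}


def all_groups(member: str, parents: dict[str, set[str]] = PARENTS) -> set[str]:
    """Return every group reachable from `member` by following parent edges."""
    seen: set[str] = set()
    stack = [member]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, set()):
            if parent not in seen:  # a node can be reached via several parents
                seen.add(parent)
                stack.append(parent)
    return seen


print(all_groups("principal-engineers"))  # infra and dx, in some order
```

Because a node can have several parents, the walk has to track what it has already seen, otherwise shared ancestors like "infra" would be visited more than once.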

What if you want dynamic groups? Can users define their own teams, roles, or workspaces? Grouping users is common – any medium-sized organization has some internal data system for organizing people into groups (LDAP, Active Directory, Okta, etc). It's nice when those can be synced into applications.

Some applications support flexible, nested groups of resources (e.g. Google Drive), but most applications get by with shallow, predefined, coarse groups like "workspace" or "organization". And that can work just fine: GitHub, for example, does a lot with "organization" and "repository".

User-defined roles seem less common – most applications get by with predefined, coarse groups of actions. That said, GitHub (to use it as an example again) has slowly introduced more flexible, fine-grained control over time. In my experience, inflexibility in roles is a common pain point for both users and product designers.

While pondering how to organize identity and access in atlas9, I've been thinking about whether it can (or should) provide support for flexible, fine-grained, dynamic groups of all three: users, actions, and resources. I have sometimes heard users and product designers wish they had more fine-grained control:

  • "I wish I could grant this token only these very specific actions, instead of a general editor role"
  • "I wish I could grant contractors access only to edit alert configurations"
  • "These resources are in account 123 and I can't move them, but I want to grant team X access to only a subset"

So, how do I model nested groups of users, actions, and resources? How do I execute access checks against all that information? How do I organize all that data in Postgres? Is it efficient enough?

ltree

Postgres ships with a cool extension called ltree, a data type for hierarchical, tree-like data. There are a lot of great blog posts and websites that cover hierarchical data, ltree, and the alternatives in more detail than I can here.

ltree provides an ltree data type which efficiently stores a path in a tree-like structure – if Sally is on the Marketing team in the GTM org, you could store that path as gtm.marketing.sally.

CREATE EXTENSION ltree;

CREATE TABLE teams (path ltree, name text);
INSERT INTO teams (path, name) VALUES ('gtm.marketing.sally'::ltree, 'sally');
-- sally is also on the product design team
INSERT INTO teams (path, name) VALUES ('product.design.sally'::ltree, 'sally');
INSERT INTO teams (path, name) VALUES ('product.design.sam'::ltree, 'sam');
INSERT INTO teams (path, name) VALUES ('gtm.marketing.tom'::ltree, 'tom');
INSERT INTO teams (path, name) VALUES ('eng.infra.bob'::ltree, 'bob');
INSERT INTO teams (path, name) VALUES ('eng.infra.rob'::ltree, 'rob');

To get everyone in the GTM org:

> select * from teams where path <@ 'gtm';
        path         | name
---------------------+-------
 gtm.marketing.sally | sally
 gtm.marketing.tom   | tom

And if you wanted to get all the teams Sally is on, you could query for name = 'sally' and then split the paths and determine that Sally is in gtm, gtm.marketing, product, product.design. Imagine that each of those levels could have different access to different resources.
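The "split the paths" step might look something like this. It's a sketch in Python, assuming the paths have already been fetched as strings; ancestor_groups is a made-up helper name:

```python
def ancestor_groups(paths: list[str]) -> list[str]:
    """Expand member paths like 'gtm.marketing.sally' into every enclosing group."""
    groups: set[str] = set()
    for path in paths:
        labels = path.split(".")[:-1]  # drop the trailing member label
        for depth in range(1, len(labels) + 1):
            groups.add(".".join(labels[:depth]))
    return sorted(groups)


sally_paths = ["gtm.marketing.sally", "product.design.sally"]
print(ancestor_groups(sally_paths))
# ['gtm', 'gtm.marketing', 'product', 'product.design']
```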

Multiple parents, materialized paths, and modifications

I won't go into too much detail here, but note that in a hierarchy where nodes can have multiple parents (a DAG), you're materializing every full path to every leaf node. You can see Sally has rows for both gtm.marketing.sally and product.design.sally. Modifications require extra care to maintain consistency across all rows.

For example, if you remove Sally entirely, you have to find all rows related to Sally. If you remove Sally from the product design team, you have to remove only that path. If you were ever to move the design team to the ux org, you'd have to rewrite the relevant rows so that product.design.sally becomes ux.design.sally and product.design.sam becomes ux.design.sam.
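To illustrate the rewrite involved in moving a subtree, here's a sketch that operates on plain path strings in Python. In Postgres this would instead be an UPDATE over the rows matching path <@ 'product.design'; move_subtree is a hypothetical helper, not something ltree provides:

```python
def move_subtree(paths: list[str], old_prefix: str, new_prefix: str) -> list[str]:
    """Rewrite every path at or under old_prefix so it sits under new_prefix."""
    moved = []
    for path in paths:
        # Match on whole labels so 'product.designers' is not caught by 'product.design'.
        if path == old_prefix or path.startswith(old_prefix + "."):
            path = new_prefix + path[len(old_prefix):]
        moved.append(path)
    return moved


teams = ["gtm.marketing.sally", "product.design.sally", "product.design.sam"]
print(move_subtree(teams, "product.design", "ux.design"))
# ['gtm.marketing.sally', 'ux.design.sally', 'ux.design.sam']
```

Every row under the moved node changes, which is where the cost comes from.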

If you had a large company, some changes could update a lot of rows. Even without multiple parents, if you were modeling a large filesystem for example, moving a directory could require rewriting a lot of paths.

So, ltree is great for querying hierarchical data, and in some cases, less great for updating large subtrees (e.g. compared to an adjacency list).
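For comparison, here's a sketch of the adjacency-list alternative, where each node stores only its direct parents and paths are derived on demand. The parents mapping and paths_to function are assumptions for illustration; in SQL the derivation would typically be a recursive query:

```python
# Hypothetical adjacency list: each node stores only its direct parent(s).
parents = {
    "design": {"product"},
    "marketing": {"gtm"},
    "sally": {"marketing", "design"},
    "sam": {"design"},
}


def paths_to(node: str) -> list[str]:
    """Derive materialized paths on demand by walking up the parent edges."""
    ups = parents.get(node)
    if not ups:
        return [node]  # a root has no parents
    return sorted(f"{prefix}.{node}" for up in ups for prefix in paths_to(up))


# Moving the design team to the ux org touches a single entry...
parents["design"] = {"ux"}
# ...and every path below it reflects the move the next time it is derived.
print(paths_to("sally"))  # ['gtm.marketing.sally', 'ux.design.sally']
```

The trade-off flips: the update is one small write, but every read has to walk the graph.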

Directory, groups, grants, and roles

In the current prototype, resources are organized in a filesystem fashion, and backed by a table called directory, which keeps a path for all resources.

> select * from directory;

 id  |         path         |    name
-----+----------------------+-------------
 bp1 | posts.gtm.marketing  | Blog Post 1
 bp2 | posts.product.design | Blog Post 2

Currently, resources are not allowed to have multiple parents, because I'm not sure yet how that would relate to access control and other filesystem operations.

Roles are stored in the database, and each role is an array of actions. Roles have no hierarchy:

> select * from roles;

   id   |   actions
--------+-------------
 editor | {view,edit}
 viewer | {view}

Groups are stored in a hierarchy using ltree paths:

> select * from groups;

      path      | name
----------------+-------
 gtm.marketing  | bob
 gtm.marketing  | sally
 product.design | sam

And finally, the grants table connects all three. Each row grants a principal (user or group) a role (list of actions) on a resource (directory node, could be a "folder" or a resource):

> select * from grants;

   principal   |  role  | resource
---------------+--------+----------
 gtm.marketing | editor | bp1

Access checks use all this information to make a decision:

  1. When a request comes in, the user is loaded from the session.
  2. The user's groups are loaded.
  3. The directory tree for the requested resource is loaded.
  4. Grants related to both the resource's directory tree and the user's groups are loaded.
  5. Roles related to the relevant grants are loaded.
  6. An authorizer checks whether the user or any of their groups has the requested action on the requested resource or any of its parent directories.

For example:

  • Bob wants to edit blog post 1: action: edit, resource: bp1
  • Blog post 1 is stored in the posts.gtm.marketing directory
  • Bob is on the marketing team: user.groups: gtm, gtm.marketing
  • The marketing team has been granted the editor role on the marketing posts directory: grants: (gtm.marketing, editor, posts.gtm.marketing)
  • The editor role includes the edit action: editor: view, edit, etc
  • The authorizer gets the request (bob, edit, bp1)
  • The authorizer pulls all the information together to see that Bob is on the Marketing team, which has been granted the editor role on a parent directory of the blog post, and the role contains the edit action. All good.
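The steps and the Bob example above can be sketched end to end like this. It's a simplified Python model with in-memory dictionaries standing in for the roles, groups, directory, and grants tables, not the actual atlas9 code; prefixes, check, and the constant names are all made up for illustration:

```python
def prefixes(path: str) -> list[str]:
    """'a.b.c' -> ['a', 'a.b', 'a.b.c']: the node plus all of its ancestors."""
    labels = path.split(".")
    return [".".join(labels[: i + 1]) for i in range(len(labels))]


# Hypothetical in-memory stand-ins for the database tables.
ROLES = {"editor": {"view", "edit"}, "viewer": {"view"}}
USER_GROUPS = {"bob": ["gtm.marketing"], "sam": ["product.design"]}
DIRECTORY = {"bp1": "posts.gtm.marketing", "bp2": "posts.product.design"}
GRANTS = [("gtm.marketing", "editor", "posts.gtm.marketing")]  # (principal, role, resource)


def check(user: str, action: str, resource_id: str) -> bool:
    # Step 2: the user plus every group they belong to, including parent groups.
    principals = {user}
    for group in USER_GROUPS.get(user, []):
        principals.update(prefixes(group))
    # Step 3: the resource itself plus every parent directory.
    resources = {resource_id, *prefixes(DIRECTORY[resource_id])}
    # Steps 4-6: find matching grants and look for the action in their roles.
    return any(
        principal in principals and resource in resources and action in ROLES[role]
        for principal, role, resource in GRANTS
    )


print(check("bob", "edit", "bp1"))  # True
print(check("sam", "edit", "bp1"))  # False
```

In a real application, steps 2 through 5 would be database queries that load only the relevant rows, rather than a scan over every grant.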

Cedar

I wrote about Cedar previously, and I'm still interested in using it to capture access rules. In this design, Cedar is the "authorizer" – the application queries all the relevant information (user, groups, grants, directory tree, etc), which it passes off to Cedar to make the final decision.

I spent some time wondering whether Cedar needs to play a role at all. After all, the application and the structure of the information do a lot of the work. You could imagine an authorizer that just flattens all the actions from the relevant grants+roles and searches for the requested action. In fact, that's roughly how most homegrown auth code I've seen works (although with more hard-coded information and less coming from the database). And in the current design, the authorizer is an interface, so the implementation could be swapped out to do just that.
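As a sketch of what that swappable interface and a flattening implementation could look like, here's a Python version. The Request, Authorizer, and FlatteningAuthorizer names and the method signature are assumptions for illustration, and the caller is assumed to have already loaded the role names from the relevant grants:

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Request:
    principal: str
    action: str
    resource: str


class Authorizer(Protocol):
    """Anything with this method can act as the authorizer (Cedar-backed or not)."""

    def is_allowed(self, request: Request, role_names: list[str]) -> bool: ...


class FlatteningAuthorizer:
    """Flattens the actions of the already-loaded, relevant roles and searches them."""

    def __init__(self, roles: dict[str, set[str]]) -> None:
        self.roles = roles

    def is_allowed(self, request: Request, role_names: list[str]) -> bool:
        allowed = set().union(*(self.roles[name] for name in role_names))
        return request.action in allowed


authorizer: Authorizer = FlatteningAuthorizer({"editor": {"view", "edit"}, "viewer": {"view"}})
print(authorizer.is_allowed(Request("bob", "edit", "bp1"), ["editor"]))  # True
print(authorizer.is_allowed(Request("bob", "edit", "bp1"), ["viewer"]))  # False
```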

I think Cedar might still be useful though. There are other rules which fall outside of the structure laid out above, rules like "the owner of a resource has all permissions" or "viewers cannot view draft documents" or "anonymous users can view published posts", etc.

Those rules can be defined in Cedar and linked to the application using Cedar's policy templates:

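// The owner of a resource has all permissions.
permit(
        principal,
        action in BlogPost::Action::"allActions",
        resource is BlogPost
)
when { principal == resource.owner };

// Template: linked principals can view linked resources, except drafts.
permit(
        principal in ?principal,
        action == BlogPost::Action::"view",
        resource in ?resource
)
unless { resource.draft };

// Anonymous users can view published posts.
permit(
        principal == User::"anonymous",
        action == BlogPost::Action::"view",
        resource is BlogPost
)
when { resource.published };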

But those rules could also just be encoded in the application itself. I'm not entirely sure yet whether it will be worthwhile to involve Cedar. Writing these rules in Cedar requires loading the relevant attributes anyway (e.g. loading published and draft into Cedar for processing), so at this time, Cedar feels like an unnecessary indirection. More experimentation is needed, and I need to build some real apps with this to see whether it adds value in the long run.
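For a sense of what encoding those rules in the application might look like, here's a sketch in Python. It's one possible reading of the three rules, not a translation of Cedar's evaluation semantics; BlogPost, extra_rules_allow, and the convention that user is None for anonymous visitors are all assumptions:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlogPost:
    owner: str
    draft: bool = False
    published: bool = False


def extra_rules_allow(user: Optional[str], action: str, post: BlogPost,
                      granted_actions: set[str]) -> bool:
    """`user` is None for anonymous visitors; `granted_actions` comes from grants+roles."""
    if user is not None and user == post.owner:
        return True  # the owner of a resource has all permissions
    if user is None and action == "view" and post.published:
        return True  # anonymous users can view published posts
    if action == "view" and post.draft:
        return False  # a view granted through a role does not apply to drafts
    return action in granted_actions
```

Either way, the owner, draft, and published attributes have to be loaded before the decision can be made.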

Onward

I've been trying hard to make this identity and access control foundation in atlas9 robust and flexible, because it underlies everything else. Every operation needs access checks to keep data safe, and I think both features and users want the flexibility to organize and control access in ways that suit their specific needs.

I hope this design isn't overly flexible to the point of being complicated and/or inefficient. Time to build some features with it and see!