As part of Thali we are trying to enable people to easily build peer to peer apps on mobile devices using PouchDB. A problem we have run into is how to implement ACLs in a way that doesn’t cause developers writing on our platform to tear their hair out. I make a proposal below but I have to admit that I have no idea if it’s right or sensible. But I figure we have to start some place. What do you think? You can share your ideas on this blog or better yet, on Thali's mailing list!
1 A motivating scenario
Imagine we are creating a chat app. I use chat apps a lot because their data exchange patterns are like lots of other apps we are writing and they are easy to understand. In this chat app discussions only exist on devices and are shared typically over local radio, e.g. Bluetooth or Wi-Fi. A chat is private. If someone isn’t invited to the chat then they shouldn’t know about the chat or its contents. Also users can include attachments like photos and videos with their chats. So the size of individual chat messages isn’t necessarily small.
In our scenario we have four users Aviva, Britney, Celine and Doug.
Each user has a public key and all communication (as is always the case in Thali) is over mutual TLS auth.
Our goal is to let the users create multiple “conversations”, invite only the people they want to a conversation and to synch conversations between each other.
2 Creating a conversation
A conversation starts when a user, in this case Aviva, opens her phone and hits “create new conversation”. She then specifies that she wants to converse with Britney and Celine. So Doug shouldn’t be able to see this conversation.
Conceptually a conversation is just a group of records. We can even imagine (I’m assuming you, dear reader, are reasonably conversant with PouchDB and its key structure) that the keys for all messages in the same conversation have a form like [Conversation ID]-[User Id]-[Unique Message ID].
So when Aviva creates the new conversation she can prove she created it by generating a UUID, digitally signing it with her private key and then using the hash (just to save space) of the digitally signed statement as the conversation ID. She can then create a record with the key Conversation-[Conversation ID] (I call this the creation record) as the declaration of the conversation’s existence. In that document she would put her digital signature and probably a list of who is allowed to be part of the conversation, in this case, Britney and Celine.
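A minimal sketch of building the creation record might look like the following. The Ed25519 key type, the SHA-256 hash, the choice to hash the signature bytes, and the field names (uuid, signature, creator, members) are all assumptions made for illustration; none of them is a Thali format.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: builds the creation record for a new conversation.
function createConversation(creatorPrivateKey, creatorKeyHash, memberKeyHashes) {
  const uuid = crypto.randomUUID();
  // Sign the UUID with the creator's private key.
  // Passing null as the algorithm is how Node signs with Ed25519 keys.
  const signature = crypto.sign(null, Buffer.from(uuid), creatorPrivateKey);
  // Assumption: the conversation ID is the SHA-256 hash of the signature bytes.
  const conversationId = crypto.createHash("sha256").update(signature).digest("hex");
  return {
    _id: `Conversation-${conversationId}`,
    uuid,
    signature: signature.toString("base64"),
    creator: creatorKeyHash,
    members: memberKeyHashes,
  };
}

// Usage: Aviva creates a conversation with Britney and Celine.
const { privateKey, publicKey } = crypto.generateKeyPairSync("ed25519");
const record = createConversation(privateKey, "avivaKeyHash", ["britneyKeyHash", "celineKeyHash"]);
console.log(record._id);
```

Anyone holding Aviva's public key can later recompute the hash and verify the signature over the UUID, which is what makes the record a proof of who created the conversation.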
The point is that with this root record it’s possible to prove that it was Aviva who created the group.
For the User ID we can either use each user’s public key or a hash of the key if we want to save space.
The unique message ID is any globally unique value. We could use a count but that could allow Aviva to collide with herself if she has multiple devices that aren’t always in synch. We can order messages by including pointers to the messages they are in response to.
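The key layout and the reply pointers could be sketched roughly as below. The helper names and the inReplyTo field are hypothetical, and the sketch assumes conversation IDs and user IDs are hex hashes, so they never contain a "-" themselves.

```javascript
const crypto = require("node:crypto");

// Builds a key of the form [Conversation ID]-[User ID]-[Unique Message ID].
// A random UUID avoids the collisions a per-user counter could cause
// when one user writes from several devices that are out of synch.
function makeMessageKey(conversationId, userId) {
  return `${conversationId}-${userId}-${crypto.randomUUID()}`;
}

// Splits on the first two "-" only, because the UUID part contains "-" too.
function parseMessageKey(key) {
  const first = key.indexOf("-");
  const second = key.indexOf("-", first + 1);
  if (first < 0 || second < 0) return null;
  return {
    conversationId: key.slice(0, first),
    userId: key.slice(first + 1, second),
    messageId: key.slice(second + 1),
  };
}

// Each message lists the keys of the messages it responds to,
// which orders messages without any shared counter.
function makeMessage(conversationId, userId, text, inReplyTo = []) {
  return { _id: makeMessageKey(conversationId, userId), text, inReplyTo };
}
```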
3 Synching
Now imagine that Doug walks into the room with Aviva and their phones automatically find each other and start synching. Aviva’s phone has to know that it shouldn’t send any records about the conversation with Britney & Celine to Doug since he isn’t on the ACL list. Similarly if Doug somehow found out about the conversation we must not allow him to synch anything with Aviva about the conversation.
To further complicate things imagine that Britney comes into the room after she had previously synch’d the conversation with Aviva. In the meantime Britney had run into Celine and synch’d with her as well. So as Britney walks into the room with Aviva she not only knows about the conversation but has new messages in it both from herself and from Celine. Because Britney is part of the conversation we allow her to forward messages on Celine’s behalf but only for conversations of which both are members. This is essentially an optimization to allow for state to be spread faster. If we really wanted to we could have everyone sign their entries in order to prevent forgeries but canonicalization in NoSQL is a nightmare so we’ll leave that for later. For now just assume that we trust everyone in the same conversation to not lie about each other.
So how do we end up with a situation where people are only allowed to see records they are supposed to see and can’t update records they shouldn’t be able to update?
4 Data modeling
How we solve this problem depends heavily on how we structure our databases.
4.1 One database to rule them all
The simplest database model is that everyone’s phone contains a single database called “Conversations” that contains all data about all conversations. So when Aviva and Britney see each other they both do push/pull synchs to each other’s “Conversations” databases and synch up on data. There are some very nice things about this model. It’s easy for developers to understand. It makes it easy to do all sorts of fun queries using keys. It makes creating views easy since the views have global data to work from.
But what happens when Britney wants to put in a post from Celine into Aviva’s database?
In that case we would probably receive a POST request to “conversations/_bulk_docs” and have to reach into the request body to pull out a record that is trying to add a message in Celine’s name to the conversation. So we have to run an ACL check at that point that decides if Britney has permission to add a record in Celine’s name to Aviva’s conversation database as part of a specific conversation.
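To make that check concrete, here is a rough sketch of digging through a _bulk_docs body. The function names and the membersByConversation lookup (a Map from conversation ID to a Set of public key hashes) are hypothetical; only the { docs: [...] } body shape comes from the CouchDB/PouchDB bulk API. Creation records are not handled here and would need their own rule.

```javascript
// Pulls the conversation ID and user ID out of a message key of the form
// [Conversation ID]-[User ID]-[Unique Message ID].
function parseMessageKey(key) {
  const parts = String(key).split("-");
  return parts.length >= 3 ? { conversationId: parts[0], userId: parts[1] } : null;
}

// Returns true only if every record in the bulk request passes the ACL check.
function authorizeBulkDocs(requesterKeyHash, body, membersByConversation) {
  if (!body || !Array.isArray(body.docs)) return false;
  return body.docs.every((doc) => {
    const parsed = parseMessageKey(doc._id);
    if (!parsed) return false; // unknown key shape: reject
    const members = membersByConversation.get(parsed.conversationId);
    if (!members) return false; // unknown conversation: reject
    // The requester must be a member, and so must the user named in the key,
    // since forwarding is only allowed between members of the same conversation.
    return members.has(requesterKeyHash) && members.has(parsed.userId);
  });
}
```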
4.2 One database per user
It is pretty common in PouchDB/CouchDB land to give each user their own DB. But to be fair when people talk about “one database per user” they are usually also user partitioned. Meaning that each user is their own world with little overlap between users. We aren’t. We have many users in a single conversation.
In this model Aviva’s phone would have four databases, one for Aviva, one for Britney, one for Celine and one for Doug. If Aviva wants a list of all conversations then she has to query across all four databases because anyone could have started a conversation. If she wants to see all the messages in a particular conversation she at least has to look across all the databases of all the members of the conversation since any of them could have contributed some data.
None of this is brain surgery but it does mean that the developer has to do extra work. They have to understand which databases have to be enumerated and make sure they talk to the right ones. They can’t just throw out a “show me all records whose key is Conversation-*” query to find all the “conversation started” records, or a [conversation id]-* query to find all the records in one conversation.
Instead the developer has to make the same queries but make them across all the databases for all the users who have databases on the device.
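The extra work might look something like this sketch. The function name is made up, and it assumes the app already holds one PouchDB instance per user on the device; the startkey/endkey range with a trailing "\ufff0" is the usual PouchDB way to ask allDocs() for every key with a given prefix.

```javascript
// Runs the same prefix query against every per-user database and merges the results.
// `databases` is an array of PouchDB instances (anything exposing allDocs() works).
async function findByKeyPrefix(databases, prefix) {
  const results = await Promise.all(
    databases.map((db) =>
      db.allDocs({ startkey: prefix, endkey: prefix + "\ufff0", include_docs: true })
    )
  );
  // Flatten the per-database row lists into one list of documents.
  return results.flatMap((result) => result.rows.map((row) => row.doc));
}
```

With a single database the same lookup is one allDocs() call. Here the developer also has to keep the list of databases correct as users come and go.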
But I don’t really see this model simplifying anything for us. Again, let’s take the scenario where Britney wants to forward a message in a conversation from Celine. In that case Britney would need to write to Celine’s database on Aviva’s phone. I don’t see this ACL check as being any simpler than the one above.
4.3 One database per conversation
Another approach is to make each conversation into its own database. Superficially this seems to make ACL checks easier. Either you have access or you don’t. If you do have access then you can pretty much do what you want. And if you don’t have access then you can’t. So if Britney has access to the db containing the conversation that Aviva created then she can put any record she wants in there, for Celine or anyone else. All we check is that she has the right to get to the conversation DB. We don’t really care what she does there.
I do wonder, though, how far that goes. Sure we’ll stop Britney from creating views or anything else that is too expensive on Aviva’s device. But what if Britney wants to change the “conversation-[ConversationID]” record for the conversation and so alter the ACLs? Is that o.k.? In our current free for all model it actually would be o.k. Although it will be interesting to see what happens when she removes Aviva from the ACL in the conversation DB!
But this does raise a question about discovery. If Aviva created the conversation on her phone and she runs into Britney, how did Britney know to ask for the DB? My suspicion is that Britney would issue a GET on _all_dbs on Aviva’s phone and we would have to filter the results to only include DBs that Britney has at least read permissions on. That shouldn’t be too hard.
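Filtering the _all_dbs response could be as small as the sketch below. The function name and the readersByDb lookup (a Map from database name to a Set of public key hashes with read access) are hypothetical.

```javascript
// Returns only the database names the authenticated requester may read.
// Databases with no ACL entry at all are hidden rather than exposed.
function filterAllDbs(allDbNames, requesterKeyHash, readersByDb) {
  return allDbNames.filter((name) => {
    const readers = readersByDb.get(name);
    return readers !== undefined && readers.has(requesterKeyHash);
  });
}
```

Hiding databases that have no entry means a missing ACL fails closed, so Doug never learns that a conversation exists.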
But how does a new DB get created? In other words let’s again imagine that Aviva has created a new conversation and so created a new DB to hold it. She walks into the room for the first time with Britney and their phones synch. How does Britney’s phone know that it’s o.k. for Aviva to create a new DB?
This is a really common problem in ACL land, it’s the Turtle issue. There is always a bottom Turtle, or in this case a bottom ACL and we haven’t hit it yet. It means somewhere there is another DB that says who can create new conversations. That DB also probably has some kind of validation function. For example, they will want to check that the “conversation-[ConversationID]” record’s digital signature matches the DB’s name. But honestly that doesn’t seem like the end of the world. In fact we could standardize on the “conversation-[ConversationID]” record and use it to automatically provision ACLs.
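One way the validation function could check that the record matches the DB name is sketched below. It assumes the database is named after the hex SHA-256 hash of the creator's signature, that keys are Ed25519, and that the record carries uuid and signature fields. All of those are illustrative choices rather than anything Thali defines.

```javascript
const crypto = require("node:crypto");

// Returns true if the database name is the hash of the record's signature
// and the signature over the UUID verifies under the creator's public key.
function isValidConversationDb(dbName, creationRecord, creatorPublicKey) {
  const signature = Buffer.from(creationRecord.signature, "base64");
  const expectedName = crypto.createHash("sha256").update(signature).digest("hex");
  if (expectedName !== dbName) return false;
  // Passing null as the algorithm is how Node verifies Ed25519 signatures.
  return crypto.verify(null, Buffer.from(creationRecord.uuid), creatorPublicKey, signature);
}
```

A check like this only proves who created the conversation. Whether that creator is allowed to create conversations at all is still a question for the bottom ACL.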
Note btw that this doesn’t make all our problems go away. Imagine we have a conversation board structure where there are multiple boards and each board can have multiple conversations. Now we have to structure database names to encode which board and conversation they are part of and we have to have multiple levels of ACLs. We need ACLs for who can create boards and ACLs for who can create conversations within a board.
5 So how do we choose?
We are choosing whether to enforce ACLs based on database + method + URL + record key or based on database + method + URL. The main difference in implementation terms is that the latter model, taken from one database per conversation, would spare us from looking inside request bodies for bulk methods like _bulk_docs to pull out individual record keys. From a developer’s perspective there really isn’t that big a difference. They are still making basically the same decision based on the same data. It’s just that some of the data, like conversation identifier, migrates from the record key to the database name. But it’s still the same data and it’s still the same decision.
Honestly I’d strongly prefer to let developers just have one big database. This makes everything from queries to views much easier to do and do right. So my vote would be for using database + method + URL + record key. Yes, this means we have to go digging inside of _bulk_docs but that isn’t brain surgery. It also means we will need filters on _all_docs (or maybe views or maybe filters on views?) but that isn’t brain surgery either.
6 So how does one program ACLs in this model anyway?
From a developer’s perspective they would need to do a few things. First, they would need to provide an ACL update function. This function would most likely be a view on a database. Inside that view would be two types of records, ACL entries and Group Declarations.
So in our example when Britney added Aviva to her app this would cause records to be created/updated in the view stating that:
Key ACLEntry-CreateRecord
Value { “create”: [“GUID1”]}
Key GroupEntry-GUID1
Value { “friendlyName”: “friends”, “members”: [“Aviva’s public key hash”]}
Then when Aviva later tries to send a “conversation-[ConversationID]” record to create a new conversation, Thali would call a developer supplied function passing in (“POST”, “conversations/_bulk_docs”, “Conversation-1234567”, documentBody). The developer supplied code would then determine that this was a request to create a new conversation. It could even use the digital signature in the document body to decide if the request ID is “proper”. If that looks good then the developer supplied function would call the Thali ACL engine with the arguments (“ACLEntry-CreateRecord”, “create”).
Thali’s ACL engine would use the data from the view to see if it can find an ACL with entry ID “ACLEntry-CreateRecord” and with the “create” permission for the requester, in this case, Aviva. The ACL engine is smart enough to do group resolution so it can figure out that Aviva is a member of group GUID1 and therefore has create permission. The ACL engine would then return either true or false depending on what it found, in this case it would return true and let the request continue.
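A toy version of that lookup, assuming the view is available as an in-memory Map keyed by the record keys shown above, might look like this. The checkAcl name and signature are hypothetical, it takes the full entry key, and it only resolves one level of groups.

```javascript
// Hypothetical in-memory stand-in for the ACL view described above.
const aclView = new Map([
  ["ACLEntry-CreateRecord", { create: ["GUID1"] }],
  ["GroupEntry-GUID1", { friendlyName: "friends", members: ["avivaKeyHash"] }],
]);

// Returns true if the requester holds `permission` on the entry `entryId`,
// either because they are listed directly or through a group they belong to.
function checkAcl(view, entryId, permission, requesterKeyHash) {
  const entry = view.get(entryId);
  if (!entry || !Array.isArray(entry[permission])) return false;
  return entry[permission].some((principal) => {
    if (principal === requesterKeyHash) return true; // listed directly
    const group = view.get(`GroupEntry-${principal}`); // otherwise try it as a group ID
    return group !== undefined && group.members.includes(requesterKeyHash);
  });
}

// Aviva is in group GUID1, which has the "create" permission.
console.log(checkAcl(aclView, "ACLEntry-CreateRecord", "create", "avivaKeyHash"));
```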
When the “conversation-[ConversationID]” record is created the view logic would then need to output a new record into the view of the form:
Key ACLEntry-ConversationMembers-1234567
Value { “member”: [“Aviva’s public key hash”, “Britney’s public key hash”, “Celine’s public key hash”] }
Then when, for example, Britney tries to submit a record in a _bulk_docs with Celine’s public key hash, Thali would call the developer’s code with the arguments (“POST”, “conversations/_bulk_docs”, “1234567-[Celine’s key hash]-23”, documentBody). The developer code would in turn call the Thali ACL engine with the arguments (“ACLEntry-ConversationMembers-1234567”, “member”), and the engine would find the matching entry and return true. Notice that we didn’t care that the user ID in the record key was Celine’s. We just cared that the authenticated requester was a member of the conversation.
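Putting the two cases together, the developer supplied function could be sketched as follows. The authorizeRequest name and the aclCheck(entryId, permission) callback are hypothetical stand-ins for whatever Thali would actually expose. The sketch assumes the callback is already bound to the authenticated requester, and it skips the signature check on the document body.

```javascript
// Hypothetical developer callback: maps a request onto an ACL entry and permission.
function authorizeRequest(method, path, recordKey, documentBody, aclCheck) {
  if (method.toUpperCase() !== "POST" || path !== "conversations/_bulk_docs") {
    return false; // this sketch only covers bulk writes to the conversations DB
  }
  // Creation record: "Conversation-[Conversation ID]".
  if (/^conversation-/i.test(recordKey)) {
    return aclCheck("ACLEntry-CreateRecord", "create");
  }
  // Message record: "[Conversation ID]-[User ID]-[Unique Message ID]".
  // Only the conversation ID matters; the user ID in the key is ignored on purpose.
  const conversationId = recordKey.split("-")[0];
  return aclCheck(`ACLEntry-ConversationMembers-${conversationId}`, "member");
}
```

All of the app specific knowledge about key shapes lives in this one function, while the ACL engine only ever sees an entry ID and a permission name.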
For those feeling particularly brave you can check out the httpkey and thaligroup URL schemes for portable ways to identify users and groups.
7 Conclusion
This is a very flexible model but I am really concerned that it is too complex. ACLs are really easy to get wrong and usually with unpleasant consequences. But I just don’t have enough feel for how people want to use Thali to be sure of exactly how we could simplify things without also crippling various scenarios. Feedback is always welcomed, but especially on this topic.