Previously I waxed poetic about the amazing powers of Serval to create mesh based Internet infrastructure for developing and less developed countries (LDCs). The thesis being that if we had meshes of Wifi endpoints that could move data around without charge then people could have local applications on their smart phones that run peer to peer and could take advantage of this infrastructure. But how do we build those apps? That is where Thali comes in. But yes, we need more. See below.
1 Local app development
Various folks are working on the idea of sending trainers to developing countries and especially to less developed countries (LDCs) to train people in how to write software. But write software to do what? Most software development in the developed world is focused on the Internet and people in LDCs don’t have terribly good access to the Internet.
So the idea is to train people to write code that is useful to them locally. If we have the Wifi mesh infrastructure we previously discussed then this provides an infrastructure that applications can run over to enable collaboration and communication. And none of this precludes access to the Internet. The apps should be able to leverage the Internet when it is available but remain fully functional when it’s not. This is beyond the “Offline First” model. This is the “always available” model.
So if I want to exchange a status update with someone who is reachable via my mesh then I should be able to do so even if I’ve run out of Internet minutes or don’t have any Internet access at all.
This is where Thali comes in. It provides a peer to peer infrastructure that enables applications to exchange information over the Wifi endpoints directly without any Internet in a secure/authenticated/authorized manner.
2 Development Environment
Right now Thali is a Cordova plugin that runs on iOS and Android. While iOS is perhaps not ideal for LDCs for cost reasons, Android is. And by being a Cordova plugin Thali lets the developer write their application’s UX in the Webview using HTML/Javascript/CSS. It’s pretty hard to argue against training folks to write code using web technologies. And thanks to its JXcore backend, which runs an old version of Node (Node 0.10 with some features from 0.12) the service logic that enables P2P capabilities is itself written in Javascript. That same node infrastructure is available to the Thali developer for long running service logic. Again, hard to argue against training people in web technologies.
But if we are going to train people to write Thali apps we have to figure out what their development environment is. The first thing is that when the developer is writing code we have to assume they are not connected to the Internet. Even if there is Internet, the bandwidth is likely to be incredibly limited. So however we plan on having developers we train in LDCs write code for Thali, all the tools and infrastructure they need will have to show up on a thumb drive.
Right now my working assumption is that any training done in villages, towns, etc. in LDCs will be done by people who will show up with some basic equipment. They will have the aforementioned thumb drive pre-loaded with all the development tools needed to build a Thali application. They will also show up with some cheap PC hardware to enable people to write code with a keyboard and some kind of reasonably sized monitor. The assumption is that the students will get to keep the PCs when they finish the course.
Yes, we could enable Thali development right on the phone itself. It is doable. The front end is Javascript/HTML/CSS which is all interpreted and the back end is Node which is interpreted. So we could put an editor on the phone and let people just edit the code of an app they side loaded on their Android phone. That way we wouldn’t need those PCs. But writing code on a phone isn’t easy and while I suspect we’ll eventually go there I think we should start with the PC.
So as a practical question the challenge is - what’s on the thumb drive?
This is a variant on the “what books would you take if you were going to be stranded on a desert island?” question.
2.1 Making Thali consumable for normal people
Thali was always intended to be distributed as an NPM package. But it’s not that simple. Thali has code both for NPM as well as for Cordova. Also, at some point, because the community is so small, we stopped even trying to figure out how to make it possible to deploy Thali as a prebuilt package. Right now everyone that uses Thali actually builds Thali on their local machine and then uses their custom built version to create an application.
This model isn’t sustainable because for one thing the Thali build environment is MacOS only and “cheap PC” and “MacOS” don’t go together. In theory the build system should work just fine on Windows and Linux but it isn’t tested very often there.
So the challenge is - do we fix the install system for Thali so it can be installed as it was intended, as an NPM module into a Cordova project? Or do we just build a Cordova skeleton with the Thali bits already installed (we are well set up to do this) and put that on the thumb drive?
2.2 IDE
At a minimum we need Android Studio and the Android SDK. Yes, it’s huge. But we want to enable programmers, not limit them. We are building Android apps so we need to enable them to do so. There are other options though that could let us skip Android Studio and the Android SDK.
For example, we could ship an APK that is a Thali app that just has some tiny default content and a predefined key. We could then enable the dev to take over the app and swap out our content with their own HTML/Javascript/CSS and node_modules. That way they would never need to compile an Android app. But this brings in issues. For example, how do they distribute the app? These problems are solvable of course (here’s a hint, Thali’s heart is PouchDB, a sync engine).
But I can’t avoid thinking that the closer we stay to the chrome the less things are likely to blow up. I also like the idea of empowering the students to do anything they want, not just what we let them.
So starting with Android Studio and the Android SDK seems like a good idea.
The next challenge is - how should the developer edit HTML/Javascript/CSS? What about Node?
There are a ton of open source IDEs that support HTML/Javascript/CSS and Node. Which is best? I honestly don’t know. I can’t stand front end programming so I mostly just need a Node editor and I use IntelliJ and VSCode for that. But whatever. We need to pick something, what should it be?
And keep in mind that this all has to work while the user is 100% offline. So, for example, NPM really, really wants to connect to the net. Thankfully there are work arounds that we already have experience with for other reasons like Sinopia. But are there any other gotchas hiding here with these IDEs that will make them non-functional if they are offline?
2.3 Reference Material
At the bare minimum we have to have:
- Android Docs - Everything the dev needs to know about Android
- Cordova Docs- Thali is a Cordova app after all
- Mozilla Developer Network - Everything the dev needs to know about HTML, CSS and Javascript
- JXcore (read: Node) docs - References for the versions of all the Node APIs as supported in JXcore
- NPM docs- Using Node? You have to use NPM even if we are running it locally.
- Thali Website- I’m not sure how useful the site actually is. Which is kind of sad. But it seems appropriate to include. Note that this is a Github site, the source is at https://github.com/thaliproject/thali. And if we are including this shouldn’t we include all of our main Github source code depots (there are several of them) so that the developer can see the Thali source code and even build their own versions of Thali? This requires a lot of code though. That and currently our build scripts pretty much only work on a Mac.
Is there content we can pull in from Stack Overflow? Also depending on what libraries we decide to include in the next section we will need to pull in their reference information as well.
2.4 Preloaded Libraries
All I know about web development frameworks is that Angular 1.x made me really unhappy. But that’s just me. Should we include Angular? Ember? React? What?
What cordova plugins should we include?
What about NPM? NPM libraries tend to be reasonably small. Maybe we include the top 100 libraries? What about their docs? We would need to automate pulling them in.
3 ACLs and Databases
Thali currently has a very simple data/security model. We support synching one database and our ACL model is all or nothing. Either someone is recognized, in which case we sync all their changes to all records, or they are not, in which case nothing is synced with them. For the kind of applications that our partner Rockwell Automation is writing this is barely o.k. But for the kind of applications that we expect our future developers to write, it just won’t work.
For example, let’s say a developer wants to make a social networking app. Do we really want everyone that a user trusts to be able to sync all the records on that user’s phone, including records from all the other users that user trusts?
Or let’s say that a developer wants to make a market application to let people advertise items or services they want to sell/trade. With the way things work now someone could edit someone else’s ad!
We have to have a more sophisticated model. We’ve known about this problem for a long time in Thali but have been ignoring it. Now we can’t. See here for lots more details.
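To make the market example concrete, here is a minimal sketch of what a per-record check could look like. None of this is Thali's API. The record fields (owner, readers) and the function names are made up for illustration, and identities are just strings standing in for public keys.

```javascript
// Hypothetical per-record ACL check; none of these names are part of Thali.
// Identities are plain strings standing in for public keys.

// A record carries its owner's key and a list of keys allowed to read it.
// The special reader '*' means anyone on the mesh may read the record.
function canRead(record, requesterKey) {
  return record.owner === requesterKey ||
    record.readers.includes('*') ||
    record.readers.includes(requesterKey);
}

// Only the owner may change a record, which is the rule the market app needs.
function canWrite(record, requesterKey) {
  return record.owner === requesterKey;
}

// Apply an incoming change from a peer only if that peer owns the record.
function applyRemoteChange(db, change, senderKey) {
  if (change.owner !== senderKey) {
    return false; // a peer cannot write records owned by someone else
  }
  const existing = db.get(change.id);
  if (existing && !canWrite(existing, senderKey)) {
    return false; // a peer cannot take over an existing record
  }
  db.set(change.id, change);
  return true;
}

const db = new Map();
const ad = { id: 'ad-1', owner: 'keyA', readers: ['*'], text: 'Goat for sale' };
applyRemoteChange(db, ad, 'keyA');
const tampered = { ...ad, text: 'Free goat' };
console.log(applyRemoteChange(db, tampered, 'keyB')); // false
console.log(canRead(db.get('ad-1'), 'keyB'));         // true
```

The point is only that read and write permission get decided per record instead of per peer. Where the check runs and how the ACL itself is protected are exactly the open questions.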
4 Distributing and Updating Applications
So our dev has written an awesome application that is powered by Thali and ready to rock the world. How do they hand it out?
Solving this problem does go back to our previous questions about how we want to build a Thali app. If we empower people with Android Studio then they would be building a stand alone Android APK which can be distributed. The good news is that once some hidden settings are flipped on a phone it can just download and install an APK from anywhere. So if we are going with APKs then we need to think about where the APK is hosted. Do we post it to, and download it from, the Serval infrastructure? Does this mean we run HTTP servers on the Serval endpoints? If so then we have to hook those servers into Serval’s mesh infrastructure so the app images can be moved around by Rhizome.
Another alternative is that people can exchange apps directly via the Wifi endpoint. In which case the HTTP server would be on the phone and we already have one of those. But it means that people can only get the apps from someone with the app instead of posting it to Serval infrastructure. Or maybe we do both?
But now how do we update the APK? I’m not even sure if the APK system is smart enough if we side load to recognize that an APK is a new version of an older APK it already has. We need to test that.
Another alternative is to use a system like F-Droid. It is an open source project that is intended to allow for the managed distribution of applications on Android. It supports local Wifi. It understands how to authenticate apps. It understands versioning/updates. But now we are requiring users to download this one app, F-Droid, in order to get a store that can be used locally to get all the other apps. Is that too big a barrier to entry? Also how do we hook F-Droid up to Serval’s infrastructure? Keep in mind that the Serval Mesh Extender endpoints are really just OpenWRT Linux. So we can do things like run an HTTP server there and possibly connect that server to Rhizome so that F-Droid sees something it might recognize. I suspect we would still have to teach F-Droid how to look for these HTTP servers not to mention hooking them up to Rhizome.
A third option is that we have a single Thali application and inside of it are basically sub-applications, each one containing the Javascript/CSS/HTML files for that specific app. Now we can distribute apps as just synching. But this brings up some very ugly security issues since we don’t have good ways to isolate node apps in particular from each other. Do we really want to create our own sand box system or use Android’s system which is based on having separate apps? Btw, even if we decide to make the apps into sub-apps and use sync to distribute there are still real challenges. What happens if we are getting an application update and the sync fails 1/2 way in? Now the existing app as it exists on our phone is in a 1/2 baked state. We can play versioning games with folders to fix this but then we need a way to validate if a folder is complete (easy enough btw) and we need to eat up extra space. Still really doable but we should be aware of the challenge.
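Here is a rough sketch of the folder versioning game, assuming each app version syncs into its own folder (v1, v2, ...) along with a manifest that lists every file and its SHA-256 hash. The manifest layout and the function names are my own invention for illustration, not an existing Thali format.

```javascript
// Sketch only: the manifest layout and function names are assumptions.
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');

function sha256(buffer) {
  return crypto.createHash('sha256').update(buffer).digest('hex');
}

// manifest: { version: 2, files: { 'index.html': '<sha256 hex>', ... } }
// A version folder is complete when every listed file exists and matches.
function isVersionComplete(versionDir, manifest) {
  return Object.entries(manifest.files).every(([name, expected]) => {
    const file = path.join(versionDir, name);
    return fs.existsSync(file) && sha256(fs.readFileSync(file)) === expected;
  });
}

// Pick the newest complete version; half-synced folders are skipped.
function pickRunnableVersion(appDir, manifests) {
  const complete = manifests
    .filter((m) => isVersionComplete(path.join(appDir, 'v' + m.version), m))
    .sort((a, b) => b.version - a.version);
  return complete.length > 0 ? complete[0].version : null;
}

// Demo: v1 is fully synced, v2 was interrupted before app.js arrived.
const appDir = fs.mkdtempSync(path.join(os.tmpdir(), 'subapp-'));
fs.mkdirSync(path.join(appDir, 'v1'));
fs.mkdirSync(path.join(appDir, 'v2'));
fs.writeFileSync(path.join(appDir, 'v1', 'index.html'), '<h1>v1</h1>');
fs.writeFileSync(path.join(appDir, 'v2', 'index.html'), '<h1>v2</h1>');

const manifests = [
  { version: 1, files: { 'index.html': sha256(Buffer.from('<h1>v1</h1>')) } },
  { version: 2, files: {
    'index.html': sha256(Buffer.from('<h1>v2</h1>')),
    'app.js': sha256(Buffer.from('console.log(2)')),
  } },
];
console.log(pickRunnableVersion(appDir, manifests)); // 1
```

The old version keeps running until the new folder checks out, which is where the extra space goes. Once the new version is complete the old folder can be deleted.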
5 Leveraging Serval
I expect that before you read this section you first will go and read my overview on Serval and how it relates to Thali.
There, having read that now we can talk some details. The first issue is - do we try to use IP or Rhizome over Serval?
If we knew we were building a tiny mesh I would be tempted to use IP bridged over MDP/MSP. Right now, for example, if one sends a multicast UDP packet it won’t be forwarded out over the IP to MDP bridge but we could fix that. But I think the IP approach is a mistake. The problem is that Thali’s existing discovery mechanism (using beacons) was strictly intended for point to point hyper local discovery and/or discovery that we knew would only involve a relatively small number of devices. But that isn’t the case here. We are intentionally trying to build an infrastructure that will scale to at least the hundreds of thousands of people (eventually).
To scale to that level we can’t use a discovery mechanism, such as beacons, which depends on everyone advertising who they are looking for and then expecting every person in the system to examine every advertisement (including multiple beacons and lots of expensive crypto) to see if the advertisement is for them.
So we have two (not necessarily mutually exclusive) choices. One choice is to focus on low latency meshing and adapt a system like Serval’s Cooee to let us discover whenever someone we want to talk to is currently available. We would need to play some tricks so that Cooee doesn’t turn into a way to discover people’s location. For example, if everyone has a permanent ID then when I want to find someone I can try to send them a message via Cooee and see where it goes. A much better solution is to have a constantly rotating ID. But then when I want to find someone I have to know their current ID which requires a separate discovery mechanism. This is all doable but I would argue that we would be better off focusing on store and forward. Other than for real time audio/video we can use Serval’s store and forward service Rhizome to meet all our needs.
The real question is - how do we make Rhizome work? There are two questions we have to answer.
First, how do we address Rhizome packages?
Second, how do we hook up Thali to Rhizome?
For the first question I suggest for the moment we stick with simplicity even though it is not secure and we will have to do better in the future. For now I think packages should contain the public key of the sender and receiver. We can even include a beacon to prove that the sender is who they say they are and that the receiver field wasn’t altered and can produce the session key used to encrypt the content. This isn’t a good long term solution because it exposes too much traffic analysis information but it’s a good place to start.
For the second question we need to change how we think about PouchDB. Normally we just focus on real time data synchronization. That is, after all, what CouchDB’s protocol (the one PouchDB uses) is designed for. But that doesn’t work in a store and forward mesh like Serval. So instead what we have to do is switch to a stream based sync protocol.
Imagine that user A updates their local PouchDB with some content that is addressed to user B. For simplicity sake let’s assume that user A has never talked to user B before (at least in the context of the current application). In that case user A would run a filter on their PouchDB change log starting at the first record (since they haven’t talked to user B before) and filtering out any records that user B doesn’t have permission to see. The result will probably be the single record that user A wanted to send to user B. We would then output this record (and any associated attachments), combine it with a header containing a (probably encrypted) change ID (e.g. what record in the change log this lookup got to) and this would then be encrypted, signed and stuck into a Rhizome package with a manifest identifying user A as the sender and B as the receiver.
Later on when user B is looking for Rhizome packages whose manifests are addressed to them they will pull down user A’s package and, after validating it, replay its contents against their local DB. Once this is done user B would create a new Rhizome package addressed to user A that just confirms they received and executed the content. That confirmation package from B to A would include the change ID that user A sent in their package.
When user A gets the confirmation package then user A would record that user B has confirmed it has received content up to that change ID. That way in the future when user A has new content they will only need to send content from the change ID that user B confirmed.
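The exchange above can be sketched in a few lines. This is a toy model under loud assumptions: the change log is an in-memory array instead of PouchDB, the package layout and every name here are hypothetical, and the encryption, signing and Rhizome manifest steps are left out entirely. In a real implementation the filtering step would presumably sit on top of PouchDB's changes feed with its since and filter options.

```javascript
// Sketch only: package layout and all names are assumptions, and the
// encryption, signing and Rhizome manifest steps are left out entirely.

// User A's change log: each entry has a sequence number and a reader list.
const changeLog = [
  { seq: 1, doc: { _id: 'note-1', text: 'private' }, readers: ['keyA'] },
  { seq: 2, doc: { _id: 'msg-1', text: 'hello B' }, readers: ['keyA', 'keyB'] },
  { seq: 3, doc: { _id: 'msg-2', text: 'hello C' }, readers: ['keyA', 'keyC'] },
];

// Last change ID each peer has confirmed; missing means never talked before.
const confirmed = new Map();

// Build the package A would hand to Rhizome for one receiver.
function buildPackage(sender, receiver) {
  const since = confirmed.get(receiver) || 0;
  const docs = changeLog
    .filter((c) => c.seq > since && c.readers.includes(receiver))
    .map((c) => c.doc);
  // The change ID records how far through the log this lookup got.
  const last = changeLog.length ? changeLog[changeLog.length - 1].seq : since;
  return { sender, receiver, changeId: last, docs };
}

// B replays the docs against its local DB and answers with a confirmation.
function receivePackage(localDb, pkg) {
  pkg.docs.forEach((doc) => localDb.set(doc._id, doc));
  return { sender: pkg.receiver, receiver: pkg.sender, confirms: pkg.changeId };
}

// A records how far B has gotten so later packages start from there.
function receiveConfirmation(confirmation) {
  confirmed.set(confirmation.sender, confirmation.confirms);
}

const dbB = new Map();
const first = buildPackage('keyA', 'keyB');
receiveConfirmation(receivePackage(dbB, first));
console.log(first.docs.length); // 1

changeLog.push({ seq: 4, doc: { _id: 'msg-3', text: 'again' }, readers: ['keyA', 'keyB'] });
console.log(buildPackage('keyA', 'keyB').docs.map((d) => d._id)); // [ 'msg-3' ]
```

Note that the first package carries only the one record B is allowed to see, and the second lookup starts after the change ID B confirmed rather than at the beginning of the log.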
This raises the obvious question - what happens if user A sent a package to user B starting at change ID 0, doesn’t get a confirmation and now wants to send new information to user B? In theory we could use the versioning capability of Rhizome packages to replace the original package to B with a new package to B containing the updated information. We could even be really smart about this. For example, we could include the original package content (so we don’t have to recalculate it) and then append our new content. But this would minimally require downloading the original content again which isn’t cheap or easy. And it brings up questions regarding the package format: how does it handle these kinds of multipart messages? And how do we secure it all? So more likely is that A will output a new package with a reference to the previous package specifying they have to be run in order.
Now Rhizome package delivery is best effort. It’s perfectly possible that the first package that A sent to B got lost and that the second package arrives. In that case B has some options. One option is to play the content in the second package. It will be incomplete of course but it could still be useful. But equally likely is that B has to generate a package back to A telling A that B hasn’t been able to find the first package so A needs to try again.
And yes, this is messy. And yes it’s going to require a careful state machine and some non-trivial code. But it isn’t brain surgery. If we get the states right then we will be fine and the hard part will be deciding on optimizations once we have the basics right.
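To give a feel for the receiving side, here is a minimal sketch of one of B's options: hold back a package whose predecessor hasn't arrived and ask A to resend the missing one. The package fields (id, previous) and the return values are assumptions made up for this example, and validation and decryption are skipped.

```javascript
// Sketch only: package fields (id, previous) and the return values are
// assumptions; real packages would also be validated and decrypted first.

class Receiver {
  constructor() {
    this.applied = new Set(); // ids of packages already replayed
    this.waiting = new Map(); // missing previous id -> package held back
    this.log = [];            // docs replayed, in order
  }

  // Returns 'applied', 'duplicate', or a resend request for the missing id.
  receive(pkg) {
    if (this.applied.has(pkg.id)) return 'duplicate';
    if (pkg.previous !== null && !this.applied.has(pkg.previous)) {
      this.waiting.set(pkg.previous, pkg);
      return { resend: pkg.previous };
    }
    this.apply(pkg);
    return 'applied';
  }

  apply(pkg) {
    this.log.push(...pkg.docs);
    this.applied.add(pkg.id);
    // Applying this package may unblock one that arrived too early.
    const next = this.waiting.get(pkg.id);
    if (next) {
      this.waiting.delete(pkg.id);
      this.apply(next);
    }
  }
}

const b = new Receiver();
const p1 = { id: 'p1', previous: null, docs: ['first'] };
const p2 = { id: 'p2', previous: 'p1', docs: ['second'] };
console.log(b.receive(p2)); // { resend: 'p1' }
console.log(b.receive(p1)); // applied
console.log(b.log);         // [ 'first', 'second' ]
```

The other option, replaying the second package right away and living with incomplete data, would just skip the waiting map. Which one is right probably depends on the application.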
Another fun question is - do we turn off Thali’s Wifi discovery/CouchDB support if we are using Serval? My guess is that the answer is no. If someone we want to exchange information with is right there on the same endpoint there is no need to push everything through Rhizome. But it is an interesting question.
6 Enabling Identity Exchange
In both Serval and Thali, identities are public keys. But how do we exchange public keys in the first place? In some cases this could be done through a third party. For example, if two users have both downloaded the same public chat app they might get a key through that. Of course the key can’t be trusted, but what can? In other cases the local village or refugee camp might actually host a telephone book of registered folks as an online service. That can also be used (and not trusted). But if we want high trust then two people need to exchange their keys directly. Thali has some old code that isn’t updated to our latest infrastructure that can support identity exchange using codes. I actually think this is the easiest to understand approach. You can see the spec for how it works here. But I see more and more folks in this area trying to use QRCodes instead, mostly because they don’t require any kind of radio and are more resistant to attacks. We had real trouble scanning QRCodes with a useful amount of data in the past but honestly we were trying to do the QRCode scanning via a Javascript based scanner. So it’s probably worth looking at the Cordova QRCode scanner plugins and seeing if we can make that fly instead.
7 The Multi Thali App Challenge
If someone only has a single Thali app on their phone then life is pretty straight forward. They run the app which is always running its Thali node.js service in the background. In a Thali app there are certain basic things we do. One of them is exchange identities which is used to build up a phone book. Now if there is just one Thali app on a phone, all fine and good. But what if there are two or three Thali apps? Do they each have a different phone book? That seems kind of silly.
And wait, it gets better. How many identities does the user have? Do they have a different identity for every app? Because if they don’t then we have to share a user’s private key between apps in order to let them use the same identity across apps. No problems there right. :)
In addition there is a resource issue here. If one has two or three Thali apps on a phone then are they all running Node in the background, pinging the local Serval mesh extender end points? That is probably not ideal from a battery perspective.
In Thali we have always talked about the Thali Device Hub (TDH). This was a vision of having a single trusted application that would hold the user’s private key and would handle all network traffic. It would act as a neutral place to keep data so that the same data (such as phone books but also other information such as messages, calendar events, etc.) would be available to all apps with the right permissions on the phone.
Now to be fair this problem relates to our application distribution model. If we decide to make each Thali app a stand alone APK (mostly to benefit from Android’s security model) then we have this problem. If we decide to go with the “sub-app” model where Thali apps are just directories with web and node content that are all loaded from the same central application then this problem is less pressing but the model itself is ridiculously insecure.
But even if we build the TDH how do we make it work? Do we program Thali APKs to be stand alone and only if there is a TDH on the phone would we use the TDH? How do we recognize the TDH? If the user first downloads a Thali application, creates an identity and then downloads a TDH do we transfer the private key from the app to the TDH? Do we instead use a key mesh where the TDH generates a different key and we then produce an attestation where the keys say they trust each other? That latter approach isn’t as nuts as it sounds. For example, if a user has 3 Thali apps and we don’t have a TDH anywhere then in theory the 3 apps could point at each other with reciprocal links they could publish to people in their application specific networks and inform them of the uniformity of identity. Or, put in something slightly closer to English, app A and B on the same user’s phone could each generate a signed statement where App A says “I public key A am the same identity as public key B” and vice versa. If a different user gets both assertions then (and only then) can they treat the identities in Apps A and B for that user as being the same. We could even support sync right on the device where apps can share their phone books with each other. In other words instead of having a single global TDH we just build a little sync mesh right on the device. The benefit of this approach is that it doesn’t require the user to download and configure yet another application.
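The reciprocal assertion idea is easy to sketch with Node's built in crypto module. Big caveats: this uses Ed25519 keys and the sign/verify calls from a current Node release, not what JXcore's Node 0.10 offers, and the statement format and function names are assumptions for illustration rather than anything Thali defines.

```javascript
// Sketch only: the statement format is an assumption, and this uses the
// crypto module of a current Node release, not the JXcore (Node 0.10) one.
const crypto = require('crypto');

function newIdentity() {
  const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
  const id = publicKey.export({ type: 'spki', format: 'der' }).toString('base64');
  return { id, publicKey, privateKey };
}

// "I, public key `self`, am the same identity as public key `otherId`."
function assertSameIdentity(self, otherId) {
  const statement = JSON.stringify({ self: self.id, sameAs: otherId });
  const signature = crypto.sign(null, Buffer.from(statement), self.privateKey);
  return { statement, signature };
}

// Check that the assertion was signed by the key it names as `self`.
function verifyAssertion(assertion) {
  const { self, sameAs } = JSON.parse(assertion.statement);
  const key = crypto.createPublicKey({
    key: Buffer.from(self, 'base64'), format: 'der', type: 'spki',
  });
  const ok = crypto.verify(
    null, Buffer.from(assertion.statement), key, assertion.signature);
  return ok ? { self, sameAs } : null;
}

// Treat two keys as one identity only when both directions verify.
function sameIdentity(assertionFromA, assertionFromB) {
  const a = verifyAssertion(assertionFromA);
  const b = verifyAssertion(assertionFromB);
  return Boolean(a && b && a.self === b.sameAs && b.self === a.sameAs);
}

const appA = newIdentity();
const appB = newIdentity();
const mallory = newIdentity();
console.log(sameIdentity(assertSameIdentity(appA, appB.id),
  assertSameIdentity(appB, appA.id)));    // true
console.log(sameIdentity(assertSameIdentity(mallory, appA.id),
  assertSameIdentity(appA, appB.id)));    // false
```

The second call shows why both directions are required. Anyone can claim to be the same identity as app A, but only app A can sign the statement that points back.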
8 Conclusion
So clearly there is a lot of road between here and a completely thought out and fully functional peer to peer web environment. But none of the problems above require new science. They just require some thought and some code. We know how to do that. So let’s get started!