Joining Hands and Singing Merrily Part 1

When I came up with the abstract for the talk that this blog post is being written in preparation for, one of my peers offered the criticism that my writing might be too optimistic and cheerful. Despite the "kumbaya" message of the title, we don't want anyone to think that any of my colleagues or I might be lacking in the cynicism department. Joking aside, I think it's really important, as a developer, to retain a feeling of optimism about the state of security in software, while remaining grounded by an understanding of our own fallibility. Nonetheless, I firmly believe that we can do better, even if we aren't NASA engineers. In this series of blog posts, we'll be exploring all of the ways that the XFIL team at Stratum have been working to put into practice everything that is frequently preached about writing higher quality, more secure software. What would software development look like if security really was more than an afterthought?

In this first post, we’ll start by describing our XFIL project: what it is and what kind of security concerns it faces. We’ll also get the ball rolling on our discussion by describing the XFIL team’s design and review process. In future posts, we’ll look at the technology stack we are using, including our choices of programming languages, databases, and so on, as well as the two major ways we deal with authentication and the cryptography that makes that possible. Later, we’ll look at how we distribute authorization responsibilities throughout our system and make strong assertions about access rights using proof-of-work protocol designs. We’ll then close everything out by talking about how we handle access controls using capability-based security.

XFIL - A Case Study

For our exploration, we are going to use Stratum Security's XFIL project, of which I am the lead developer, as a case study to see what a security-focused development process looks like in the real world. There is already an abundance of excellent material on the subject, but I believe our experience can ground it in the context of everyday software development.

XFIL is a tool for simulating attacker behavior, post-compromise. With an agent installed on a machine in a client's network, it will attempt to mimic the kinds of techniques a malicious actor might use to exfiltrate (in our case, fake) sensitive information out of the client's network and into our own ingestion services. The results of these tests will be collected and later analyzed in order to provide a scored security assessment. While that about covers the gist of what XFIL is, let's briefly look at what kind of security concerns we have regarding a couple of the major parts of our system.

Users

First and most obviously, we have a responsibility to protect our users. This means protecting their accounts as well as protecting them against client-side attacks. Broadly, it covers all of the usual web application security considerations that we see in lists like the OWASP Top 10.

Agent Identities and Data Integrity

We want our agent software to be as self-driven as possible. Due to the consequences of the next point, we have assumed slightly more risk with regard to maintaining the security of an agent's identity during the course of a test. However, we would still like to provide the strongest guarantees possible that we have implemented measures to prevent attackers from impersonating an agent or affecting the client's test results.

Dealing With Insecure Protocols

A big part of the tests that an agent will perform involves using inherently insecure protocols such as FTP and SMTP. As a result, we have to be particularly aware of what data we transmit over such protocols, and anticipate that any and all information going over the wire could be harvested and used to attack us or the application, or to modify or influence the results of a test.

The Design Process

Over the course of building XFIL, about seven services of varying sizes have been developed from the ground up, with several more sub-services. Through the process of designing and implementing each service, we've refined a robust design process that has repeatedly made development very streamlined and well-directed. The process itself is not particularly unusual, but we do add emphasis to a couple of parts that a lot of development shops probably don't use at all or severely underestimate the value of.

Feature Lists and User Stories

Anyone who's done a course in software engineering is sure to recall hearing about user stories at length and might even, if their experience was like my own, remember hearing a lot about user stories as being a particularly effective tool for communicating and formalizing requirements with a client. During the beginning stages of the design of any service or tool, our team makes a point of getting our stakeholders together and brainstorming a list of things that whatever we set out to build has to be able to do. At first, this is a very barebones list that might include things like

  • Allow for new test results to be uploaded by an agent
  • Allow for users to retrieve a collection of test results

and so on. From there, we move on to describing user stories and try to outline the steps that an actor might take to invoke the features we've listed.

Threat Modeling and RFCs

This is where things get a lot more interesting. Threat modeling is a process that I might define as something along the lines of

Decomposing an application into logical units and identifying points of potential weakness that could be exploited by an attacker with a given set of capabilities, and then determining means of eliminating high-risk vulnerabilities and mitigating any that cannot be eliminated.

Of course, this is not a perfect science, nor is it guaranteed to result in a completely vulnerability-free design. At the very least, it should be a part of a rigorous process of understanding the risks a design might face and using that understanding to iterate on your design until you reduce the threat to a tolerable level.
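One way to keep that iteration honest is to record each identified threat next to the component it affects and check which ones are still above the tolerable level. The following is only a minimal sketch in Python: the `Threat` record, the `Risk` levels, the `unresolved` helper, and the three example entries are all hypothetical and are not taken from XFIL's actual threat model.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Threat:
    component: str        # logical unit of the application
    weakness: str         # point of potential weakness
    attacker: str         # capabilities the attacker is assumed to have
    risk: Risk
    mitigation: str = ""  # empty until the design addresses the threat


def unresolved(threats, tolerable=Risk.LOW):
    """Return threats above the tolerable risk level that have no mitigation yet."""
    return [t for t in threats
            if t.risk.value > tolerable.value and not t.mitigation]


# Hypothetical entries, invented purely for illustration.
threats = [
    Threat("result upload", "data sent over FTP can be read on the wire",
           "passive network observer", Risk.HIGH),
    Threat("web dashboard", "session theft via cross-site scripting",
           "remote web attacker", Risk.HIGH,
           mitigation="output encoding and HttpOnly cookies"),
    Threat("agent installer", "verbose log output",
           "local user", Risk.LOW),
]

for t in unresolved(threats):
    print(f"needs another design iteration: {t.component}: {t.weakness}")
```

A list like this is no substitute for the discussion itself, but it gives the team a concrete answer to "are we done iterating yet?"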

Ideally, your threat modeling process should be informed by at least three parties. Sometimes two will suffice, depending on the overlap in skill set, but you should really try to get the following three people at the table for this.

  1. Someone who deeply understands clients' needs and the kinds of risks it may be acceptable to assume
  2. Someone with a focus on security, ideally security specific to the space you are working in
  3. Someone familiar with the system architecture and who can suggest improvements to the existing design

Once our team has done enough back-and-forth about a design, I'll usually follow up by producing a document that describes the finalized design and summarizes the discussion that led to the result being described. I find it is often best to post this somewhere like Google Docs, where it will be readily available for other members of the team to review and comment on. This gives everyone another chance to ask questions, raise concerns, and suggest improvements before moving on.

Specifications, Implementation, and Testing

Software engineering guru Joel Spolsky argues in a four-part essay that producing specifications for your software up front can, among other things,

  1. Reduce the development time of the project and
  2. Result in a higher-quality, more elegant solution to your problem

I believe he is absolutely correct in his assertions, and have found that writing a specification up front significantly adds to my confidence in the quality and correctness of the code I write. Once a decent specification has been knocked out, it serves as an "unquestionable authority" of sorts, and so when a bug is detected or some unusual behavior is observed and the question "what should this actually be doing?" comes up, the answer is almost always "what the spec says."

Even better, it's often possible to translate a specification directly into executable code by way of writing tests. I personally believe strongly in the value that good tests have to offer. Robert (Uncle Bob) Martin argues strongly for writing tests first and makes many a good case about the safety guarantees they can buy you. Good tests make me more confident that I can make changes to my code without breaking things and that, when I think I'm done with something, I'm really done.
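To make the idea of turning a specification into tests concrete, here is a minimal sketch in Python using the standard `unittest` module. The two-rule spec, the `validate_upload` function, and the payload fields (`agent_id`, `results`) are invented for illustration and do not describe XFIL's real upload format.

```python
import unittest


def validate_upload(payload):
    """Hypothetical validator for an agent's test-result upload.

    Spec (invented for illustration):
      1. A payload must carry a non-empty string 'agent_id'.
      2. A payload must carry a non-empty list 'results'.
    """
    agent_id = payload.get("agent_id")
    results = payload.get("results")
    if not isinstance(agent_id, str) or not agent_id:
        return False
    if not isinstance(results, list) or not results:
        return False
    return True


class UploadSpecTest(unittest.TestCase):
    """Each test restates one rule of the spec as an executable check."""

    def test_rule_1_agent_id_is_required(self):
        self.assertFalse(validate_upload({"results": [{"protocol": "ftp"}]}))

    def test_rule_2_results_must_be_non_empty(self):
        self.assertFalse(validate_upload({"agent_id": "a1", "results": []}))

    def test_valid_payload_is_accepted(self):
        payload = {"agent_id": "a1", "results": [{"protocol": "ftp"}]}
        self.assertTrue(validate_upload(payload))


# Run the suite without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(UploadSpecTest)
result = unittest.TextTestRunner().run(suite)
```

Naming each test after the rule it covers keeps the mapping between the spec and the code easy to audit: when a rule changes, the test that has to change with it is easy to find.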

I have found that a lot of developers will say that they don't need documentation or specifications because "their code documents itself." This, to me, signals a deeply flawed understanding of the purpose of documentation. It's absolutely a given that your code should be written in such a way that anyone who reads it would be able to understand what a piece of code does and how it does it, but that is, at best, half of the story. Sometimes an argument is made that tests are not necessary because you should be writing small functions that are easy to reason about, or because there is a compiler or type system being used that will catch errors. Assertions like these against testing and documentation really miss the point and fail to acknowledge several key facts.

  1. Even the best developers write code that contains bugs
  2. Over time, people make mistakes and forget things
  3. People are not very good at keeping large amounts of information in their heads
  4. Documentation doesn’t exist just for other developers

In security, forgetting these facts can be a big deal. Write docs. Write tests.

Coming Up...

In the next part of this series, we'll take a look at the choices of programming languages we've made and how we're using them, as well as some of the complementary software we use for testing and deployment.

Stratum