Over the past few months I’ve gotten increasingly interested in REST services. Maybe it’s the beautiful simplicity of the concept, or the fact that I can get obsessive about things like URL schemas and HTTP verb usage, but the technology has really gotten my attention. In addition to spending way too much time reading about the merits of POST vs. PUT when uploading content, or when you should (or shouldn’t) put the version number of your API in the URL, I’ve also gotten into authentication and authorization of requests.

Software security in general has always been a topic that interests me enough to want to read up and self-educate, but I’ll also admit to never being too interested in all the gory details. So while working on the REST API that I’ve been playing with at work, I was hoping to find something that would satisfy the following authentication/authorization requirements:

  • making sure the calling user is who they say they are
  • having some piece of data that uniquely identifies the calling user
  • supporting multiple authentication services (Live ID, Yahoo, Google, etc)

A couple of months ago I did some initial reading on OAuth, which looked rather promising. But while attending a session on OAuth at Web 2.0 in San Francisco, I got scared away from the technology when the speaker said something to the effect of “OAuth should be killed”, not a ringing endorsement from someone who was much more knowledgeable about the technology than I am. I looked into Facebook Connect a little, then someone at work told me about Windows Azure Access Control Service (ACS). A quick read through some docs on the codeplex site and other sources on the web had me intrigued. Not being one to read lots of documentation (or instructions in general), I jumped into trying to integrate ACS into my REST API, which consists of read/write operations for files and has some minimal user management capabilities too.

The results are promising. Once I got my head wrapped around the various components and their levels of interaction (which are described in the sequence diagrams on the codeplex site), the integration with my service wasn’t too painful. I did find that the web examples were a little lacking, though, specifically around browser-based (HTML/JavaScript) clients interacting with a site or service that uses ACS for authentication + authorization. After some experimenting, I got the whole thing working by doing the following (I may end up posting the code at some point if there’s interest, but hopefully a description is good enough for starters):

  • Creating a new Access Control Namespace (done via the Azure Management site)
  • Configuring the new Access Control Namespace for my service
    • Picking the identity providers that I wanted to use (Live ID, Yahoo and Google)
    • Adding my service as a Relying Party Application
    • Configuring my service’s Relying Party Application settings, which includes:
      • Setting the Realm and Return URLs (I used the default.aspx page for both; this is where ACS will redirect the browser upon completion of authentication via the identity provider and authorization token minting via ACS).
      • Setting the Token format (I used SWT since it’s a little more web-friendly due to it being plain text and not XML)
      • Setting the Rule groups (this is where I tell ACS what I want it to add to the SWT token that comes back to my service upon successful authentication)
      • Setting the Token Signing Key (I’ll need this in my application so it can verify the signature on the SWT token that ACS passes back upon successful authentication; the token is signed, not encrypted)
  • Adding a link from my site to an ACS-hosted page that allows the user to select their preferred authentication provider and then redirects the user to the provider’s login page. The ACS management page generates this link in the Development -> Application Integration page; just copy/paste the link into a page on your site and ACS does the rest.
  • Adding parts of the shared code (that’s in the Management\ManagementService\Common directory of the downloadable sample code from codeplex) to my service. This code:
    • Gets SWT token in the incoming request
    • Saves the SWT token in cookie to make it accessible to subsequent requests on my service
    • Checks for authorization status in the SWT token. A token contains valid authorization status if it has:
      • The same HMAC signature that was generated by ACS when it signed the token (this is where my service code needs the Token Signing Key)
      • Non-expired data
      • The issuer is trusted (this is configured in my service code)
      • The audience is trusted (this is configured in my service code)
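
The four checks above can be sketched in a few lines. My service is WCF and uses the codeplex sample code, so this is not that code; it’s a minimal Python sketch under some assumptions about the SWT format as I understand it: the token is a form-encoded string, the signature is the last pair (HMACSHA256), it is computed over everything before &HMACSHA256=, and the signing key is base64 text. The function names and the build_swt helper (a local stand-in for what ACS does) are made up for illustration.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import parse_qsl, quote, unquote, urlencode

SIGNATURE_MARKER = "&HMACSHA256="  # assumed name of the signature pair


def sign_swt(unsigned_token, key_b64):
    """Return the base64 HMAC-SHA256 signature of the unsigned token text."""
    key = base64.b64decode(key_b64)
    digest = hmac.new(key, unsigned_token.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def build_swt(claims, key_b64):
    """Hypothetical stand-in for the issuer, useful for trying the validator locally."""
    unsigned = urlencode(claims)
    return unsigned + SIGNATURE_MARKER + quote(sign_swt(unsigned, key_b64), safe="")


def validate_swt(token, key_b64, trusted_issuer, trusted_audience, now=None):
    """Return the token's claims as a dict, or raise ValueError if any check fails."""
    unsigned, marker, signature = token.rpartition(SIGNATURE_MARKER)
    if not marker:
        raise ValueError("token has no HMACSHA256 signature")

    # 1. Signature: recompute it with the shared key and compare in constant time.
    expected = sign_swt(unsigned, key_b64)
    if not hmac.compare_digest(expected, unquote(signature)):
        raise ValueError("signature mismatch")

    claims = dict(parse_qsl(unsigned))

    # 2. Expiry: ExpiresOn is assumed to be seconds since the Unix epoch.
    now = time.time() if now is None else now
    if int(claims.get("ExpiresOn", "0")) <= now:
        raise ValueError("token expired")

    # 3 and 4. Issuer and audience must match what the service is configured to trust.
    if claims.get("Issuer") != trusted_issuer:
        raise ValueError("untrusted issuer")
    if claims.get("Audience") != trusted_audience:
        raise ValueError("untrusted audience")

    return claims
```

The signature check comes first on purpose: nothing inside the token (expiry, issuer, audience) is worth reading until the service knows the token hasn’t been tampered with.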

With all of this, I have a WCF REST service that leverages ACS as the authentication/authorization provider. I have the code set up to allow unauthenticated reads but require authentication + authorization on writes. I can let users log in with any of Live ID, Google or Yahoo (to minimize the chances of needing to sign up for a new account), and I can get the user’s email address, which I use to identify users internally, from the SWT token that comes back. Pretty cool stuff!
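
The read/write policy is simple enough to show as a sketch. Again, this is illustrative Python rather than my WCF code, and it assumes a few things: that the token has already been validated and parsed into a dictionary of claims, that the rule group passes an email address through under the standard email claim URI, and that “reads” means GET and HEAD. The authorize function and its name are hypothetical.

```python
# Claim type assumed to carry the user's email address in the token.
EMAIL_CLAIM = "http://schemas.xmlsoap.org/ws/2005/05/identity/claims/emailaddress"

# HTTP methods treated as reads; everything else is treated as a write.
READ_METHODS = {"GET", "HEAD"}


def authorize(method, claims):
    """Decide whether a request may proceed.

    `claims` is the dict from an already-validated token, or None when the
    request carried no token. Returns the caller's email (the internal user
    id), or None for an anonymous read. Raises PermissionError for a write
    that has no validated email claim.
    """
    email = claims.get(EMAIL_CLAIM) if claims else None
    if method.upper() in READ_METHODS:
        return email  # reads are open to everyone, signed in or not
    if not email:
        raise PermissionError("writes require a validated token with an email claim")
    return email
```

Keeping the decision in one small function means the service’s operations only ever see “here’s the user” or “anonymous”, and never have to look inside the token themselves.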

For this post, I’m going to write about infrastructure and its value (both positive and negative) in the lifecycle of developing software. My goal is to write this in a way that applies to virtually any project, large or small, across a diverse set of technologies, but ultimately I’ll be speaking from my most recent experiences, which are with reasonably large-scale (100s of developers) projects.

Since the term can easily mean different things to different people, I’ll start by defining it. Wikipedia is always a good place to start, so here’s their entry for the word (in the generic sense, not necessarily software specific):

Infrastructure is the basic physical and organizational structures needed for the operation of a society or enterprise,[1] or the services and facilities necessary for an economy to function.[2] The term typically refers to the technical structures that support a society, such as roads, water supply, sewers, electrical grids, telecommunications, and so forth. Viewed functionally, infrastructure facilitates the production of goods and services; for example, roads enable the transport of raw materials to a factory, and also for the distribution of finished products to markets and basic social services such as schools and hospitals.[3] In military parlance, the term refers to the buildings and permanent installations necessary for the support, redeployment, and operation of military forces.[4]

If we convert that to “software speak”, we can file at least the following as infrastructure (note: there may be more, I’m just going for the obvious ones here):

  • Build hardware and software – all the tools + scripts, and the hardware they run on, that are required to take your code and turn it into a deployable package (e.g. a binary for a phone app, or a collection of JavaScript and PHP files for a web app).
  • Deployment hardware and software – all the tools + scripts, and the hardware they run on, that are required to get your deployable package and, well, deploy it onto representative hardware.  Note that I’m being careful not to assume we’re just talking about web sites and web services here, deployments can go to phones or other non-server hardware too.
  • Test hardware and software – all the tools + scripts, and the machines they run on, that are required to test your deployed packages.
  • Reporting hardware and software – ultimately all of the above need a place to record results, both to allow for diagnostics and to detect trends over time. This all falls into your reporting infrastructure.
  • Source control and work item tracking hardware and software – projects of any reasonable complexity need a way to track bugs and/or tasks (many smaller teams track the latter in more Agile ways), as well as source code.

That looks like a good list to start with, so now on to the “value” part of the post. Why do software development projects need all this stuff? Why do we spend time and money on infrastructure when, in the abstract, it doesn’t add direct value to the end product? The value of infrastructure in software is just like the value of infrastructure in the broader sense (using the definition from Wikipedia above): it allows us to do our jobs, and poor infrastructure can often prevent us from doing them. I will readily admit that there’s a point of diminishing returns with infrastructure, but at least in my experience (which spans working at large and small companies) we too frequently forget how valuable good infrastructure is. For example, imagine the scenario where you have a team of, say, 6 developers, all working on a feature that involves:

  • Developing a database to store data (aka data tier)
  • Developing a service to retrieve the data over HTTP (aka services tier)
  • Developing a web page to render the data in a browser (aka client tier)

Sounds pretty simple, and it’s the kind of system that many people work on every day. A feature that involves enough complexity to require 6 developers will need some coordination and lots of testing, so we’ll need a way to coordinate the work across the team. This coordination needs to happen both inside a tier (so two or more devs can work on the data tier simultaneously) and across tiers (so two or more devs can work on client/services interactions). There’s more than one way to solve this problem, but typically you’d want some infrastructure (as defined above) to allow both the collaboration within a tier and integration across tiers. If everything (builds, testing, deployment) is working great, the devs can happily check in their code, wait a short period of time for it to build, wait another short period of time for it to pass testing, and then have it automatically deployed to a server so others can test and/or integrate with their changes. Since there are several stages in the infrastructure pipeline and we already know that context switches are expensive, it behooves the business to have that pipeline execute both expediently and reliably. If it’s slow, developers will go and do something else while they wait, and the cost of getting back into both the original task and the task they picked up in the meantime goes up the longer the pipeline takes to finish. If it’s unreliable, developers will have to spend time troubleshooting and fixing the infrastructure when they could be writing code that directly adds value to the business/feature.
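
To put a rough shape on that, here’s a deliberately crude back-of-the-envelope model in Python. Every number in it is made up for illustration (check-ins per day, pipeline duration, minutes lost to refocusing, hourly rate), and it assumes each check-in costs the developer the pipeline time plus a fixed refocus penalty, which overstates things whenever the wait is spent productively. It’s a sketch for comparing scenarios, not a measurement.

```python
def daily_wait_cost(developers, checkins_per_dev, pipeline_minutes,
                    refocus_minutes, hourly_rate):
    """Estimate the daily cost of time lost to the pipeline for a team.

    Each check-in is assumed to cost the pipeline duration plus a fixed
    context-switch penalty. All inputs are hypothetical.
    """
    lost_minutes = developers * checkins_per_dev * (pipeline_minutes + refocus_minutes)
    return lost_minutes / 60 * hourly_rate


# Hypothetical team of 6, 3 check-ins each per day, 15 minutes to refocus, rate of 75/hour.
fast = daily_wait_cost(6, 3, pipeline_minutes=10, refocus_minutes=15, hourly_rate=75)
slow = daily_wait_cost(6, 3, pipeline_minutes=45, refocus_minutes=15, hourly_rate=75)
print(fast, slow)  # 562.5 1350.0
```

Even with generous assumptions, the gap between the two scenarios recurs every working day, which is the part that tends to get forgotten when infrastructure work is being prioritized.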

Since most developers don’t work for free, what would you rather have your team doing: writing code for your feature, or waiting on slow infrastructure and troubleshooting unreliable infrastructure?