The Umbraco Codebase: A Traveller's Guide
This article is several years old now, and much has happened in Umbraco land since then, so please keep that in mind while reading it.
"When are we off to the Holy Lands you ask? Oh no, it's not that trip - we leave for there in a couple of weeks. No, today we're venturing into the Umbraco code base! It might take us a bit of time but we're hoping for an epiphany at the end of it."
I was going to push that analogy further but to be honest it's already rather forced! However, that's the plan for today's page of the Umbraco advent calendar: to take a whistle-stop tour through the Umbraco code base, which will hopefully be useful for anyone considering contributing code to the project, or simply wanting to understand a little more about what goes on under the covers.
Umbraco has always been an open-source project and - particularly since the code hosting moved to GitHub - has had a number of people in addition to the core team contributing everything from minor fixes to sizeable features.
Why are they doing this? Well, some might be doing it to "scratch an itch" and fix something that is troubling them or their clients. Others see it as one way to contribute to the success of the project that we all benefit from in different ways. When it comes to more direct benefits, there's a lot to learn from seeing how a project of this size is architected and managed. And last but not least - we're all geeks here of course - it can be rather fun. For most, I guess it's probably all of the above.
Can I contribute?
There's no doubt that seriously smart people are working with or as part of the HQ on the development of Umbraco, and some of the major changes that have been made, and continue to be made, require long-term planning from a tightly organised team. But that still leaves plenty of scope for others to contribute. The open issue tracker contains many smaller-scale feature requests and bug reports that the team would welcome solutions to.
I'd suggest that any competent developer willing to take a bit of time to get stuck into an issue will make valuable progress - whether alone, or in collaboration with others such as at the recent UK Festival hack day.
So the answer is almost certainly yes.
The first step is to register or log in to a GitHub account, find the Umbraco page and click to fork the repository. This gives you your own copy to work in and, when ready, submit a pull request back to the core project.
As the name suggests, GitHub supports the Git distributed source control system, which you can use either from the command line - there's only a small subset of instructions you generally need - or through an application with a GUI wrapper such as Git Extensions.
There are excellent guidelines provided by GitHub and Umbraco themselves on the process of forking, creating pull requests and keeping your working fork up to date with changes made by others.
Once submitted, your pull request will be reviewed by a member of the core team, who will either accept it (as is or with modifications) or provide feedback.
Cracking open the code
Even with the repository forked, a local copy cloned and the encouraging words above in mind, it can be a little daunting when you first open the solution (found in src\umbraco.sln) in Visual Studio.
There are quite a number of projects, many of which are retained for legacy and backward compatibility reasons rather than undergoing active development. Down the line - perhaps at Umbraco 8 or 9 - many of these will likely be removed. But clearly this has to be done with careful consideration, giving developers using the CMS plenty of time to become aware that methods they are using are obsolete and will at some point be retired.
The Umbraco development guidelines give a good summary of the important projects for ongoing development, and a discussion in the developer group provides some further detail on the legacy ones.
The back-office client code lives in the Umbraco.Web.UI.Client project. Umbraco makes use of many of the features available in the AngularJS framework, but even if you aren't particularly up to speed on that it's fairly straightforward to see what's going on. Starting from what is rendered on the screen, each component is made up of a view (the template) and a controller (the logic behind the view). So taking the shiniest new example, the Grid editor has a grid.html file associated with a grid.controller.js (found in src\views\propertyeditors\grid\).
If you open up this or any other of the controllers, you'll see it's defined as a function that is called with various parameters. The objects passed as parameters are all used by the controller in various ways, and are provided to it by the AngularJS framework by means of dependency injection.
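If dependency injection is new to you, here's the idea in miniature. This is a rough TypeScript sketch rather than anything from the Umbraco or AngularJS source: the Injector class, greetingService and demoController are all names I've made up, and the real AngularJS injector does a good deal more.

```typescript
// A toy dependency injector, loosely modelled on the way AngularJS hands
// controllers their dependencies by name. Every name here is hypothetical.
type Factory = (...deps: unknown[]) => unknown;

interface Registration {
  dependsOn: string[];
  factory: Factory;
}

class Injector {
  private registrations = new Map<string, Registration>();
  private instances = new Map<string, unknown>();

  // Register a component together with the names of the things it needs.
  register(name: string, dependsOn: string[], factory: Factory): void {
    this.registrations.set(name, { dependsOn, factory });
  }

  // Build (or reuse) a component, resolving its dependencies first.
  // Circular dependencies are not detected in this sketch.
  resolve(name: string): unknown {
    if (this.instances.has(name)) {
      return this.instances.get(name);
    }
    const registration = this.registrations.get(name);
    if (registration === undefined) {
      throw new Error(`Nothing registered under "${name}"`);
    }
    const deps = registration.dependsOn.map((dep) => this.resolve(dep));
    const instance = registration.factory(...deps);
    this.instances.set(name, instance);
    return instance;
  }
}

interface GreetingService {
  greet(name: string): string;
}

const injector = new Injector();

// A service with no dependencies of its own.
injector.register("greetingService", [], (): GreetingService => ({
  greet: (name) => `Hello, ${name}`,
}));

// The controller never constructs the service; it simply asks for it by name.
injector.register("demoController", ["greetingService"], (service) => {
  const greetingService = service as GreetingService;
  return { title: greetingService.greet("Umbraco") };
});

const demoController = injector.resolve("demoController") as { title: string };
```

The point to take away is that the controller declares what it needs and something else supplies it, which is also what makes it easy to swap in a fake when testing.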
These dependencies are of various types. One example is a resource - an encapsulation of a particular object with methods to allow it to be created and saved via HTTP requests. The grid controller for example has a dependency on a media resource defined in media.resource.js (in src\common\resources\).
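To show the shape of a resource, here's a small TypeScript sketch. It's an illustration under my own assumptions rather than the contents of media.resource.js: the HttpClient interface, the createMediaResource function and the /api/media URLs are all invented for the example.

```typescript
// A sketch of the "resource" idea: one object that hides the HTTP calls
// for a single kind of entity. The interface, URLs and field names are
// invented for illustration and are not Umbraco's real endpoints.
interface MediaItem {
  id: number;
  name: string;
}

// The transport is passed in, so a fake can stand in for real HTTP.
interface HttpClient {
  get(url: string): Promise<unknown>;
  post(url: string, body: unknown): Promise<unknown>;
}

function createMediaResource(http: HttpClient) {
  return {
    getById(id: number): Promise<MediaItem> {
      return http.get(`/api/media/${id}`) as Promise<MediaItem>;
    },
    save(item: MediaItem): Promise<MediaItem> {
      return http.post("/api/media", item) as Promise<MediaItem>;
    },
  };
}

// An in-memory fake that records requests instead of touching the network.
const requestLog: string[] = [];
const fakeHttp: HttpClient = {
  get(url) {
    requestLog.push(`GET ${url}`);
    return Promise.resolve({ id: 1234, name: "Holiday photo" });
  },
  post(url, body) {
    requestLog.push(`POST ${url}`);
    return Promise.resolve(body);
  },
};

const mediaResource = createMediaResource(fakeHttp);

// A controller would call the resource and wait on the returned promise.
const pendingItem: Promise<MediaItem> = mediaResource.getById(1234);
```

A controller using this never needs to know which URL is being called or how the request is made; it just asks the resource for a media item.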
Other dependencies may be services. These provide encapsulations of logic that are used across many controllers. An example here is the dialog service defined in dialog.service.js (in src\common\services\) which provides common functionality for handling various modal and overlay dialog boxes.
The views themselves will also reference another class of component: directives. These again encapsulate functionality that is shared across multiple UI elements (and are found in src\common\directives\).
The Umbraco.Web.UI project will be most familiar to Umbraco developers as it's the main web application that actually runs the software. It's where your custom views, master pages or XSLT files are created, and it also contains the files related to the back-office itself.
Digging into the umbraco folder you'll come across a few things that you might have thought had been left behind - web forms and user controls? Well yes, it's not all shiny AngularJS and MVC yet! Although much of the back-office has been rebuilt to use these technologies - particularly the content, media and member sections - there are still some parts of the software running on legacy web forms technology.
This is seamless to the end user due to some clever work that allows the re-use of these pages and controls within the AngularJS single page application implementation of the back-office - and effectively allowed Umbraco 7 to be released sooner than it would otherwise have been. No doubt in time though these will be obsoleted and the rest of the back-office rewritten to use the newer and more consistent technologies.
The Umbraco.Web project contains all the Umbraco C# class files related to the web application, such as controllers, models and web APIs. You'll also find here the various classes that provide flexibility in the request pipeline. So there's a lot in here in other words, but let's trace one small route through this layer of the solution.
Going back to the AngularJS resources, I mentioned that they are able to retrieve and save details of a particular type using AJAX-based HTTP requests. So for example content.resource.js contains a method called getById() that retrieves a specific content node for a given Id. This translates to a call to a controller method in ContentController (found in the Editors folder).
In turn this calls a method in a service class in the core project, which will look rather familiar to many. It's exactly the same ContentService that's publicly available and used when we are working with the database in our own Umbraco applications. This is one reason why the service APIs are so reliable and well thought out - they've been dogfooded in a sense, as they are used throughout the Umbraco back-office itself.
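The hand-off from controller to service is easier to picture with a little code. The real classes are C#, so treat this TypeScript version as a simplified sketch: the names echo the ones above, but the ApiResponse shape, the status codes and the in-memory data are my own assumptions.

```typescript
// Sketch of an editor API controller delegating to a shared service.
// Names echo the article (ContentController, ContentService, getById) but
// the shapes are simplified guesses, not Umbraco's actual C# signatures.
interface Content {
  id: number;
  name: string;
}

interface ContentService {
  getById(id: number): Content | undefined;
}

interface ApiResponse {
  status: number;
  body?: Content;
}

class ContentController {
  private readonly contentService: ContentService;

  constructor(contentService: ContentService) {
    this.contentService = contentService;
  }

  // Called when the back-office asks for a node by Id.
  getById(id: number): ApiResponse {
    const content = this.contentService.getById(id);
    return content === undefined
      ? { status: 404 }
      : { status: 200, body: content };
  }
}

// A stand-in service backed by an in-memory lookup table.
const sampleContent = new Map<number, Content>([
  [1050, { id: 1050, name: "Home" }],
]);
const inMemoryService: ContentService = {
  getById: (id) => sampleContent.get(id),
};

const controller = new ContentController(inMemoryService);
const found = controller.getById(1050);
const missing = controller.getById(9999);
```

The controller stays thin: it translates between HTTP and the service, and all the real work happens in a class that your own code can call just as easily.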
The Umbraco.Core project contains some fundamental features of the content management system, in particular those that don't rely on being used in a web application context. The services mentioned previously are defined here, as are the classes involved with persisting data to the database. We can also find code involved in the saving and retrieving of files as well as various common helper classes used throughout the application.
Following our example further, the ContentService (found in the Services folder) also has a GetById() method, which creates an instance of the ContentRepository (found in Persistence\Repositories). In turn, the Get() method of the repository is called within the context of a unit of work, passing through the Id of the node we want to retrieve content for.
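Roughly speaking, that service-to-repository step looks like the sketch below. Again this is TypeScript standing in for the real C#, and the bodies are my own simplification: the UnitOfWork here only tracks whether it has been disposed, and the "database" is an in-memory map.

```typescript
// Sketch of a service fetching through a repository inside a unit of work.
// The class names follow the article; the bodies are simplified assumptions.
interface Content {
  id: number;
  name: string;
}

// Tracks the lifetime of one logical database operation.
class UnitOfWork {
  disposed = false;

  dispose(): void {
    this.disposed = true;
  }
}

class ContentRepository {
  private readonly unitOfWork: UnitOfWork;
  private readonly rows: Map<number, Content>;

  constructor(unitOfWork: UnitOfWork, rows: Map<number, Content>) {
    this.unitOfWork = unitOfWork;
    this.rows = rows;
  }

  // Refuses to read once the surrounding unit of work has finished.
  get(id: number): Content | undefined {
    if (this.unitOfWork.disposed) {
      throw new Error("The unit of work has already been disposed");
    }
    return this.rows.get(id);
  }
}

class ContentService {
  private readonly rows: Map<number, Content>;

  constructor(rows: Map<number, Content>) {
    this.rows = rows;
  }

  // Creates a repository for the duration of a single unit of work.
  getById(id: number): Content | undefined {
    const unitOfWork = new UnitOfWork();
    try {
      const repository = new ContentRepository(unitOfWork, this.rows);
      return repository.get(id);
    } finally {
      unitOfWork.dispose();
    }
  }
}

const storedRows = new Map<number, Content>([
  [1050, { id: 1050, name: "Home" }],
]);
const contentService = new ContentService(storedRows);
const home = contentService.getById(1050);
```

The service decides when a piece of work starts and ends, while the repository only knows how to fetch and store one type of entity.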
Tracing further we get to the actual data access methods themselves, which utilise PetaPoco, a lightweight object-relational mapper (ORM) that is used to generate the necessary SQL statements to pull back the required information from the database and instantiate an object of the type we want to return.
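The two jobs an ORM does here, building the SQL and turning a row into an object, can be sketched in a few lines. To be clear, this is not PetaPoco's API, and the contentNode table and its column names are made up for the example.

```typescript
// A hand-rolled illustration of what a micro-ORM does. This is not
// PetaPoco's API, and the table and column names are hypothetical.
interface ColumnMapping {
  property: string; // name of the property on the object
  column: string; // name of the column that stores it
}

interface TableMapping {
  table: string;
  keyColumn: string;
  columns: ColumnMapping[];
}

interface Query {
  sql: string;
  args: unknown[];
}

// Step 1: generate a parameterised SELECT from the mapping.
function buildSelectByKey(mapping: TableMapping, key: unknown): Query {
  const columnList = mapping.columns.map((c) => c.column).join(", ");
  return {
    sql: `SELECT ${columnList} FROM ${mapping.table} WHERE ${mapping.keyColumn} = @0`,
    args: [key],
  };
}

// Step 2: copy each column of a returned row onto the matching property.
function mapRow<T>(mapping: TableMapping, row: Record<string, unknown>): T {
  const result: Record<string, unknown> = {};
  for (const { property, column } of mapping.columns) {
    result[property] = row[column];
  }
  return result as T;
}

interface Content {
  id: number;
  name: string;
}

const contentMapping: TableMapping = {
  table: "contentNode",
  keyColumn: "nodeId",
  columns: [
    { property: "id", column: "nodeId" },
    { property: "name", column: "nodeText" },
  ],
};

const query = buildSelectByKey(contentMapping, 1050);

// Pretend the database returned this row for the query above.
const mapped = mapRow<Content>(contentMapping, { nodeId: 1050, nodeText: "Home" });
```

A real ORM works the mapping out from the class definitions and handles types, joins and caching too, but the underlying idea is the same.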
Last but not least, we come to the Umbraco.Tests project. As the name implies, this contains unit and integration tests providing coverage for many of the methods in the Umbraco.Core and Umbraco.Web projects.
The tests utilise NUnit and mostly inherit from some common base classes that instantiate the contexts and dependencies the tests require.
Rounding off our example, we can find tests that check the service and repository methods for retrieving a content instance by its Id, in ContentServiceTests and ContentRepositoryTest respectively.
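The base class arrangement is worth a quick illustration too. The sketch below is TypeScript with a hand-rolled runner rather than NUnit, and FakeDatabaseContext, BaseDatabaseTest and ContentLookupTests are hypothetical names, not classes you'll find in Umbraco.Tests.

```typescript
// Sketch of tests sharing set-up through a common base class.
// All class names are hypothetical stand-ins, not Umbraco's own.
interface Content {
  id: number;
  name: string;
}

class FakeDatabaseContext {
  readonly rows = new Map<number, Content>();
}

// Shared base class: builds the context every derived test relies on.
abstract class BaseDatabaseTest {
  protected context: FakeDatabaseContext = new FakeDatabaseContext();

  // Runs before each test so that no state leaks between them.
  setUp(): void {
    this.context = new FakeDatabaseContext();
    this.context.rows.set(1050, { id: 1050, name: "Home" });
  }
}

class ContentLookupTests extends BaseDatabaseTest {
  canGetContentById(): boolean {
    const content = this.context.rows.get(1050);
    return content !== undefined && content.name === "Home";
  }

  returnsNothingForUnknownId(): boolean {
    return this.context.rows.get(9999) === undefined;
  }
}

// A minimal runner standing in for what a test framework does for you.
const fixture = new ContentLookupTests();
const cases: Array<[string, () => boolean]> = [
  ["canGetContentById", () => fixture.canGetContentById()],
  ["returnsNothingForUnknownId", () => fixture.returnsNothingForUnknownId()],
];
const results = cases.map(([name, run]) => {
  fixture.setUp();
  return `${name}: ${run() ? "pass" : "fail"}`;
});
```

Keeping the set-up in one place means each test class only has to describe the behaviour it cares about.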
Well, we got there. I hope that was a useful, albeit very top-line, introduction to some of the highways and by-ways of the Umbraco code base. Who knows, maybe it'll even inspire a new year's resolution or two and there'll be even more pull requests coming in 2015!
Andy is on Twitter as @andybutland