In the previous post in this series we explored the current privilege escalation model used in OpenStack Compute (Nova), and discussed its limitations. Now that we are able to plug an alternative model (thanks to the root_helper option), we’ll discuss in this post what features this one should have. If you think we need more, please comment !
The most significant issue with the current model is that sudoers filters the executable used, but not the arguments. To fix that, our alternative model should allow precise argument filtering so that only very specific commands are allowed. It should use lists of filters: if one matches, the command is executed.
The basic CommandFilter would just check that the executable name matches (which is what sudoers does). A more advanced RegexpFilter would check that the number of arguments is right and that they all match provided regular expressions.
Taking that concept a step further, you should be able to plug any type of advanced filter. You may want to check that the argument to the command is an existing directory. Or one that is owned by a specific user. The framework should allow developers to define their own CommandFilter subclasses, to be as precise as they want when filtering the most destructive commands.
In some cases, Nova runs, as root, commands that it should just run as a different user. For example, it runs kill with root rights to interact with dnsmasq processes (owned by the nobody user). It doesn’t really need to run kill with root rights at all. Filters should therefore also allow to specify a lower-privileged user a specific matching command should run under.
Shipping filters in Nova code
Filter lists should live within Nova code and be deployed by packaging, rather than live in packaging. That allows people adding a new escalated command to add the corresponding filter in the same commit.
Limiting commands based on deployed nodes
As mentioned in the previous post, nova-api nodes don’t actually need to run any command as root, but in the current model their nova user is still allowed to run plenty of them. The solution for that is to separate the command filters based on the type of node that is allowed to run them, in different files. Then deploy the nova-compute filters file only on nova-compute nodes, the nova-volume filters file only on nova-volume nodes… A pure nova-api node will end up with no filters being deployed at all, effectively not being allowed any command as root. So this can be solved by smart packaging of filter files.
Missing features ?
Those are the features that I found useful for our alternative privilege escalation model. If you see others, please comment here ! I’d like to make sure all the useful features are included. In the next post, we’ll discuss a proposed Python implementation of this framework, and the challenges around securing it.
In this series, I’ll discuss how to strengthen the privilege escalation model for OpenStack Compute (Nova). Due to the way networking, virtualization and volume management work, some Nova nodes need to be able to run some commands as root. To reduce the effects of a potential compromise (attacker being able to run arbitrary code as the Nova user), we want to limit the commands that Nova can run as root on a given node to the strict necessary. Today we’ll explain how the current model works, its limitations, and the groundwork already implemented during the Diablo cycle to improve that.
Current model: sudo and sudoers
Currently, in a typical Nova deployment, the nodes run under an account with limited rights (usually called “nova”). When Nova needs to run a command as root, it prepends “sudo” to the command. The nova packages of your distribution of choice are supposed to ship a sudoers file that contains all the commands that nova is allowed to run as root without providing a password. This is a privilege escalation security model which is pretty well-known and easy to audit.
Limitations of the current model
That said, in the context of Nova, this model is very limited. The sudoers file does not allow to efficiently filter arguments, so you can basically pass any argument to the allowed command… and some of the commands that nova wants to use are rather open-ended. As an example, the current nova_sudoers file contains commands like chown, kill, dd or tee, which are more than enough to compromise a target system completely.
There are a couple other limitations. The sudoers file belongs to the distributions packaging, so it’s difficult to keep it in sync with the rest of Nova code when someone wants to add a privileged command. Last but not least, the same nova_sudoers file is used for any type of Nova node. A Nova API server, which does not need to run any command as root, is still allowed to run all the commands that a compute node requires, for example. Those other limitations could be fixed while still using sudo and sudoers files, but the first limitation would remain. Can we do better ?
Substitute a wrapper to sudo
To be able to propose alternative privilege escalation security models, we first needed to be able to change all the “sudo” calls in the code and make them potentially use something else. That’s what I worked on late during the Diablo timeframe: creating a run_as_root option in nova.utils.execute that would use a configurable root_helper command (by default, “sudo”), and force all the existing calls to go through that (rather than blindly calling “sudo” themselves).
Thanks to the default root_helper, everything still behaves the same, but now we have the possibility to use something else, if we can be smarter than sudoers files. Like call a wrapper that will do advanced filtering of the command that nova wants to use. In part 2 of this series, we’ll look into a proposed, alternative Python-based root_helper and open discussion on its security model.
Last week saw the delivery of the first milestone of the Essex development cycle for Keystone, Glance, Horizon and Nova. This early milestone collected about two months of post-Diablo work… but it’s not as busy in new features as most would think, since a big part of those last two months was spent releasing OpenStack 2011.3 and brainstorming Essex features.
Keystone delivered their first milestone as a core project, with a few new features like support for additional credentials, service registration and using certificate-based SSL client authentication to authenticate services. It should be easier to upgrade from now on, with support for database migrations.
Glance developers were busy preparing significant changes that will land in the next milestone. Several bugfixes and a few features made it to essex-1 though, including the long-awaited SSL client connections. It also moved to UUID image identifiers.
The Nova essex-1 effort was mostly spent on bugfixing, with 129 bugs fixed. New features include a new XenAPI SM volume driver, DHCP support in the Quantum network manager, and optional deferred deletion of instances. Under the hood, the volume code was significantly cleaned up and XML templates were added to simplify serialization in extensions.
Essex-1 was also the first official OpenStack milestone for Horizon, also known as the Dashboard. New features include a instance details page, support for managing Nova volumes and a new extensible modular architecture. The rest of the effort was spent on catching up with the best of core projects in internationalization, developer documentation, and QA (frontend testing and JS unit tests).
Now, keep your seatbelt fastened, as we are one month away from essex-2, where lots of new development work is expected to land !
The 200 open seats for the Essex Design Summit were all registered in less than 9 days ! If you missed the boat, you can still register on the waiting list at http://summit.openstack.org.
For the last seats we need to give priority to existing OpenStack developers and upstream/downstream community members, so the waiting list will be reviewed manually. You will receive an email if you get cleared and get one of the very last seats for the summit.
Sometime next week, the website should allow registered attendees (as well as attendees on the waiting list) to propose sessions for the summit, so stay tuned !
August was very busy for OpenStack Nova and Glance developers, and the culmination of those efforts is the delivery of the final feature milestone of the Diablo development cycle: diablo-4.
Glance gained final integration with the Keystone common authentication system, support for sharing images between groups of tenants, a new notification system and i18n. Twelve feature blueprints were completed in Nova, including final Keystone integration, the long-awaited capacity to boot from volumes, a configuration drive to pass information to instances, integration points for Quantum, KVM block migration support, as well as several improvements to the OpenStack API.
Diablo-4 is mostly feature-complete: a few blueprints for standalone features were granted an exception and will land post-diablo-4, like volume types or virtual storage arrays in Nova, or like SSL support in Glance.
Now we race towards the release branch point (September 8th) which is when the Diablo release branch will start to diverge from a newly-open Essex development branch. The focus is on testing, bug fixing and consistency… up until September 22, the Diablo release day.
How to control what gets into your open source project code ? The classic model, inherited from pre-DVCS days, is to have a set of “committers” that are trusted with direct access while the vast majority of project “contributors” must kindly ask them to sponsor their patches. You can find that model in a lot of projects, including most Linux distributions. This model doesn’t scale that well — even trusted individuals are error-prone, nobody should escape peer review. But the main issue is the binary nature of the committer power: it divides your community (us vs. them) and does not really encourage contribution.
The solution to this is to implement a gated trunk with a code review system like GitHub pull requests or Launchpad branch merge proposals. Your “committers” become “core developers” that have a casting vote on whether the proposal should be merged. Everyone goes through the peer review process, and the peer review process is open for everyone: your “contributors” become “developers” that can comment too. You reduce the risk of human error and the community is much healthier, but some issues remain: your core developers can still (wittingly or unwittingly) evade peer review, and the final merge process is human and error-prone.
The solution is to add more automation, and not trust humans with direct repository access anymore. An “automated gated trunk” bot can watch for reviews and when a set of pre-defined rules are met (human approvals, testsuites passed, etc.), trigger the trunk merge automatically. This removes human error from the process, and effectively turns your “core developers” into “reviewers”. This last aspect makes for a very healthy development community: there is no elite group anymore, just a developer subgroup with additional review duties.
In OpenStack, we used Tarmac in conjunction with Launchpad/bzr code review in our first attempt to implement this. As we considered migration to git, the lack of support for tracking formal approvals in GitHub code review prevented the implementation of a complex automated gated trunk on top of GitHub, so we deployed Gerrit. I was a bit resisting the addition of a new tool to our toolset mix, but the incredible Monty Taylor and Jim Blair did a great integration job, and I realize now that this gives us a lot more flexibility and room for future evolution. For example I like that some tests can be run when the change is proposed, rather than only after the change is approved (which results in superfluous review roundtrips).
At the end of the day, gated trunk automation helps in having a welcoming, non-elitist (and lazy) developer community. I wish more projects, especially distributions, would adopt it.
No rest for the OpenStack developers, today saw the release of the July development efforts for Nova and Glance: the Diablo-3 milestone.
With a bit more than 100 trunk commits over the month, Nova gained support for multiple NICs, FlatDHCP network mode now support a high-availability option (read more about it here), instances can be migrated and system usage notifications were added to the notification framework. Network code was also refactored in order to facilitate integration with the new networking projects, and countless fixes were made in OpenStack API 1.1 support.
We have one more milestone left (diablo-4) before the final 2011.3 release… still a lot to do !
About a month ago I commented on the features delivered in the diablo-1 milestone. Last week we released the diablo-2 milestone for your testing and feature evaluation pleasure.
Most of the changes to Glance were made under the hood. In particular the new WSGI code from Nova was ported to Glance, and images collections can now be sorted by a subset of the image model attributes. Most of the groundwork to support Keystone authentication was done, but that should only be available in diablo-3 !
Those same initial Keystone integration steps were also done for Nova, along with plenty of other features. We now support distributed scheduling for complex deployments, together with a new instance referencing model. Was also added during this timeframe: support for floating IPs (in OpenStack API), a basic mechanism for pushing notifications out to interested parties, global firewall rules, and an instance type extra specs table that can be used in a capabilities-aware scheduler. More invisible to the user, we completed efforts to standardize error codes and refactored the OpenStack API serialization mechanism.
And there is plenty more coming up in diablo-3… scheduled for release on July 28th.
There was a bit of discussion lately on feature-based vs. time-based release schedules in OpenStack, and here are my thoughts on it. In a feature-based release cycle, you release when a given set of features is implemented, while in a time-based release cycle, you release at a predetermined date, with whatever is ready at the time.
Release early, release often
One the basic principles in open source (and agile) is to release early and release often. This allows fast iterations, which avoid the classic drawbacks of waterfall development. If you push that logic to the extreme, you can release at every commit: that is what continuous deployment is about. Continuous deployment is great for web services, where there is only one deployment of the software and it runs the latest version.
OpenStack projects actually provide builds (and packages) for every commit made to development trunk, but we don’t call them releases. For software that has multiple deployers, having “releases” that combine a reasonable amount of new features and bugfixes is more appropriate. Hence the temptation of doing feature-based releases: release often, whenever the next significant feature is ready.
Frequent feature-based releases
The main argument of supporters of frequent feature-based releases is that time-based cycles are too long, so they delay the time it takes for a given feature to be available to the public. But time-based isn’t about “a long time”. It’s about “a predetermined amount of time”. You can make that “predetermined amount of time” as small as needed…
Supporters of feature-based releases say that time-based releases are good for distributions, since those have limited bearing on the release cycles of their individual subcomponents. I’d argue that time-based releases are always better, for anyone that wants to do open development in a community.
Time-based releases as a community enabler
If you work with a developer community rather than with a single-company development group, the project doesn’t have full control over its developers, but just limited influence. Doing feature-based releases is therefore risky, since you have no idea how long it will take to have a given feature implemented. It’s better to have frequent time-based releases (or milestones), that regularly delivers to a wider audience what happens to be implemented at a given, predetermined date.
If you work with an open source community rather than with a single-company product team, you want to help the different separate stakeholders to synchronize. Pre-announced release dates allow everyone (developers, testers, distributions, users, marketers, press…) to be on the same line, following the same cadence, responding to the same rhythm. It might be convenient for developers to release “whenever it makes sense”, but the wider community benefits from having predictable release dates.
It’s no wonder that most large open source development communities switched from feature-based releases to time-based releases: it’s about the only way to “release early, release often” with a large community. And since we want the OpenStack community to be as open and as large as possible, we should definitely continue to do time-based releases, and to announce the schedule as early as we can.
There are multiple available delivery channels available to install OpenStack packages on Ubuntu.
First of all, starting with 11.04 (Natty), packages for OpenStack Nova, Swift and Glance are available directly in Ubuntu’s official universe repository. This contains the latest release available at the time of Ubuntu release: 2011.2 “Cactus” in 11.04. If you don’t use Ubuntu 11.04, or if you want a more recent version, you’ll need to enable one of our specific PPAs, please read on.
OpenStack release PPA
If you want to run 2011.2 on 10.04 LTS (Lucid) or 10.10 (Maverick) you can use the ppa:openstack-release/2011.2 PPA. Enabling it is as simple as running:
$ sudo apt-get install python-software-properties $ sudo add-apt-repository ppa:openstack-release/2011.2 $ sudo apt-get update
Starting with the Diablo cycle, we do a coordinated OpenStack release every 6 months. That said, Swift releases stable versions more often. You can get the latest Swift release through the ppa:swift-core/release PPA. Just replace the PPA name in the above example.
Also starting with the Diablo cycle, Nova and Glance deliver a development milestone every 4 weeks. If you want to test the latest features, you can enable those PPAs: ppa:nova-core/milestone or ppa:glance-core/milestone.
PPAs for testers and developers
Just before we deliver one of these intermediary releases or milestones, we use a specific “milestone-proposed” PPA to do final QA on release candidates. Enabling this one and reporting issues will help us in delivering high-quality milestones. Just enable ppa:nova-core/milestone-proposed, ppa:glance-core/milestone-proposed or ppa:swift-core/milestone-proposed.
Finally, for all projects and for every code commit, we generate a package in the trunk PPA. If you’re a developer, or like living on the bleeding edge, you should enable those: ppa:nova-core/trunk, ppa:glance-core/trunk or ppa:swift-core/trunk.
I hope this will help you select the best delivery channel for your use case, depending on whether you’re deploying, evaluating, helping in QA or actively developing. For future reference, the list of PPAs is maintained on the wiki.