The OpenStack Grizzly release of yesterday officially closes the Grizzly development cycle. But while I try to celebrate and relax, I can’t help feeling worried and depressed in the hours following the release, as we discover bugs that we could have (should have ?) caught before release. It’s a kind of postpartum depression for release managers; please consider this post as part of my therapy.
We’d naturally like to release when the software is “ready”, “good”, or “bug-free”. Reality is, with software of the complexity of OpenStack, onto which we constantly add new features, there will always be bugs. So, rather than releasing when the software is bug-free, we “release” when waiting more would not really change the quality of the result. We release when it’s time.
In OpenStack, we invest a lot in automated testing, and each proposed commit goes through an extensive set of unit and integration tests. But with so many combinations of deployment options, there are still dark corners that will only be explored by users as they apply the new code to their specific use case. We encourage users to try new code before release, by publishing and making noise about milestones, release candidates… But there will always be a significant number of users who will not try new code until the point in time we call “release”. So there will always be significant bugs that are discovered (and fixed) after release day.
The best point in time
What we need to do is pick the right moment to “release”: when all known release-critical issues are fixed. When the benefits of waiting more are not worth the drawbacks of distracting developers from working on the next development cycle, or of abandoning the benefits of a predictable time-based common release.
That’s the role of the Release Candidates that we produce in the weeks before the release day. Once we have fixed all known release-critical bugs, we create an RC. If we find new ones before the release day, we fix them and generate a new release candidate. On release day, we consider the current release candidates as “final” and publish them.
The trick, then, is to pick the right length for this feature-frozen period leading to release, one that gives enough time for each of the projects in OpenStack to reach its first release candidate (meaning, “all known release-critical bugs fixed”), and publish this RC1 to early testers. For Grizzly, it looked like this:
This graph shows the number of release-critical bugs in various projects over time. We can see that the length of the pre-release period is about right: waiting more would not have resulted in a lot more bugs to be fixed. We basically needed to release to get more users to test and report the next bugs.
The Grizzly is still alive
The other thing we need to have is a process to continue to fix bugs after the “release”. We document the most obvious regressions in the constantly-updated Release Notes. And we handle the Grizzly bugs using the stable release update process.
After release, we maintain a branch where important bugfixes are backported and from which we’ll publish point releases. This stable/grizzly branch is maintained by the OpenStack stable maintenance team. If you see a bugfix that should definitely be backported, you can tag the corresponding bug in Launchpad with the grizzly-backport-potential tag to bring it to the team’s attention. For more information on the stable branches, I invite you to read this wiki page.
Being pumped up again
The post-release depression usually lasts a few days, until I realize that not so many bugs were reported. The quality of the new release is actually always an order of magnitude better than the previous releases, due to 6 months’ worth of improvements in our amazing continuous integration system ! We actually did an incredible job, and it will only get better !
The final stage of recovery is when our fantastic community gets all together at the OpenStack Summit. 4 days to witness and celebrate our success. 4 days to recharge the motivation batteries, brainstorm and discuss what we’ll do over the next 6 months. We are living awesome times. See you there.
Back from an (almost) entirely offline week of vacation, I found a lot of news waiting for me. A full book was written. OpenStack projects graduated. An Ubuntu rolling release model was considered. But what grabbed my attention was the announcement of UDS moving to a virtual event. And every 3 months. And over two days. And next week.
As someone who attended all UDSes (but one) since Prague in May 2008, as a Canonical employee then as an upstream developer, that was quite a shock. We all have fond memories and anecdotes of stuff that happened during those Ubuntu developer summits.
What those summits do
For those who never attended one, UDS (and the OpenStack Design Summits that were modeled after them) achieve a lot of goals for a community of open source developers:
1. Celebrate the recent release and motivate your whole developer community for the next 6 months
2. Brainstorm early ideas on complex topics, identify key stakeholders to include in further design discussion
3. Present an implementation plan for a proposed feature and get feedback from the rest of the community before starting to work on it
4. Reduce duplication of effort by getting everyone working on the same type of issues in the same room and around the same beers for a few days
5. Meet in informal settings people you usually only interact with online, to get to know them and reduce friction that can build up after too many heated threads
This all sounds very valuable. So why did Canonical decide to drop UDSes as we knew them, while they were arguably part of their successful community development model ?
Who killed UDS
The reason is that UDS is a very costly event, and it was becoming more and more useless. A lot of Ubuntu development happens within Canonical these days, and UDS sessions gradually shifted from being brainstorming sessions between equal community members to being a formal communication of upcoming features/plans to gather immediate feedback (point 3 above). There were not so many brainstorming design sessions anymore (point 2 above, very difficult to do in a virtual setting), with design happening more and more behind Canonical curtains. There is less need to reduce duplication of effort (point 4 above), with fewer non-Canonical people starting to implement new things.
Therefore it makes sense to replace it with a less-costly, purely-virtual communication exercise that still perfectly fills point 3, with the added benefits of running it more often (updating everyone else on status more often), and improving accessibility for remote participants. If you add to the mix a move to rolling releases, it almost makes perfect sense. The problem is, they also get rid of points 1 and 5. This will result in an even less motivated developer community, with more tension between Canonical employees and non-Canonical community members.
I’m not convinced that’s the right move. I for one will certainly regret them. But I think I understand the move in light of Canonical’s recent strategy.
What about OpenStack Design Summits ?
Some people have been asking me if OpenStack should move to a similar model. My answer is definitely not.
When Rick Clark imported the UDS model from Ubuntu to OpenStack, it was to fulfill one of the 4 Opens we pledged: Open Design. In OpenStack Design Summits, we openly debate how features should be designed, and empower the developers in the room to make those design decisions. Point 2 above is therefore essential. In OpenStack we also have a lot of different development groups working in parallel, and making sure we don’t duplicate effort is key to limit friction and make the best use of our resources. So we can’t just pass on point 4. With more than 200 different developers authoring changes every month, the OpenStack development community is way past Dunbar’s number. Thread after thread, some resentment can build up over time between opposed developers. Get them to informally talk in person over a coffee or a beer, and most issues will be settled. Point 5 therefore lets us keep a healthy developer community. And finally, with about 20k changes committed per year, OpenStack developers are pretty busy. Having a week to celebrate and recharge motivation batteries every 6 months doesn’t sound like a bad thing. So we’d like to keep point 1.
So for OpenStack it definitely makes sense to keep our Design Summits the way they are. Running them as a track within the OpenStack Summit allows us to fund them, since there is so much momentum around OpenStack and so many people interested in attending those. We need to keep improving the remote participation options to include developers that unfortunately cannot join us. We need to keep doing it in different locations over the world to foster local participation. But meeting in person every 6 months is an integral part of our success, and we’ll keep doing it.
Next stop is in Portland, from April 15 to April 18. Join us !
In 3 weeks, free and open source software developers will converge on Brussels for 2+ days of talks, discussions and beer. FOSDEM is still the largest gathering for our community in Europe, and it will be a pleasure to meet again with longtime friends. Note that FOSDEM attendance is free as in beer, and requires no registration.
OpenStack will be present with a number of talks in the Cloud devroom in the Chavanne auditorium on Sunday, February 3rd:
- At 9:30, I’ll open the devroom with State of the OpenStack Union, 2013. A talk about what happened in the OpenStack development community since last year’s presentation at FOSDEM.
- At 10:00, don’t miss Mark McLoughlin’s talk: OpenStack: 21st Century App Architecture and Cloud Operations. He will explain how OpenStack is built with the same resilience and automation principles as highly-scalable cloud applications.
- At 15:00, Rob Clark will detail his Security Priorities for Cloud Developers: the main security challenges OpenStack faces and what we should do about them.
- At 15:30, Tomas Sedovic will introduce Orchestrating complex deployments on OpenStack using Heat. The Heat project is in OpenStack incubation currently so this is a great opportunity to learn more about it.
- Finally to close the day at 16:30, Nick Barcet, Eoghan Glynn and Julien Danjou will storm the stage and introduce the other OpenStack project currently in incubation: Measuring OpenStack: the Ceilometer Project.
There will also be OpenStack mentions in various other talks during the day: Martyn Taylor should demonstrate OpenStack Horizon in conjunction with Aeolus Image Factory at 13:30, and Vangelis Koukis will present Synnefo, which provides OpenStack APIs, at 14:00.
Finally, I’ll also be giving a talk, aimed at Python developers, about the OpenStack job market sometime on Sunday in the Python devroom (room K.3.401): Get a Python job, work on OpenStack.
I hope you will join us in the hopefully-not-dead-frozen-this-time and beautiful Brussels !
The first milestone of the OpenStack Grizzly development cycle is just out. What should you expect from it ? What significant new features were added ?
The first milestones in our 6-month development cycles are traditionally not very featureful. That’s because we are just out of the previous release, and still working heavily on bugs (this milestone packs 399 bugfixes !). It’s been only one month since we had our Design Summit, so by the time we formalize its outcome into blueprints and roadmaps, we are just getting started with feature implementation. Nevertheless, it collects a lot of new features and bugfixes that landed in our master branches since mid-September, when we froze features in preparation for the Folsom release.
Keystone is arguably where the most significant changes landed, with a tech preview of the new API version (v3), with policy and RBAC access enabled. A new ActiveDirectory/LDAP identity backend was also introduced, while the auth_token middleware is now shipped with the Python Keystone client.
In addition to fixing 185 bugs, the Nova crew removed nova-volume code entirely (code was kept in Folsom for compatibility reasons, but was marked deprecated). Virtualization drivers no longer directly access the database, as a first step towards completely isolating compute nodes from the database. Snapshots are now supported on raw and LVM disks, in addition to qcow2. On the hypervisors side, the Hyper-V driver grew ConfigDrive v2 support, while the XenServer one can now use BitTorrent as an image delivery mechanism.
The Glance client is no longer copied within Glance server (you can still find it with the Python client library), and the Glance SimpleDB driver reaches feature parity with the SQLAlchemy based one. A number of cleanups were implemented in Cinder, including in volume drivers code layout and API versioning handling. Support for XenAPI storage manager for NFS is back, while the API grew a call to list bootable volumes and a hosts extension to allow service status querying.
The Quantum crew was also quite busy. The Ryu plugin was updated and now features tunnel support. The preparatory work to add advanced services landed, as well as support for highly-available RabbitMQ queues. The feature parity gap with nova-network was reduced by the introduction of a Security Groups API.
Horizon saw a lot of changes under the hood, including unified configuration. It now supports Nova flavor extra specs. As a first step towards providing cloud admins with more targeted information, a system info panel was added. Oslo (formerly known as openstack-common) also saw a number of improvements. The config module (cfg) was ported to argparse. Common service management code was pushed to the Oslo incubator, as well as a generic policy engine.
That’s only a fraction of what will appear in the final release of Grizzly, scheduled for April 2013. A lot of work was started in the last weeks but will only land in the next milestone. To get a glimpse of what’s coming up, you can follow the Grizzly release status page !
As comparing OpenStack with Linux becomes an increasingly popular exercise, it’s only natural that people and press articles start to ask where the Linus of OpenStack is, or who the Linus of OpenStack should be. This assumes that technical leaders could somehow be appointed in OpenStack. This assumes that the single dictator model is somehow reproducible or even desirable. And this assumes that the current technical leadership in OpenStack is somehow lacking. I think all three of those assumptions are wrong.
Like Linux, OpenStack is an Open Innovation project: an independent, common technical playground that is not owned by a single company and where contributors form a meritocracy. Assuming you can somehow appoint leaders in such a setting shows a great ignorance of how those projects actually work. Leaders in an open innovation project don’t derive their authority from their title. They derive their authority from the respect that the other contributors have for them. If they lose this respect, their leadership will be disputed and you’ll face the risk of a fork. Project leaders are not appointed, they are grown. Linus wasn’t appointed, and he didn’t decide one day that he should lead Linux. He grew as the natural leader for this community over time.
Maybe people asking for a Linus of OpenStack like the idea of a single dictator sitting at the top. But that setup is not easily reproduced. Three conditions need to be met: you have to be the founder (or first developer) of the project, your project has to grow sufficiently slowly so that you can gather the undisputed respect of incoming new contributors, and you have to keep your hands deep in technical matters over time (to retain that respect). Linus checked all those boxes. In OpenStack, there were a number of developers involved in it from the start, and the project grew really fast, so a group of leaders emerged, rather than a single undisputed figure.
I’d also argue that the “single leader” model is not really desirable. OpenStack is not a single project, it’s a collection of projects. It’s difficult to find a respected expert in all areas, especially as we grew by including new projects within the OpenStack collection. In addition to that, Linux as a project still struggles with its bus factor of 1 and how it would survive Linus. Organizing your technical leadership in a way that makes it easier for leadership to transition to new figures makes a stronger and more durable community.
Finally, asking for a Linus of OpenStack somehow implies that the current technical leadership is insufficient, which is at best ignorant, at worst insulting. Linus fills two roles within Linux: the technical lead role (final decision on technical matters, the buck stops here) and the release management role (coordinating the release development cycles and producing releases). OpenStack has project technical leads (“PTLs”) to fill the first role, and a (separate) release manager to fill the second. In addition to that, to solve cross-project issues, we have a Technical Committee (which happens to include the PTLs and release manager).
If you are under the impression that this multi-headed technical leadership might result in non-opinionated choices, think again. The new governance model establishing the Technical Committee and its full authority over all technical matters in OpenStack is only a month old; previously the project (and its governance model) was still owned by a single company. The PTLs and Technical Committee members are highly independent and have the interests of the OpenStack project as their top priority. Most of them actually changed employers over the last year and continued to work on the project.
I think what the press and the pundits actually want is a more visible public figure, that would make stronger design choices, if possible with nice punch lines that would make good quotes. It’s true that the explosive growth of the project did not leave a lot of time so far for technical leaders of OpenStack to engage with the press. It’s true that the OpenStack leadership tends to use friendly words and prefer consensus where possible, which may not result in memorable quotes. But confusing that with weakness is really a mistake. Technical leadership in OpenStack is just fine the way it is, thank you for asking.
Mark’s recent blogpost on Raring community skunkworks got me thinking. I agree it would be unfair to spin this story as Canonical/Ubuntu switching to closed development. I also agree that (as the damage control messaging was quick to point out) inviting some members of the community to participate in closed development projects is actually a step towards more openness rather than a step backwards.
That said, it certainly is making the “closed development” option more official and organized, which is not a step in the right direction in my opinion. It reinforces it as a perfectly valid option, while I would really like it to be an exception for corner cases. So at this point, it may be useful to insist a bit on the benefits of open development, and why dropping them might not be that good of an idea.
Open Development is a transparent way of developing software, where source code, bugs, patches, code reviews, design discussions and meetings happen in the open and are accessible by everyone. “Open Source” is a prerequisite of open development, but you can certainly do open source without doing open development: that’s what I call the Android model and what others call the “Open behind walls” model. You can go further than open development by also doing “Open Design”: letting an open community of equals discuss and define the future features your project will implement, rather than restricting that privilege to a closed group of “core developers”.
Open Development allows you to “release early, release often” and get the testing, QA and feedback of (all) your users. This is actually a good thing, not a bad thing. That feedback will help you catch corner cases, consider issues that you didn’t predict, and get outside patches. More importantly, Open Development helps lower the barrier of entry for contributors to your project. It blurs the line between consumers and producers of the software (no more “us vs. them” mentality), resulting in a much more engaged community. Inviting select individuals to have early access to features before they are unveiled sounds more like a proprietary model beta testing program to me. It won’t give you the amount of direct feedback and variety of contributors that open development gives you. Is the trade-off worth it ?
As much as I dislike the Android model, I understand that the ability for Google to give some select OEMs a head start has some value. Reading Mark’s post though, it seems that the main benefits for Ubuntu are in avoiding early exposure of immature code and getting more of a PR splash at release time. I personally think that short-term, the drop in QA due to reduced feedback will offset those benefits, and long-term, the resulting drop in community engagement will also make this a bad trade-off.
In OpenStack, we founded the project on the Four Opens: Open Source, Open Development, Open Design and Open Community. This early decision is what made OpenStack so successful as a community, not the “cloud” hype. Open Development made us very friendly to new developers wanting to participate, and once they experienced Open Design (as exemplified in our Design Summits) they were sold and turned into advocates of our model and our project within their employing companies. Open Development was really instrumental to OpenStack growth and adoption.
In summary, I think Open Development is good because you end up producing better software with a larger and more engaged community of contributors, and if you want to drop that advantage, you’d better have a very good reason.
Next week our community will gather in always-sunny San Diego for the OpenStack Summit. Our usual Design Summit is now a part of the general event: the Grizzly Design Summit sessions will run over the 4 days of the event ! We start Monday at 9am and finish Thursday at 5:40pm. The schedule is now up at:
This link will only show you the Design Summit sessions. Click here for the complete schedule. Minor scheduling changes may still happen over the next days as people realize they are double-booked, but otherwise it’s pretty final now.
For newcomers, please note that the Design Summit is different from classic conferences or the other tracks of the OpenStack Summit. There are no formal presentations or speakers. The sessions at the Design Summit are open discussions between contributors on a specific development topic for the upcoming development cycle, moderated by a session lead. It is possible to prepare a few slides to introduce the current status and kick-off the discussion, but these should never be formal speaker-to-audience presentations.
I’ll be running the Process topic, which covers the development process and core infrastructure discussions. It runs Wednesday afternoon and all Thursday, and we have a pretty awesome set of stuff to discuss. Hope to see you there!
If you want to talk about something that is not covered elsewhere in the Summit, please note that we’ll have an Unconference room, open from Tuesday to Thursday. You can grab a 40-min slot there to present anything related to OpenStack! In addition to that, we’ll also have 5-min Lightning talks after lunch on Monday-Wednesday… where you can talk about anything you want. There will be a board posted on the Summit floor: first come, first served.
More details about the Grizzly Design Summit can be found on the wiki. See you all soon!
It’s been a long time since my last blog post… I guess that cycle was busier for me than I expected, due to my involvement in the Foundation Technical Committee setup.
Anyway, we are now at the end of the 6-month Folsom journey for OpenStack core projects, a ride which involved more than 330 contributors, implementing 185 features and fixing more than 1400 bugs in core projects alone !
At release day -1 we have OpenStack 2012.2 (“Folsom”) release candidates published for all the components:
- OpenStack Compute (Nova), at RC3
- OpenStack Networking (Quantum), at RC3
- OpenStack Identity (Keystone), at RC2
- OpenStack Dashboard (Horizon), at RC2
- OpenStack Block Storage (Cinder), at RC3
- OpenStack Storage (Swift), at version 1.7.4
We are expecting OpenStack Image Service (Glance) RC3 later today !
Unless a critical, last-minute regression is found today in these proposed tarballs, they should form the official OpenStack 2012.2 release tomorrow ! Please take them for a last regression test ride, and don’t hesitate to ping us on IRC (#openstack-dev @ Freenode) or file bugs (tagged folsom-rc-potential) if you think you can convince us to reroll.
This Thursday we will publish our second milestone of the Folsom cycle for Nova. It will include a number of new features, including the one I worked on: a new, more configurable and extensible nova-rootwrap implementation. Here is what you should know about it, depending on whether you’re a Nova user, packager or developer !
The goal of the root wrapper is to allow the nova unprivileged user to run a number of actions as the root user, in the safest manner possible. Historically, Nova used a specific sudoers file listing every command that the nova user was allowed to run, and just used sudo to run that command as root. However this was difficult to maintain (the sudoers file was in packaging), and did not allow for complex filtering of parameters (advanced filters). The rootwrap was designed to solve those issues.
How rootwrap works
Instead of just calling sudo make me a sandwich, Nova calls sudo nova-rootwrap /etc/nova/rootwrap.conf make me a sandwich. A generic sudoers entry lets the nova user run nova-rootwrap as root. nova-rootwrap looks for filter definition directories in its configuration file, and loads command filters from them. Then it checks if the command requested by Nova matches one of those filters, in which case it executes the command (as root). If no filter matches, it denies the request.
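The matching step described above can be sketched in a few lines of Python. This is a simplified illustration and not Nova's actual implementation: the CommandFilter class below only mimics the idea of matching on the command name, and find_match and the sample filter list are names made up for this example.

```python
import os


class CommandFilter:
    """Simplified stand-in for a rootwrap filter: matches on command name only."""

    def __init__(self, exec_path, run_as):
        self.exec_path = exec_path  # absolute path of the allowed executable
        self.run_as = run_as        # user the command would run as (e.g. "root")

    def match(self, userargs):
        # Compare the requested command name with the allowed executable name.
        return bool(userargs) and os.path.basename(self.exec_path) == userargs[0]

    def get_command(self, userargs):
        # Replace the requested name with the full, trusted path.
        return [self.exec_path] + list(userargs[1:])


def find_match(filters, userargs):
    """Return the command to execute, or None if no filter allows it."""
    for f in filters:
        if f.match(userargs):
            return f.get_command(userargs)
    return None


# Hypothetical filter list; real filters are loaded from root-owned files.
filters = [CommandFilter("/sbin/kpartx", "root")]

allowed = find_match(filters, ["kpartx", "-a", "/dev/loop0"])
denied = find_match(filters, ["make", "me", "a", "sandwich"])
print(allowed)  # ['/sbin/kpartx', '-a', '/dev/loop0']
print(denied)   # None
```

The sketch only decides what would be run; actually executing the returned command as root is left out on purpose.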
The escalation path is fully controlled by the root user. A sudoers entry (owned by root) allows nova to run (as root) a specific rootwrap executable, and only with a specific configuration file (which should be owned by root). nova-rootwrap imports the Python modules it needs from a cleaned (and system-default) PYTHONPATH. The configuration file (also root-owned) points to root-owned filter definition directories, which contain root-owned filters definition files. This chain ensures that the nova user itself is not in control of the configuration or modules used by the nova-rootwrap executable.
Rootwrap for users: Nova configuration
Nova must be configured to use nova-rootwrap as its root_helper. You need to set the following in nova.conf:
root_helper=sudo nova-rootwrap /etc/nova/rootwrap.conf
The configuration file (and executable) used here must match the one defined in the sudoers entry (see below), otherwise the commands will be rejected !
Rootwrap for packagers
Packagers need to make sure that Nova nodes contain a sudoers entry that lets the nova user run nova-rootwrap as root, pointing to the root-owned rootwrap.conf configuration file and allowing any parameter after that:
nova ALL = (root) NOPASSWD: /usr/bin/nova-rootwrap /etc/nova/rootwrap.conf *
Nova looks for a filters_path in rootwrap.conf, which contains the directories it should load filter definition files from. It is recommended that Nova-provided filters files are loaded from /usr/share/nova/rootwrap and extra user filters files are loaded from /etc/nova/rootwrap.d.
Directories listed in filters_path should all exist, and be owned and writeable only by the root user.
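A packager could sanity-check those directories with something like the following sketch. It assumes that filters_path is a comma-separated list living in a [DEFAULT] section of rootwrap.conf (the section name is an assumption, it is not stated above), and read_filters_path and is_safe_dir are hypothetical helper names, not part of Nova.

```python
import configparser
import os
import stat


def read_filters_path(conf_text):
    """Return the directories named by filters_path (assumed to be in [DEFAULT])."""
    parser = configparser.RawConfigParser()
    parser.read_string(conf_text)
    raw = parser.defaults().get("filters_path", "")
    return [d.strip() for d in raw.split(",") if d.strip()]


def is_safe_dir(path, owner_uid=0):
    """True if path is a directory owned by owner_uid and writable by its owner only."""
    try:
        st = os.stat(path)
    except OSError:
        return False  # missing directories fail the check
    if not stat.S_ISDIR(st.st_mode):
        return False
    if st.st_uid != owner_uid:
        return False
    # Reject group-writable or world-writable directories.
    return not (st.st_mode & (stat.S_IWGRP | stat.S_IWOTH))


# Sample configuration using the two recommended directories.
sample_conf = """
[DEFAULT]
filters_path=/etc/nova/rootwrap.d,/usr/share/nova/rootwrap
"""
dirs = read_filters_path(sample_conf)
print(dirs)
for d in dirs:
    print(d, is_safe_dir(d))
```

The owner_uid parameter defaults to 0 (root); it is only there so the check can be tried out as an unprivileged user.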
Finally, packaging needs to install, for each node, the filters definition file that corresponds to it. You should not install any other filters file on that node, otherwise you would allow extra unneeded commands to be run by nova as root.
The filter file corresponding to the node must be installed in one of the filters_path directories (preferably /usr/share/nova/rootwrap). For example, on compute nodes, you should only have /usr/share/nova/rootwrap/compute.filters. The file should be owned and writeable only by the root user.
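To verify that a node only ships the filters file it needs, a small audit along these lines might help. It is only a sketch: unexpected_filter_files is a hypothetical helper, and the assumption that filter files end in .filters is taken from the compute.filters example above.

```python
import os


def unexpected_filter_files(filter_dirs, expected):
    """List .filters files found in filter_dirs whose names are not in `expected`."""
    extra = []
    for d in filter_dirs:
        if not os.path.isdir(d):
            continue  # nothing to audit in a missing directory
        for name in sorted(os.listdir(d)):
            if name.endswith(".filters") and name not in expected:
                extra.append(os.path.join(d, name))
    return extra


# On a compute node, only compute.filters is expected in the Nova-provided directory.
print(unexpected_filter_files(["/usr/share/nova/rootwrap"], {"compute.filters"}))
```

An empty list means no extra filters files were found in the audited directories.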
All filter definition files can be found in Nova source code under etc/nova/rootwrap.d.
Rootwrap for plug-in writers: adding new run-as-root commands
Plug-in writers may need to have the nova user run additional commands as root. They should use nova.utils.execute(run_as_root=True) to achieve that. They should create their own filter definition file and install it (owned and writeable only by the root user !) into one of the filters_path directories (preferably /etc/nova/rootwrap.d). For example the foobar plugin could define its extra filters in a /etc/nova/rootwrap.d/foobar.filters file.
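To see why a matching filter is needed, it helps to picture what a run-as-root call turns into. The sketch below is not nova.utils.execute itself: build_command is a hypothetical helper that only shows how the configured root_helper would be prefixed to a command, and foobar-tool is a made-up command for the foobar plugin.

```python
import shlex

# Value of root_helper as configured in nova.conf.
ROOT_HELPER = "sudo nova-rootwrap /etc/nova/rootwrap.conf"


def build_command(cmd, run_as_root=False, root_helper=ROOT_HELPER):
    """Return the argument list that would be executed for cmd."""
    cmd = [str(c) for c in cmd]
    if run_as_root:
        # Prefix the command with the root helper so rootwrap can filter it.
        cmd = shlex.split(root_helper) + cmd
    return cmd


print(build_command(["foobar-tool", "--status"], run_as_root=True))
# ['sudo', 'nova-rootwrap', '/etc/nova/rootwrap.conf', 'foobar-tool', '--status']
```

Everything after the configuration file path is what nova-rootwrap checks against the loaded filters, which is why the plugin has to ship a filter for its own command.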
The format of the filter file is defined here.
Rootwrap for core developers
Adding new run-as-root commands
Core developers may need to have the nova user run additional commands as root. They should use nova.utils.execute(run_as_root=True) to achieve that, and add a filter for the command they need in the corresponding .filters file under etc/nova/rootwrap.d/ in Nova’s source code. For example, to add a command that needs to be run by network nodes, they should modify the etc/nova/rootwrap.d/network.filters file.
The format of the filter file is defined here.
Adding your own filter types
The default filter type, CommandFilter, is pretty basic. It only checks that the command name matches; it does not perform advanced checks on the command arguments. A number of other more command-specific filter types are available, see here.
That said, you can easily define new filter types to further control what exact command you actually allow the nova user to run as root. See nova/rootwrap/filters.py for details.
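As an illustration of the idea, here is a self-contained sketch of a stricter filter type. Both classes are simplified stand-ins written for this example: the base class only imitates the name check described above, and ArgPatternFilter is a hypothetical type, so check nova/rootwrap/filters.py for the real base class and its exact interface before writing your own.

```python
import os
import re


class CommandFilter:
    """Simplified stand-in for the basic filter: checks the command name only."""

    def __init__(self, exec_path, run_as, *args):
        self.exec_path = exec_path
        self.run_as = run_as
        self.args = args

    def match(self, userargs):
        return bool(userargs) and os.path.basename(self.exec_path) == userargs[0]


class ArgPatternFilter(CommandFilter):
    """Hypothetical filter type: every argument must match its regular expression."""

    def match(self, userargs):
        if not super().match(userargs):
            return False
        given = userargs[1:]
        if len(given) != len(self.args):
            return False  # wrong number of arguments
        return all(re.fullmatch(p, a) for p, a in zip(self.args, given))


# Allow "kill -9 <pid>" or "kill -15 <pid>" but nothing else.
f = ArgPatternFilter("/bin/kill", "root", r"-(9|15)", r"\d+")
print(f.match(["kill", "-9", "1234"]))    # True
print(f.match(["kill", "-9", "1; rm"]))   # False
print(f.match(["kill", "-HUP", "1234"]))  # False
```

Matching each argument against a full regular expression is one possible design; a filter could just as well check file paths or process ownership, depending on the command.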
This documentation, together with a reference section detailing the file formats, is available on the wiki.
How did the OpenStack Bug Triage day we organized yesterday go ? Did organizing an event make a difference ? Here are the results !
Nova has more bugs than all the other core projects combined, and the most slack to clean up. We went from 237 “New” bugs at the beginning of the day down to 42, a completion rate of 82%. In the meantime, we managed to permanently close 86 open bugs out of a total of 627:
So the BugTriage day definitely made a difference ! Congrats to all the participants ! It leaves our bug tracker in a lot better shape, and created momentum around bug triaging and having an up-to-date database of known issues.
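For reference, the Nova figures quoted above work out as follows. The variable names are mine, and the last percentage is simply derived from the 86 and 627 figures.

```python
new_before, new_after = 237, 42   # "New" bugs at the start and end of the day
closed, total_open = 86, 627      # bugs closed permanently, out of all open bugs

triaged = new_before - new_after
completion = triaged / new_before
closed_share = closed / total_open

print(triaged)                # 195
print(f"{completion:.0%}")    # 82%
print(f"{closed_share:.0%}")  # 14%
```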
The success is even more obvious on smaller projects, with Glance, Keystone and Quantum all managing to complete all BugTriage tasks in the day ! See for example the results for Quantum:
See you all for our next BugDay… which will most likely be a Bug Squashing Day (close as many bugs as possible) shortly after folsom-2.