What is happening with MediaWiki?

There has been some active discussion recently in the MediaWiki (MW) community (see “The end of shared hosting” on Phabricator) and the SemanticMediaWiki (SMW) community (see this discussion on GitHub) about Service-Oriented Architecture (SOA) and the future of MediaWiki.

Part of that discussion revolves around shared hosting, which is where many people have deployed their wikis. I posted some of my thoughts on this for others in the SMW community, but I don’t want to deprive anyone who loves to read my writing, so I’m copying it here.

I’m not sure I’m the best person to talk about the SOA approach that MW is taking. One thing is clear, though: an SOA approach does not work on shared hosting.

The best place to have this discussion is with Rob Lanphier and the MW architecture committee at the Developer Summit in January. I encourage anyone interested in this to find their way to San Francisco for it. That said, I do have some thoughts…

While the simpler, nostalgic past meant that installing a wiki just required access to a single service (typically MySQL), the growing dependence on auxiliary services has made this “download and go” approach harder.

There are several objections to the SOA approach. Here are a few, but please be sure to add to the list:

  • Cost — shared hosting is relatively cheap.
  • Training — shared hosting means someone else is managing the server and you don’t need to have or maintain that skill.
  • Effort — shared hosting allows you to focus on the site, not on running a server.
  • Complexity — related to the previous two, shared hosting means you only need to be concerned with managing one aspect of your site.

These points are addressable, though:

  • Cost should be a relative non-issue. Amazon’s EC2 can be had for free or for as little as $10 a month. Linode offers a similar service with a somewhat friendlier UI, and other hosts like M5 and Rackspace provide even cheaper alternatives.
  • I’ve been working on Ansible scripts to set up a server from scratch and James Montalvo and Daren Welsh have been working on Meza to help set up a MW server.
  • During a meeting that Markus Glaser and I had last year with Wikimedia’s Executive Director and leaders of the engineering department, it was clear they were looking for someone to take a leadership role in developing new forms of distribution for MediaWiki.
  • Software like the Ansible scripts or Meza, combined with new forms of distribution, should address the problems of Training, Effort and Complexity.

SOA architecture was first pursued in core MediaWiki with Parsoid’s Node.js implementation. PHP 7 is removing the primary argument that Gabriel Wicke used for writing Parsoid in Node.js: speed. Parsoid has a ton of tests, and it would make sense to use those tests to rewrite Parsoid in PHP to make deployment easier.

There was a really good presentation at Wikimania by Ed Sanders about adapting VisualEditor (VE) for other uses. He pointed to a couple of bits of example code. I haven’t had the chance to look at them yet, but these were apparently clear examples of how to use VE for things in MW besides editing the complete page.

I have already been agitating for fewer services. The counter-argument is that MW has grown into a monolithic piece of software, and using services allows developers to isolate their work and control their interfaces better. That is a good thing, but there is no reason this sort of isolation and control couldn’t be accomplished using the same platform and easy deployment that PHP has offered to many users in the past.

So, what does this all mean for SMW?

I think it is obvious that the status quo isn’t going to work. For one thing, there is already poor communication between the MW and SMW developer communities. As the survey we recently completed clearly shows, many users of MW do not see SMW as a separate or even extra piece of software. I think it is a given that many end users of wiki sites see the functionality that SMW provides as just a normal part of their wiki.

That means we need to get developers who are familiar with SMW to interact with the developers at the Wikimedia Foundation. The Developer Summit would be a good place to start.

Working with the architecture committee — helping them understand the needs of users and developers outside of Wikimedia projects — would help them use their role as leaders of the MW developer community in ways that would help steer MW development so that projects like SMW could continue to depend on the MW platform.

Instead of pining for the past, we need to shape the future by making sure our voices are heard.

MediaWiki Hackathon 2015

I am back from the MediaWiki hackathon this past weekend.

[Image: Richard Heigl leads a discussion after the #MWStake meeting about actually implementing our ideas.]

This was the first time we had some really good participation from non-WMF parties.

A couple of active developers from MITRE, a government-focused not-for-profit, were there. I was also able to get the WMF to pay for a couple of engineers from NASA to go. The organiser of SMWCon Spring 2015 (Chris Koerner) was also there because I encouraged him to apply to the WMF’s scholarship program to cover his attendance.

I had planned to spend the hackathon finishing up the HitCounters extension so that it would be ready when MediaWiki 1.25 was released. Unfortunately, the conversations with the non-WMF MediaWiki users ended up being too productive. As a result, MediaWiki 1.25 was released on Monday without the page view counter functionality. I should have the extension finished by the end of this week.

As an added bonus, I introduced Daren Welsh, one of the engineers from NASA, to the VP of Engineering at the WMF. Our friends at NASA have been doing some really great things to improve the usability and usefulness of a user’s watchlist. I hope that some of their work shows up on Wikipedia because of this introduction.

Overall, it was a wonderful way for those of us who use MW outside of the Foundation to coordinate our work. I hope to see a lot of good things coming from these sorts of meetings in the future.

2014 Summer of Code

Google Summer of Code has ended and, with it, my first chance to mentor a student, together with Markus Glaser, in implementing a new service for MediaWiki users.

At the beginning of the summer, Markus and I worked with Quim Gil to outline the project and find a student to work on it.

Aditya Chaturvedi, a student from the Indian Institute of Technology (“India’s MIT”) saw the project, applied for our mentorship, and, soon after, we began working with him.

We all worked to outline a goal of creating a rating system on WikiApiary with the intention of using a bot to copy the ratings over to MediaWiki.org.

I’m very happy to say that Aditya’s work can now be seen on WikiApiary. We don’t have the ratings showing up on MediaWiki.org yet (more on that in a bit) but since that wasn’t part of the deliverables listed as a success factor for this project, this GSoC project is a success.

As a result of his hard work, the ball is now in our court — Markus and I have to evangelize his ratings and, hopefully, get them displayed on MediaWiki.org.

Unlike some other projects, this project’s intent is to provide feedback on MediaWiki extensions rather than change how MediaWiki itself behaves. To do this, Aditya and I worked with Jamie Thingelstad to create a way for users to rate the extensions that they use.

We worked with Jamie for a few reasons. First, Jamie has already created an infrastructure on WikiApiary for surveying MediaWiki sites. He is actively maintaining and improving the site. Pairing user ratings with his current usage statistics makes a lot of sense.

Another reason we worked with Jamie instead of trying to deploy any code on a Wikimedia site is that the process of deploying code on WikiApiary only requires Jamie’s approval.

The wisdom of this decision really became apparent at the end, when Aditya requested help getting his ratings to show up using the MediaWiki Extension template.

Thank you, Aditya. It was a pleasure working with you. Your hard work this summer will help to invigorate the ecosystem for MediaWiki extensions. Good luck in your future endeavors. I hope we can work together again on MediaWiki.

Wikimania 2014 takeaway: Developer Discussion

During this year’s Wikimania, we had several meetings that will affect the work of the release team. One of the most important was the Developer Discussion that I initiated after Daniel Schneider commented that “It might be a good idea to address those issues in some next important dev meeting. The meeting should be chaired by someone from the outside, e.g. from a well respected foundation like Mozilla, Apache.” We had the meeting, and Markus Krötzsch, creator of Semantic MediaWiki, agreed to moderate.

You can read the rough transcript that Markus Glaser kept for us to find out more details about what was discussed, but I’ll synthesize what I think are the important action items and take-aways for the MediaWiki release team.

These fall into three groups: the installation and upgrade process, code deprecation, and, finally, one that overlaps with all of these, a system of certifying extensions.

The effort to provide a standard for extension certification will bear the most fruit, because such a standard creates a virtuous circle that will lead to improvements across the MediaWiki ecosystem.

It seems obvious to me that we need to start from where we are: a lot of users are using extensions like Semantic MediaWiki and consider those extensions (as Natasha Brown, who runs WikiTranslate.org pointed out) to be an integral part of their site.

This means that we can begin by making a base for certification that covers some things decently-maintained extensions should have no problem providing. This would include things like internationalization and localisation support, unit tests, and sane integration with the upgrader and installer.

Ideally, none of this would need to be manually tested. This means that to get a rating, your extension would need to be automatically tested. Composer seems like it would help out a lot here. It already provides a framework for fetching code from git, so we should only need to let developers provide a composer.json; then a bot (running on Wikimedia Labs) would test the package and provide a certification.
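As a sketch of what a developer would publish, a minimal composer.json for an extension might look like the following. (The package name and description here are made up for illustration; the “mediawiki-extension” type is the one composer/installers understands.)

```json
{
    "name": "example/my-extension",
    "description": "A hypothetical MediaWiki extension, for illustration only",
    "type": "mediawiki-extension",
    "license": "GPL-2.0+",
    "require": {
        "php": ">=5.3",
        "composer/installers": ">=1.0.1"
    }
}
```

The certification bot would only need this file to fetch the code, install its dependencies, and run whatever automated checks we settle on.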

Through this automated certification, we could begin to provide guidance on the “best practices” method of upgrading or handling configuration (probably using Kunal’s configuration work).


The discussion also raised the visibility of Bartosz Dziewoński’s Google Summer of Code work, which brings some sanity to MediaWiki’s skinning system. This is a major pain point for many wikis stuck on older versions of MediaWiki: they’ve done a lot of work customizing their appearance and avoid upgrading because of the tight coupling that MediaWiki’s skinning system has shown until now.

Pasting text into the wiki

A client asked me if I knew of a way to remove the spurious line feeds in a text that you copy from, say, a PDF into the textarea editing box for the wiki. The problem, which you may have seen, is that highlighting a couple of paragraphs of text in the source document and then pasting them into a textarea in your browser will end up with funny, unintended line breaks.

For example, copying the first paragraph of the second section of this paper and then pasting it into a textarea using Firefox gives this result:

The text should be divided into sections, each with a
separate heading and consecutive numbering. Note, how-
ever, that single secondary, tertiary, and quaternary sec-
tions remain unnumbered. Each section heading should be
placed on a separate line using the appropriate L
X com-
mands. For more detailed information on different sections
and their formatting see the Authors’ Guide.

After looking around a bit, I found lots of people talking about pasting problems, but no one offering a solution that would work in the browser instead of just on one particular web page.

After reading about JavaScript and pasting, and poking around in the Firefox add-ons, I figured I knew enough to address the problem with GreaseMonkey.

JSFiddle provided a nice REPL for testing my code, and it wasn’t too hard after that to put this together in GreaseMonkey and upload the result to UserScripts.org (after I recovered the password to my long-dormant account).
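The core of the cleanup is simple. Here is a sketch of the logic (my reconstruction for this post, not the exact script I uploaded): join words that were hyphenated across a line break, then turn the remaining single line feeds into spaces while leaving blank-line paragraph breaks alone.

```javascript
// Clean up text pasted from a PDF:
// 1. Rejoin hyphenated words split across lines ("sec-\ntions" -> "sections").
// 2. Replace single newlines with spaces, but keep "\n\n" paragraph breaks.
function cleanPastedText(text) {
  return text
    .replace(/(\w)-\n(\w)/g, '$1$2')
    .replace(/([^\n])\n(?!\n)/g, '$1 ');
}
```

A GreaseMonkey user script would then listen for the paste event on textareas and swap in the cleaned text.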

So, after all that, here is the result of the previous paste:

The text should be divided into sections, each with a separate heading and consecutive numbering. Note, however, that single secondary, tertiary, and quaternary sections remain unnumbered. Each section heading should be placed on a separate line using the appropriate L A T E X commands. For more detailed information on different sections and their formatting see the Authors’ Guide.

Hopefully this will be useful to others.

Hong Kong (and MediaWiki)

[Image: Flu in Hong Kong]

It has been a while since I posted anything here, and I’m only posting now to tell you that I’m in Hong Kong.

After Markus Glaser and I won the contract for MediaWiki release management from Wikimedia, he suggested that I come to Wikimania so that we would have a chance to talk to developers.

Markus has been busy in chapter meetings, so I’ve been spending some time talking to developers and getting ready for the MediaWiki architecture discussion later today.

A quick hack to hide a page from anyone who can’t edit it in MediaWiki

Today, a client asked for a way to hide the content of one page from casual browsers. I came up with the following:

$wgHooks['BeforePageDisplay'][] = 'stopDisplay';
function stopDisplay( $output, $skin ) {
        if ( 'Passwords galore' === $output->getPageTitle() ) {
                global $wgUser, $wgTitle;
                if ( !$wgTitle->userCan( 'edit' ) ) {
                        // Fake an indefinite block so the content is hidden
                        // from this user for this request.
                        $wgUser->mBlock = new Block( '', 'WikiSysop', 'WikiSysop', 'none', 'indefinite' );
                        $wgUser->mBlockedby = 0;
                        return false;
                }
        }
        return true;
}
There may be another way to do this, and it is certainly not secure against all attempts to read page content. For instance, if you hide a wiki page like [[Passwords_galore]] using this technique, all a reader would have to do to get around the hack is transclude the page like a template: {{:Passwords galore}}.

I’ll be looking at more ways to access the page and more ways to block it soon.