Improving watchlists for corporate MediaWiki use

I’ve learned, from listening to corporate users of MediaWiki, that watchlists are very important for maintaining the quality of the wiki.

The guys running the EVA wiki at NASA, for example, have done a lot of work on Watchlist Analytics extension to ensure that the articles on their wiki have watchers spread out over the entire wiki.

Installing this extension for a client increased their awareness of the usefulness of watchlists and, since they had been using WhoIsWatching in the past, they asked me to fix a problem they had encountered.

The client takes a little more proactive approach to managing their wiki. It might not seem like “the wiki way” to people who have only used MediaWiki on Wikipedia, but they wanted to use the ability of WhoIsWatching to put pages on editor’s watchlists.

In the past, when they used the page, it showed a list of users and allows those with permission to add the page to anyone’s watchlist. It limited the list to those people who had provided an email address, which made that more manageable.

Since then, I’ve implemented Single Sign On for them and auto-populated their email address from Active Directory. As a result, the number of users with an email address has jumped from a handful to over 10,000.

So, now WhoIsWatching was trying to load thousands of rows and display them all at once on a single page.

It was trying, but the requests were timing out and the page was un-usable.

The extension had other problems. It practiced security-through-obscurity. While you could disable the ability to add pages to other people’s watchlists, the only thing to keep the anyone from adding pages was the fact that its administrative page (or “Special Page” in MediaWiki parlance) was not on the list of special pages. If you knew about the page, you could visit it and add an article you were hard at work on to everyone’s watchlists, thus spamming them with notifications from the wiki of all your changes.

That, and if you visited the special page without providing any arguments, you’d get a cryptic “usage” message.

To address the first problem, I decided to put an auto-complete form on the page so that a user could start typing a username and then MediaWiki would provide a list of matching usernames. I wondered how I would do this until I noticed that the Special:UserRights page was now providing this sort of auto-completion. Adding that functionality was as easy as providing an text field with the class mw-autocomplete-user.

I addressed the security issue by adding a couple of rights that could be given to users through the user interface (instead of by updating the value of global variables in a php file).

Finally, the frosting on the cake was to make WhoIsWatching’s special page useful if you visited it all by itself.

I already knew that the search bar provided auto-completion for article names and, based on what I discovered with mw-autocomplete-user, I thought I might be able to do something similar with page completion.

I was right. Thanks to a bug in a minor skin discovered back in 2012, you can add the class mw-search-input to a text field and works.

I haven’t been aware of all the great auto-completion work that MediaWiki developers like krinkle and MetaMax have been doing, but I’m pleased with what I see. And the improvments that they implemented made adding the features I needed to WhoIsWatching about a thousand percent easier.

Oh, and I did miscellaneous code cleanup and i18n wrangling (with Siebrand’s guidance, naturally). Now many changes sit ready for review.

There are still things I’d like to fix, but those will have to wait.

Image credit: Livrustkammaren (The Royal Armoury) / Erik Lernestål / CC BY-SA [CC BY-SA 3.0 or Public domain], via Wikimedia Commons

Making git and Emacs’ eshell work together

tl;dr

I often end up in eshell.

Sometimes, because I’m running Emacs on Windows (where shells work, but it’s a pain), and sometimes just because.  The problem, until today, was that anytime I would invoke a git command that wanted to call the pager (say, diff), I would see the following annoying message from less:

WARNING: terminal is not fully functional
-  (press RETURN)

This happens because eshell sets $TERM to “dumb”.  It doesn’t try to fool anyone.  It’s dumb.

But, since I’m stubborn and lazy, I just put up with it the stupidity of eshell and the annoyance of git’s invocation of less.  Till today.

Some would say the answer is “Don’t use git in emacs — use magit!” And they’d be right. I do use magit, but commands should work, too.

So, after a spree of productivity yesterday, I woke up today and hit that annoying message again.  I decided to track it down.

I came across this StackOverflow thread. There is a hint there — I didn’t know less could be told to only page in certain cases — but not anything that says “only sometimes use less”.

So I managed to hack something together:

git config --global core.pager '`test "$TERM" = "dumb" && echo cat || echo less`'

I haven’t tried this under Windows, yet, but I’m hoping it works.

Image CC-by-SA: Richard Bartz, Munich Makro Freak

MediaWiki as a community resource

As is only to be expected, Brion asked:

What sort of outcomes are you looking for in such a meeting? Are you looking to meet with engineers about technical issues, or managers to ask about formally committing WMF resources?

I copy-pasted Chris Koerner’s response:

  • People use MediaWiki!
  • How can we bring them into the fold?
  • What is the WMF stance on MediaWiki? Is it part of the mission or a by-product of it?
  • Roadmap roadmap roadmap

But I couldn’t let it stop there, so I went into rant mode.

Continue reading MediaWiki as a community resource

Emacs for MediaWiki

Tyler Romeo wrote:

If I had the time, I would definitely put together some sort of .dir-locals.el for MediaWiki, that way we could make better use of Emacs, since it has a bunch of IDE-like functionality, even if it’s not IDEA-level powerful.

A client wanted to me to help train someone to take over the work for maintaining their MediaWiki installation. As part of that work, they asked for an IDE and, knowing that other MW devs used PHPStorm, I recommended it and they bought a copy for me and the person I was to train.

PHPStorm has “emacs keybindings” but these are just replacements for the CUA keybindings. Somethings that I expected the keybindings to invoke, didn’t. (It’s been a while since I’ve used PHPStorm, so I’ve forgotten the details.)

In any case, I’ve found that a lot of what I wanted from PHPStorm could be implemented in Emacs using the following .dir-locals.el (which I put above my core and extensions checkouts):

((nil . ((flycheck-phpcs-standard .
        "…/mediawiki/codesniffer/MediaWiki")
     (flycheck-phpmd-rulesets .
        ("…/mediawiki/messdetector/phpmd-ruleset.xml"))
     (mode . flycheck)
     (magit-gerrit-ssh-creds . "mah@gerrit.wikimedia.org"))))

The above is in addition to the code-sniffing I already had set up to put Emacs’ php-mode into the MW style.

The one thing that PHPStorm lacked (and where Emacs’ magit excels) is dealing with git submodules. Since I make extensive use of submodules for my MediaWiki work, this set up makes Emacs a much better tool for working with MediaWiki.

Naturally, I won’t claim that what works for me will work for anyone else. I’ve spent 15 years in Emacs every day. I was first exposed to Emacs in the late 80s(!!) so the virus has had a long time to work its way into my psyche and, by now, I’m incurable.

Oh, I see: you are fighting the services war.

This comment in MediaWiki’s task tracker really upset me. Instead of polluting the tracker with my rant, I decided to use my blog for its original intent: a nice rant.

Are insults like this really good to put in phabricator? I’m not fighting a war. If I were, I would be greatly outnumbered.

There are legitimate reasons for the Wikimedia’s engineering and operations teams to move bits of functionality out of the PHP core.

However, MediaWiki is free software and it has other users with other needs besides the Foundation.

For what its worth, the majority of my clients use services outside of PHP. For instance, I’ve set up and packaged Parsoid into RPMs for them and I’m currently working on creating RPMs for PediaPress’s PDF renderer (before I started, I looked at Offline content generator but wasn’t happy with where it was).

But, as a piece of free software, the ability for an individual to deploy mediawiki by themselves without the infrastructure provided by an IT department is vital for its adoption in organisations outside the WMF.

The WMF vision statement reads: “Imagine a world in which every single human being can freely share in the sum of all knowledge. That’s our commitment.”

Making MW free software that more people can can share their bit of the “sum of all knowledge” without relying on the WMF or Wikipedia’s editors to be the gatekeepers.

What is happening with MediaWiki?

There has been some recent active discussion in the MediaWiki (MW) (see “The end of shared hosting” on phabricator) and SemanticMediaWiki (SMW) communities (see this discussion on github) about Service Oriented Architecture (SOA) and the future of MediaWiki.

Part of that discussion revolves around shared hosting which is where many people have deployed their wikis.  I posted some of my thoughts on this for other people in the SMW community, but I don’t want to deprive anyone who loves to read my writing, so I’m copying it here.

I’m not sure I’m the best person to talk about the SOA approach that MW is taking. One thing is clear, though: A SOA approach does not work in shared hosting.

The best place to have this discussion is with Rob Lamphier and the MW architecture committee at the developer summit in January. I encourage anyone interested in this to find their way to San Francisco for that. That said, I do have some thoughts…

While the simpler, nostalgic past has meant that installing a wiki just required access to a single service (typically MySQL) the growing dependence on auxiliary services has made this “download and go” approach harder.

There are several objections to the SOA approach. Here are a few, but please be sure to add to the list:

  • Cost — shared hosting is relatively cheap.
  • Training — shared hosting means someone else is managing the server and you don’t need to have or maintain that skill.
  • Effort — shared hosting allows you to focus on the site, not the running a server.
  • Complexity — related to the previous two, shared hosting means you only need to be concerned with managing one aspect of your site.

These points are addressable, though:

  • Cost should be a relative non-issue. Amazon’s EC2 can be had for free or as little as $10 month. Linode has similar service, a little friendlier UI and other hosters like M5 and Rackspace are providing even cheaper alternatives.
  • I’ve been working on Ansible scripts to set up a server from scratch and James Montalvo and Daren Welsh have been working on Meza to help set up a MW server.
  • During a meeting with the Wikimedia’s Executive Director and leaders of the engineering department that Markus Glaser and I had last year, it was clear they were looking for someone to take a leadership role in developing new forms of distribution for MediaWiki.
  • Software like the Ansible or Meza, combined with new forms of distribution should address the problems of Training, Effort and Complexity.

SOA architecture was first pursued in core MediaWiki with Parsoid‘s node.js implementation. PHP 7 is removing the primary argument that Gabriel Wicke used for writing Parsoid on node.js — speed. Parsoid has a ton of tests and it would make sense to use those tests to rewrite Parsoid in PHP to make deployment easier.There was a really good presentation at Wikimania by Ed Sanders about adapting VisualEditor (VE) for other uses. He pointed to a couple of bits of example code. I haven’t had the chance to look at them yet, but these were apparently clear examples of how to use VE for things in MW besides editing the complete page.I have already been agitating for fewer services. The counter argument is that MW has grown into a monolithic piece of software and using services allows developers to isolate their work and control their interfaces better.That is a good thing, but there is no reason this sort of isolation and control couldn’t be accomplished using the same platform and easy deployment that PHP has offered to many users in the past.

So, what does this all mean for SMW?

I think it is obvious that the status quo isn’t going to work. For one thing, there is already poor communication between the MW and SMW developer communities. As the survey we recently completed clearly shows, many users of MW do not see SMW as a separate or even extra piece of software. I think it is a given that many end users of wiki sites see the functionality that SMW provides as just a normal part of their wiki.

That means we need to get developers who are familiar with SMW to interact with the developers at the Wikimedia Foundation. The developer’s summit would be a good place to start.

Working with the architecture committee — helping them understand the needs of users and developers outside of Wikimedia projects — would help them use their role as leaders of the MW developer community in ways that would help steer MW development so that projects like SMW could continue to depend on the MW platform.

Instead of pining for the past, we need to shape the future by making sure our voices are heard.

Brilliant twist on the advance fee fraud

I got the following in my email today.  This a brilliant twist on the scam:

From: "IMF Agent" <finance@do.daoudata.co.kr>
Reply-To: "IMF Agent" <fmb.agent@gmail.com>

Hello,

 I am IMF agent attached to the World Bank office . I am very much aware of your ordeal towards actualizing the release of your long delayed payment.

 Not very long after the World Bank completed the acquisition process of all pending payments, I discovered that my boss connived with some top officials of the World Bank to divert funds already approved to settle inheritances, email lottery winners and international contractors.

 The World Bank has already given approval for the payment of your fund while they deliberately delay your payment and continue to demand fees from different departments mostly from Africa, the UK , Europe and Asia all in an attempt to enrich their private accounts. I wonder why you haven’t notice all these while. I am Christian and my religion does not permit such.

 Your fund was authorized to be paid to you through the World Bank accredited affiliate with a Key Tested Reference/Claim Code Number, which was supposed to have been issued to you before now.

 Upon your response to this message, I shall guide you and provide you with all you needed to contact the World Bank affiliate who will facilitate the release of your payment.

Truly Yours,

Kelly David
IMF/World Bank Agent 
======================================================================
This e-mail and any attachments may contain information which is confidential, privileged, proprietary or otherwise protected by law. The information is solely intended for the named addressee (or a person responsible for delivering it to the addressee).  If you are not the intended recipient of this message, you are not authorized to read, print, retain, copy or disseminate this message or any part of it. If you have received this e-mail in error, please notify the sender immediately by return e-mail and delete it from your computer.
======================================================================

This email has been checked for viruses by Avast antivirus software.
http://www.avast.com

Thoughts on “The Geography of Thought”

First things first: I wanted to read this book because my wife is Vietnamese. I’ve spent 20 years married to Alexis and had time to observe how she handles things and I handle things. This puts me at a disadvantage. When she moved from Vietnam to New Orleans, she was five or six, so she has a good 14 or 15 years on me in observing differences and navigating her way in a culture that is different than the one she experiences with her immediate family. I wanted to do a little catch up.

She summed it up nicely when I asked if she wanted to read the book: “I’m sure it won’t tell me anything I don’t already know.”

The book’s shortcomings are apparent almost immediately:

I believed all human groups perceive and reason in the same way. … [I wrote a book] titled Human Inference. Not Western Inference (and certainly not American college student inference!), but human inference. The book characterized what I took to the inferential rules that people everywhere use to understand the world…

I’m a little more than a little surprised that someone could become an academic in human psychology and not understand that people from different cultures see the world in remarkably different ways.

Now, that aside, the book does have the some great insights that come from research the author and others have performed.

If you want a 30-second (or less) synopsis of the book, here it is:

People in the West see the world consisting of various parts. Understand how an individual thing acts and you’ll be able to make reliable predictions about how it will act in any situation.

Meanwhile, people in the East see the world more like an interdependent system. Making predictions based on how you’ve seen something (or someone!) act in isolation is foolish since context is the determining factor in what will happen now.

Of course, when you divide the world up this way, you run into the problem of over-simplification. And, as he makes apparent by comparing populations, the Eastern and Western modes of thought are not a binary system. Asian-Americans, people from Hong Kong and sometimes Germans regularly straddle the East/West divide.

The people who are the penultimate Westerners are Americans, followed by Canadians and the British. Meanwhile, Easterners start with the Chinese and Japanese and move from there.

In the first part of the book, he talks about Aristotle and Confucius and how they created two different systems of thought. The origins of these two modes of thinking, then, are presented as Greek vs Chinese.

What interested me was that Western thought later became more clearly Protestant thought – the connection to Greece was in its generation, but not the “best” modern form. In fact, a lot of my exposure to Orthodox thought has shown that there are sometimes more similarities with Eastern religion than with American Christianity.

Overall, even though this book provided some good food for thought and the studies performed were useful, I was put off by the author’s acknowledgement that he was relatively unaware of thse differences until relatively late in his career.

MediaWiki Hackathon 2015

I am back from the MediaWiki hackathonRichard Heigl leads a discussion after the #MWStake meeting about actually implmenting our ideas this past weekend.

This is the first time we had some really good participation from non-WMF parties.

A couple of active developers from MITRE, a government-focused NGO, were there. I was also able to get the WMF to pay for a couple of engineers from NASA to go. The organiser of SMWCon Spring 2015 (Chris Koerner) was also there because I encouraged him to apply to get his attendance paid for by the WMF’s scholarship program.

I had planned to spend the hackathon finishing up the HitCounters extension so that we can it would be ready with MediaWiki 1.25 was released. Unfortunately, the conversations with the non-WMF MediaWiki users ended up being too productive. As a result MediaWiki 1.25 was released on Monday without the page view counter functionality. I should have this extension finished by the end of this week.

As an added bonus, I introduced Darren Welsh, one of the engineers from NASA, to the VP of Engineering at the WMF. Our friends at NASA have been doing some really great things to improve the usability and usefulness of a user’s watchlist.  I hope that some of their work shows up on Wikipedia because of this introduction.

Overall, it was a wonderful way for those of us who use MW outside of the Foundation to coordinate our work. I hope to see a lot of good things coming from these sort of meetings in the future.

Brief review of Farewell to Alms

Since all I can manage are brief reviews with a pointer to the quotes I excerpted to Twitter, here is my take on Farewell to Alms:

I loved it. At least, it made me feel like I was getting enough out of it to keep on reading.  It also helped me get a better view on our modern world and how new it really is.  That, and I gained a much better understanding of how little transport is when it comes to the total cost of goods.  Why does “Made in America” not matter?  Because the real cost to ship to America is practically nothing.

(P.S. I should mention that I read this because Dan Lyke recommended it when I saw him in SF last January.  See? It only took me a year!)