A client asked me if I knew of a way to remove the spurious line feeds in a text that you copy from, say, a PDF into the textarea editing box for the wiki. The problem, which you may have seen, is that highlighting a couple of paragraphs of text in the source document and then pasting them into a textarea in your browser will end up with funny, unintended line breaks.

For example, copying the first paragraph of the second section of this paper and then pasting it into a textarea using Firefox gives this result:

The text should be divided into sections, each with a
separate heading and consecutive numbering. Note, how-
ever, that single secondary, tertiary, and quaternary sec-
tions remain unnumbered. Each section heading should be
placed on a separate line using the appropriate L
A
T
E
X com-
mands. For more detailed information on different sections
and their formatting see the Authors’ Guide.

After looking around a bit, I found lots of people talking about pasting problems, but no one offering a solution that would work in the browser instead of just on one particular web page.

After reading about JavaScript and pasting, and poking around in the Firefox addons, I figured I knew enough to address the problem with GreaseMonkey.

jsfiddle provided a nice REPL for testing my code and it wasn’t too hard after that to put this together in GreaseMonkey and upload the result to UserScripts.org (after I recovered the password to my long-dormant account).

So, after all that, here is the result of the previous paste:

The text should be divided into sections, each with a separate heading and consecutive numbering. Note, however, that single secondary, tertiary, and quaternary sections remain unnumbered. Each section heading should be placed on a separate line using the appropriate L A T E X commands. For more detailed information on different sections and their formatting see the Authors’ Guide.
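For the curious, the cleanup can be sketched roughly as follows. This is not the userscript itself, just a minimal illustration of the idea; the function name `unwrapPastedText` is made up for this example, and it assumes that a blank line marks a real paragraph break while a single line break is an artifact of the paste.

```javascript
// Hypothetical helper (not the published userscript): undo the hard
// line wrapping that a paste from a PDF leaves behind.
function unwrapPastedText(text) {
  return text
    // Normalize Windows and old Mac line endings to "\n".
    .replace(/\r\n?/g, '\n')
    // Rejoin a word that was hyphenated across a line break ("how-\never").
    .replace(/([a-z])-\n(?=[a-z])/g, '$1')
    // Turn a single line break into a space; blank lines are left alone
    // so paragraph breaks survive.
    .replace(/([^\n])\n(?=[^\n])/g, '$1 ');
}
```

One caveat with this approach: a genuine compound that happens to break at the hyphen (say, "long-" at the end of one line and "term" at the start of the next) gets joined into a single word, so the result still deserves a quick read-through.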

Hopefully this will be useful to others.

Monday, I announced MediaWiki 1.20.0, affirmed a six-month release cycle, and stated a plan for long-term support for the 1.19 series of MediaWiki. This is the first release that has been managed by a non-WMF employee, and I think it bodes well for third-party users of MediaWiki.

I’m hoping that by working with Debian and other Linux distributors on 1.19 support, we can make MediaWiki more welcoming to new and old users. For example, by looking at some of the older MediaWiki installations recorded on WikiStats, I contacted a few wikis and encouraged them to upgrade to 1.19, especially some that were running ancient MediaWiki.

Long term support is especially important for people who customize MediaWiki for their own use. Of course, I would encourage anyone who adapts MediaWiki like this to use hooks and, ideally, share their modifications with us. But, as Linus Torvalds says, “reality is complicated”.

So, instead of telling users of MediaWiki “If you modify MediaWiki, we can’t help you at all”, I would rather say, “We’re going to support this version for 2 years, but you’re responsible for upgrading to the next release when the time comes.”

This gives people something that they’re able to plan around more easily than something that changes every six months. Using WikiStats, I’ll contact more MediaWiki installations that are out of date, encourage them to upgrade, and let them know how they can be notified of security updates and later long term support updates.

We have a really good tool, but we need to do a better job of supporting users outside the Wikimedia Foundation. This is a start that should encourage the users of MediaWiki to keep their installations up-to-date as well as encourage wider use of MediaWiki.

A week and a half ago, the Platform Engineering Director for Wikimedia clarified how he would like to see volunteers helping with MediaWiki tarball releases.

Instead of doing some other work I had planned for this weekend (Yay, procrastination!), I managed to put together a 1.20 RC tarball and announce it.

If you get a chance to test this, let me know. If you find a bug, file it in bugzilla. Hopefully we’ll have something ready for release in a couple of weeks.

I’ve been checking these older MediaWiki installations and found that some are really good about keeping them spam free, but others seem to have run into trouble. The sites aren’t abandoned, but you can see that they are really struggling against spam.

In order to continue the discussion, I’ve started up a thread on MWUsers. Oh, and Harry Burt reminded me of this survey from two years ago and the results there. I should have more stats available soon.

Posted in wmf.

I just completed a survey of about 2100 wikis running MediaWiki. The initial list came from S23-Wiki’s WikiStats, but I plan to add to it. Google, for instance, says there are “About 3,070 results” for the search string “inurl:Special:Version mediawiki "Magnus Manske, Brion Vibber"”.

My purpose in doing this is to find out what version of MediaWiki these sites are running. To do that, the easiest thing is to use the API. This way, you can just ask a site to tell you about itself and get back useful information.

Of course, sometimes the version of MediaWiki installed is too old (before 1.8) or has the API disabled and, in these cases, I had to get a bit more creative.

Usually, I could find the information on the Special:Version page.
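To give an idea of what asking a site about itself looks like, here is a rough sketch rather than the script I actually used. It assumes the wiki exposes the standard `action=query&meta=siteinfo` module at its `api.php`, whose `general` block includes a `generator` string such as "MediaWiki 1.19.0". The function names and the example.org endpoint are placeholders invented for this illustration.

```javascript
// Hypothetical helper: build the siteinfo query URL for a wiki's api.php.
function siteinfoUrl(apiEndpoint) {
  const params = new URLSearchParams({
    action: 'query',
    meta: 'siteinfo',
    siprop: 'general',
    format: 'json'
  });
  return apiEndpoint + '?' + params.toString();
}

// Hypothetical helper: pull the version out of an already-parsed JSON
// response. Returns null when the response has no usable generator
// string (API disabled, error page, and so on).
function versionFromSiteinfo(response) {
  const generator = response && response.query && response.query.general &&
    response.query.general.generator;
  if (typeof generator !== 'string') {
    return null;
  }
  const match = generator.match(/^MediaWiki\s+(\S+)/);
  return match ? match[1] : null;
}

// The endpoint below is a placeholder, not a real wiki.
console.log(siteinfoUrl('https://example.org/w/api.php'));
```

The sketch leaves out the HTTP request itself. A `null` result is the signal to fall back to other means, such as reading Special:Version.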

And still, there were about 200 wikis that I tried to check that were no longer active, had database problems or some other issue.

Of those that I did get a version from, 692 (36%) were running a version marked “1.20wmf1”, indicating they are run by Wikimedia; 51 (2.7%) were running a version older than 1.10; 1043 (54%) were running a version older than 1.17.3 or 1.18.2, both released over a month ago on March 22 (a more recent version of each was released last week).

I would be tempted to think that those 51 wikis running an especially old version of MediaWiki were just unmaintained or abandoned. A spot check, though, seems to show that this isn’t the case.

For example, I found one that was running a seven-year-old(!) version of MediaWiki on an Ubuntu server whose packages had been updated in the past month and whose wiki pages had been modified in the past couple of days. According to some traffic and search ranking sites, it gets thousands of visitors a day.

When a site owner is good about keeping his packages up-to-date and his pages spam-free (as this one was), it doesn’t seem right to not also provide a way to keep his MediaWiki site up-to-date. But even today, the instructions for installing MediaWiki on Ubuntu push the user pretty hard to use the tarball installation instead of the Ubuntu packages.

We’ve got to do better.

For the first time ever, I replaced a citation needed template in Wikipedia with an actual citation. And, even better, I am pretty sure it is gonna stick.

The fact template was on the Absolute hot article, which I came across while reading The Hour of Our Delight.

This book has turned into a very readable introduction to particle physics for me. In fact, I was reading some parts of it to my son and he asked me some questions (of course) that I couldn’t really answer. “To Wikipedia!” was the natural solution. And I saw that a citation was needed for something I had just read.

Even better, I learned that there is a direct link between Quantum gravity (which was in the book’s quote) and the theory of everything.

So, yeah, I’m recommending The Hour of Our Delight (or in the original French as L’heure de s’enivrer). You should find it and read it if you have any interest in this sort of thing. I’ll probably write a post later about the chapter titled “An Anthropic Principle”.

and what that means in the new world order

If you haven’t heard yet, Encyclopedia Britannica is going to stop printing new editions and focus on their online effort.

My mother is working on a column about this and asked us for any memories of our use of the encyclopedias at home.

For what it’s worth, my son Basil will spend time with his Nook looking at different Wikipedia articles for hours. Encyclopedias encourage this sort of free-form exploration and, with the introduction of hyperlinks, it becomes much more natural.

Research is changing dramatically in the networked age and the New York Times offers us a blog post about ways to research online outside of just Wikipedia.

Wikipedia, alone, offers us a semi-truck sized chunk of information. It isn’t everything, but it is a good start.

From my point of view, a lot of the objections from professional writers to online, crowd-sourced information look like they’re simply using a false appeal to authority.

Our use of authority as a crutch — especially when information is so readily available — can cripple us. If you look at the online Encyclopedia Britannica, you’ll see a marked difference in side-by-side comparisons of articles. For example, compare the articles on the Soviet space shuttle Buran in Encyclopedia Britannica with Wikipedia’s.

The amateurs are winning the race so far.

The resistance I see from professional writers and librarians towards Wikipedia seems to revolve around two issues: Authority and Compensation.

Writers who consider themselves experts (journalists and college professors, for example) think Wikipedia should appreciate their finely crafted prose and respect their authority. They don’t like it when self-appointed deletionists blow away an article that they spent a lot of time on. They don’t like it when what they write isn’t immediately given credibility because they’ve been published in peer-reviewed journals.

And then there are those who see writing as something that they should be paid for. Yes, you should be able to live off of your work. But the value of your work goes down when someone else is willing to provide an acceptable replacement for it just because they enjoy it.

For example, if I’m an amateur ditch digger and dig ditches because I enjoy it, then, as long as I have other means of getting my basic needs taken care of, I could end up taking work away from the professional ditch diggers.

The world is changing, same as it always has.


This week, I’ve announced the start of a WikiProject for the new Wikimedia Bug Squad. The hope is to find people willing to help make bugs easy to reproduce and fix, and to verify the fixes.

After some input from Tomasz Finc, I think it would be good to create ways for other important projects (like Tomasz’ Mobile team) to have a lower barrier to entry for checking, fixing and verifying bugs. I’m looking forward to getting that done.


This week we did our first roll out of MediaWiki 1.19 on some of the smaller project sites. This staged roll out is a great way to find out how you are using the software in ways we didn’t expect and to give you a warning: “Beware! This thing you are doing is going to break!” Of course, I would prefer to avoid that wherever possible, but there are things I can’t control.

So now, I get to say “Beware!”:

Beware!

If you are using document.write() in some JavaScript, whether in a gadget, in your common.js, vector.js, monobook.js or even global.js, you need to change it. In the cases that I saw, people had used a code fragment like the following:

function importAnyScript(lang,family,script) {
document.write('<script type="text/javascript" src="' + 'http://'
        + lang + '.'
        + family + '.org/w/index.php?title='
        + script + '&action=raw&ctype=text/javascript"></script>');
}

This has to be changed to something like the following:

function importAnyScript(lang,family,script) {
mw.loader.load('//' + lang + '.' + family
        + '.org/w/index.php?title='
        + script + '&action=raw&ctype=text/javascript');
}
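If you want to be a bit more defensive, one possible variation (a sketch, not something you are required to do) is to split the URL building into its own function and escape the page title. The helper name `rawScriptUrl` and the page title used in the comment are invented for this example; `mw.loader.load` is the same call as above and only exists on a wiki page.

```javascript
// Hypothetical helper: build a protocol-relative URL for the raw
// JavaScript of a wiki page. Starting with '//' lets the browser reuse
// whichever protocol (http or https) the current page was loaded with.
function rawScriptUrl(lang, family, script) {
  return '//' + lang + '.' + family + '.org/w/index.php?title='
    + encodeURIComponent(script) // escape ':' '/' '&' etc. in the title
    + '&action=raw&ctype=text/javascript';
}

function importAnyScript(lang, family, script) {
  // mw is the global object MediaWiki provides on wiki pages.
  mw.loader.load(rawScriptUrl(lang, family, script));
}

// Example call with a made-up page title:
// importAnyScript('en', 'wikipedia', 'User:Example/foo.js');
```

Keeping the URL building separate also makes it easy to check by hand that the address comes out the way you expect before you hand it to the loader.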