Recent posts from Jeff Turner

Jeff Turner

A lot of technical support work is basically pattern-matching and memorizing. You see a stacktrace, vaguely remember having seen it before, and hunt around until you find the earlier case. Here is a typical example, with MySQL breaking with a weird "Data too long for column" error:

issue.png

To automate the process of "remembering and looking up" stuff like this, I have hacked together a Greasemonkey script, supportlink.user.js, that highlights and links any text that matches a set of patterns. With this installed, the error is highlighted and linked to a previous instance of the problem, with a rollover describing its significance:

issue_supportlinked.png

We can also link interesting text in log files (e.g. stacktraces). Here we highlight that this person is using GCJ, which often breaks JIRA. Newcomers on support might not know this, and it is so inconspicuous in the logs that even those who do know could easily miss it:

logs.png

How it works

The script is based on Jesse Ruderman's autolink script. After the page loads, the script traverses the HTML DOM looking for text matching certain regular expressions. Any matching text is highlighted and hyperlinked.
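The matching step can be sketched without any DOM code. The helper below is a hypothetical illustration, not the script's actual implementation: linkSegments and its return shape are names I made up, and it assumes each filter has the regexp, href, name and level fields used in filters.js. It takes the text of one text node and splits it into plain and linked pieces, which a DOM-walking loop could then turn into text nodes and anchor elements.

```javascript
// Hypothetical helper: split a string into plain and linked segments.
// `filters` entries are assumed to have regexp, href, name and level fields.
function linkSegments(text, filters) {
  const hits = [];
  for (const filter of filters) {
    // Work on a fresh global copy so a shared regexp's lastIndex is never disturbed.
    const flags = filter.regexp.flags.includes("g")
      ? filter.regexp.flags
      : filter.regexp.flags + "g";
    const re = new RegExp(filter.regexp.source, flags);
    let match;
    while ((match = re.exec(text)) !== null) {
      if (match[0].length === 0) {
        re.lastIndex++; // skip empty matches so the loop always advances
        continue;
      }
      hits.push({
        start: match.index,
        end: match.index + match[0].length,
        href: filter.href(match),
        title: filter.name,
        level: filter.level,
      });
    }
  }

  // Earliest match wins; any later match overlapping it is dropped.
  hits.sort((a, b) => a.start - b.start);
  const segments = [];
  let pos = 0;
  for (const hit of hits) {
    if (hit.start < pos) continue;
    if (hit.start > pos) segments.push({ text: text.slice(pos, hit.start) });
    segments.push({
      text: text.slice(hit.start, hit.end),
      href: hit.href,
      title: hit.title,
      level: hit.level,
    });
    pos = hit.end;
  }
  if (pos < text.length) segments.push({ text: text.slice(pos) });
  return segments;
}
```

Segments without an href stay as plain text; the others would become highlighted links, with title as the rollover text.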

Jesse had a static list of regexps and links. My modification was to refresh the list of regexps and links from a central location, in the background (asynchronously), over HTTP (using GM_xmlhttpRequest). The central regexp file is in Subversion at:

http://svn.atlassian.com/svn/public/atlassian/support/supportlink/filters.js

The Greasemonkey script is quite well-behaved. It only starts autolinking text once the page is fully loaded, so it never slows page loads. It fetches the latest regexps (filters.js) in the background and saves them locally. Once filters.js has been fetched, you can use the script offline (network failures are logged but don't affect operation). Incidentally, this background loading means that after updating filters.js you need to reload a test page _twice_ to see an effect: the first reload fetches the new file, and the second one uses it.
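The "use the saved copy now, refresh for next time" behaviour can be sketched roughly as follows. This is an illustration under assumptions rather than the script's real code: loadFilters, the gm wrapper object, the log callback and the storage key are all hypothetical. GM_xmlhttpRequest, GM_getValue and GM_setValue are the standard Greasemonkey functions, passed in here so the logic can be tried outside a browser.

```javascript
const FILTERS_URL =
  "http://svn.atlassian.com/svn/public/atlassian/support/supportlink/filters.js";
const CACHE_KEY = "cachedFilters"; // hypothetical storage key

// Hypothetical sketch. In a user script you would call it as
// loadFilters({ GM_xmlhttpRequest, GM_getValue, GM_setValue }, GM_log).
function loadFilters(gm, log) {
  // 1. Use whatever an earlier page load saved (empty string on the first run).
  const cached = gm.GM_getValue(CACHE_KEY, "");

  // 2. Start an asynchronous refresh. Its result only helps the NEXT page load,
  //    which is why a changed filters.js needs two reloads to show up.
  gm.GM_xmlhttpRequest({
    method: "GET",
    url: FILTERS_URL,
    onload: function (response) {
      if (response.status === 200) {
        gm.GM_setValue(CACHE_KEY, response.responseText);
      } else {
        log("filters.js refresh returned HTTP " + response.status);
      }
    },
    onerror: function () {
      // Offline or server down: log it and carry on with the saved copy.
      log("filters.js refresh failed; using saved copy");
    },
  });

  return cached;
}
```

Because the function returns the saved text immediately, the page never waits on the network.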

filters.js consists of entries like:

  {
    level: "error",
    name: "MySQL doesn't like control characters in inserted text",
    regexp: /Data too long for column /ig,
    href: function(match) { return "https://support.atlassian.com/browse/JSP-8147"; }
  },

The fields are:

  • level: Can be "error", "warn" or "info" (or unset). The link styling will be different for each of these. You can see different stylings on the test HTML page.
  • name: The rollover text
  • regexp: The regular expression to match against text on the page
  • href: A function that takes the regexp match and returns the URL to link the matched text to
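Since href is a function, the link can be built from the matched text rather than hard-coded. The entry below is a made-up illustration and is not in filters.js. It assumes the script passes href the match array produced by the regexp (as Jesse's autolink filters do), so match[1] is the first capture group.

```javascript
// Hypothetical entry for illustration only; it is not part of filters.js.
const exampleFilter = {
  level: "info",
  name: "Link support issue keys to the tracker",
  regexp: /\b(JSP-\d+)\b/g,
  // match[0] is the whole matched text, match[1] the first capture group.
  href: function (match) {
    return "https://support.atlassian.com/browse/" + match[1];
  },
};

// Simulate what the script would do with one piece of page text.
const match = exampleFilter.regexp.exec("Duplicate of JSP-8147, see logs");
console.log(exampleFilter.href(match));
// prints "https://support.atlassian.com/browse/JSP-8147"
```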

Installation

  1. Install Greasemonkey
  2. Install the script by clicking on the link http://svn.atlassian.com/svn/public/atlassian/support/supportlink/supportlink.user.js
  3. Go to the test page and make sure it works.

When installing the script, please note its restrictions. The script will only run on certain URLs. By default it runs on various Atlassian sites, plus http://localhost/* and http://svn.atlassian.com/* for testing purposes. If you want to use the script in your own setting, you will need to customize the enabled locations, as well as the filters.js download location, and preferably the icon URLs.
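As a rough guide to what that customization involves: Greasemonkey decides where a script runs from the @include lines in the metadata block at the top of the file. The fragment below is a sketch with placeholder values. The example.com hosts and the FILTERS_URL and ICON_BASE_URL names are hypothetical, so check supportlink.user.js for the names it really uses.

```javascript
// ==UserScript==
// @name        supportlink (customized copy)
// @namespace   http://example.com/supportlink
// @description Highlights and links text matching known support patterns
// @include     http://localhost/*
// @include     https://support.example.com/*
// ==/UserScript==

// Hypothetical names and locations: point these at wherever you host your copies.
const FILTERS_URL = "https://intranet.example.com/supportlink/filters.js";
const ICON_BASE_URL = "https://intranet.example.com/supportlink/icons/";
```

The same include list can also be edited after installation in Greasemonkey's script management dialog.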

Use at Atlassian

At Atlassian, the support staff have been using this script for about 2 months. In this time the filters.js knowledgebase has grown to 70 patterns. Most patterns identify obscure stacktraces, application and error logs, linking them to the relevant support case or bug. Some print warnings or useful advice - for example, if "1.5.0_06" is found, it is linked to a reminder that this JDK version is prone to crashes.

Based on this experience, I think it is now fair to say this script has proven a genuinely useful tool, rather than just a gimmick. It acts like an extra set of eyes, scanning a page or logs (when loaded in Firefox), pointing out anything interesting and remembering obscure facts for you.

Secondly, because the list of regexps is continually refreshed from the central filters.js knowledgebase, it has become a means of information sharing. Atlassian has support staff in Sydney, Malaysia and San Francisco, and communication between geographical areas is vital. When a new product release comes out, new bugs tend to bite multiple customers in a short period. Now as soon as the bug is diagnosed, we record its signature regexp in filters.js with a link to a bug report, and commit this to Subversion. One screen refresh later, a link pops up in the browsers of other support engineers dealing with similar cases. In this way, knowledge is disseminated between groups.

Improvements

One annoyance is that Subversion's web interface (mod_dav_svn) doesn't send a Last-Modified HTTP header, so conditional (If-Modified-Since) requests are not possible and filters.js is downloaded in full on each page load. If a regular HTTP server is used instead, the script will only download filters.js when necessary.
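For reference, a conditional download looks roughly like this. It is a sketch under assumptions, not the script's code: refreshIfModified, the gm wrapper and both storage keys are hypothetical, and it assumes a server that sends Last-Modified and honours If-Modified-Since. The headers option and the responseHeaders string are part of the standard GM_xmlhttpRequest interface.

```javascript
// Hypothetical sketch: re-download filters.js only when the server says it changed.
function refreshIfModified(gm, url) {
  const lastModified = gm.GM_getValue("filtersLastModified", ""); // hypothetical key
  const headers = lastModified ? { "If-Modified-Since": lastModified } : {};

  gm.GM_xmlhttpRequest({
    method: "GET",
    url: url,
    headers: headers,
    onload: function (response) {
      // 304 Not Modified (or anything unexpected): keep the saved copy.
      if (response.status !== 200) return;
      gm.GM_setValue("cachedFilters", response.responseText); // hypothetical key
      // responseHeaders is one raw string; pull out Last-Modified if present.
      const found = /^last-modified:\s*(.+)$/im.exec(response.responseHeaders || "");
      if (found) gm.GM_setValue("filtersLastModified", found[1].trim());
    },
  });
}
```

A 304 response carries no body, so an unchanged filters.js costs only a few headers per page load.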

Another is that, as Jesse notes in the script, there is a Firefox bug which causes long textarea fields to be truncated when the script runs. For instance, if editing particularly long pages in Confluence, check that the text isn't truncated before you update. This will apparently be fixed in Firefox 3.

Overall, the biggest improvement needed is an easy way to update the knowledgebase with new patterns. Currently one has to check out a local copy of filters.js from Subversion, add an entry and commit the change. The regexp is sometimes tricky to get right, especially if the pattern spans more than one line. Using Subversion and understanding regexps should be well within the grasp of most support engineers, but it still feels a bit manual. Ideally, one could simply highlight the signature text in Firefox, right-click, and be prompted for the other details. Firefox would then submit the entry to a webapp hosting the knowledgebase. Implementing all this would require quite a bit of work and extra complication, so for now I'm happy with manual updating.
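On the multi-line point, the usual trap is that "." never matches a newline in a JavaScript regexp. The log excerpt and patterns below are invented for illustration. Whether a pattern can span lines in practice also depends on both lines sitting in the same text node (as they normally do inside a pre element), which is an assumption here.

```javascript
// Hypothetical log excerpt: the exception and its cause are on different lines.
const log = [
  "java.sql.SQLException: Connection refused",
  "    at com.example.Pool.open(Pool.java:42)",
  "Caused by: java.net.ConnectException: Connection timed out",
].join("\n");

// "." stops at the end of the line, so this pattern cannot reach the cause.
const singleLine = /SQLException.*Caused by: java\.net\.ConnectException/;

// [\s\S] matches any character, newlines included; *? keeps the match short.
const multiLine = /SQLException[\s\S]*?Caused by: java\.net\.ConnectException/;

console.log(singleLine.test(log)); // false
console.log(multiLine.test(log));  // true
```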

I hope this helps other teams doing web-based technical support. If you make any improvements or find this useful, please let us know in the comments.

Jeff Turner

For those of you maintaining a public JIRA instance who are suffering from the recent bout of comment spam:

I've created a Confluence space for collaboration on solutions to this, as well as a Subversion repository for utilities:

http://svn.atlassian.com/svn/public/contrib/jira/spamfighting/

Currently this contains:

  • Shell scripts for detecting comment or trackback spam as it happens, and notifying someone. These are intended to be run from a cron job. SQL variants for PostgreSQL and MySQL are included (other translations welcome).
  • A growing blacklist of spammer IPs. All the spam I've seen so far is from three Romanian ISPs, so tracking down the IP and complaining to ISPs is worthwhile. This is where you can help. If you are spammed, please take the time to track down the IP in your logs (instructions provided), register a Subversion username and commit the evidence to Subversion. The more people who do this, the more evidence ISPs have to act on. Failing that, JIRA administrators at least have a list of IPs to block at the firewall.
  • cleancommentspam.jsp, introduced in a previous post as a way of detecting and deleting spam permanently. If you come up with any improvements or derivatives (a JSP that deletes spammed issue histories would be nice), please contribute back so others can benefit.

To keep up-to-date on future developments, please subscribe to the space feed (atom or RSS).

Jeff Turner

XSLT using too much memory? Try STX

Jeff Turner talks about JIRA November 20, 2006 3:40 PM

The JIRA data anonymizer is a little tool which uses XSLT to anonymize potentially sensitive text in XML backups. Unfortunately it uses up lots of memory, as XSLT works on a DOM tree and so has to load the whole XML document into memory.

We are solving this properly in 3.7 with a built-in anonymizer, but since a customer needed a solution now, I rewrote the anonymizer in STX, an XSLT-like language that is built on a streaming parser (SAX, I imagine) instead of DOM. The STX template looks like this:

<?xml version="1.0"?>

<!--
JIRA Data anonymizer

Written in STX (http://stx.sourceforge.net/), an XSLT-like language which uses
relatively little memory.

Copyright (c) 2005-2006 Atlassian
$Revision: 1.3 $
-->
<stx:transform xmlns:stx="http://stx.sourceforge.net/2002/ns" version="1.0" pass-through="none">

  <stx:template match="Action/@body | */@description | Issue/@environment | Issue/@summary | NotificationInstance/@email | ChangeItem/@newstring | ChangeItem/@oldstring | FileAttachment/@filename | NotificationScheme/@name | PermissionScheme/@name | Resolution/@name | CustomFieldValue/@textvalue | OSPropertyText/@value | Project/@url">
    <stx:attribute name="{local-name(.)}"><stx:value-of select="translate(., '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')"/></stx:attribute>
  </stx:template>

  <stx:template match="Action/body/text() | */description/text() | Issue/environment/text() | Issue/summary/text() | NotificationInstance/email/text() | ChangeItem/newstring/text() | ChangeItem/oldstring/text() | FileAttachment/filename/text() | NotificationScheme/*/text() | PermissionScheme/*/text() | Resolution/*/text() | CustomFieldValue/textvalue/text() | OSPropertyText/value/text()">
    <stx:value-of select="translate(., '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ', 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx')"/>
  </stx:template>

  <!-- Default rule - copy everything across -->
  <stx:template match="node()|@*" priority="-1">
    <stx:copy>
      <stx:process-attributes />
      <stx:process-children />
    </stx:copy>
  </stx:template>

</stx:transform>

This parses arbitrarily large XML files in roughly constant memory (as a test, I transformed a 280MB file with -Xmx32m in about 2 minutes). If you ever need to do a simple text transformation on a large XML file, STX is a very useful little tool.
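To get a feel for what the translate() call in the template does to each selected value, here is an equivalent in JavaScript. It is purely illustrative: anonymize is a made-up name, the sample string is invented, and the equivalence assumes the two strings passed to translate() are the same length, so that every digit and ASCII letter maps to "x".

```javascript
// Illustrative equivalent of translate(., '0-9a-zA-Z...', 'xxx...'):
// every ASCII letter or digit becomes "x"; everything else is left alone.
function anonymize(text) {
  return text.replace(/[0-9a-zA-Z]/g, "x");
}

console.log(anonymize("Fix login for bob@example.com (build 42)"));
// prints "xxx xxxxx xxx xxx@xxxxxxx.xxx (xxxxx xx)"
```

Punctuation, whitespace and string lengths survive, so the anonymized backup keeps its shape.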