The JIRA team has been doing some work to properly support Crowd integration. One of the things we needed to do was to get some of a user's details from Crowd (name, email address) and some from JIRA. The problem is, JIRA uses a numeric ID to access properties, but Crowd just gives us a user name. So, we need to map the user name to an ID.
The first time this happens we need to create a new ID<->Name mapping entry in the database, but it is quite possible, even quite likely in some circumstances that this may happen concurrently. We could simply try and insert them into the database and wait for duplicate key exceptions, but they aren't all that easy to decode across all the databases we support - besides, it feels dirty.
Dushan and Dylan (D&D) were mulling over some solutions with maps of locks and other hairy schemes when they realised that what they really wanted was a map of concurrent operations where each operation will return a value based on the key. They came up with the following interface:
/**
* This will allow you to submit an operation, encapsulated by a
* {@link Callable}, and keyed by an Object <K>, such that the result of the
* Callable will be available to any concurrent callers with the same Object
* key. The result of the Callable is discarded after the operation has
* succeeded.
* <p>
* Essentially, this is a Map whose elements expire very quickly. It is
* particularly useful for situations where you need to get or create the
* result, and you want to prevent concurrent creation of the result.
*/
public interface ConcurrentOperationMap<K, R>
{
/**
* Returns a value that must be computed via an operation. Implementations
* MUST guarantee that no two operations with the same key may be run
* concurrently. It is up to the implementation to decide whether these must
* be run serially, or if the operation is idempotent to borrow the result
* of the currently executing operation.
*
* @param key
* the key, like any map key this should be an immutable object
* that correctly implements {@link #hashCode()} and
* {@link #equals(Object)}
* @param operation
* is the operation to execute whose result will be accessible to
* any concurrent callers with the same key.
* @return result of the operation
* @throws ExecutionException
* if the callable generated a checked exception, otherwise a
* runtime exception or error will be thrown
*/
R runOperation(K key, Callable<R> operation) throws ExecutionException;
}
You submit the operation (a Callable) that does the heavy lifting, and a key. If another thread comes along the implementation should either block and wait before running (if the operation is not idempotent) or block and borrow the result of the current operation if it is.
For our purposes the operation is idempotent so D&D implemented one that borrows the result:
public class BorrowingConcurrentOperationMap<K, R> implements
ConcurrentOperationMap<K, R>
{
private final ConcurrentMap<K, FutureTask<R>> map
= new ConcurrentHashMap<K, FutureTask<R>>();
public R runOperation(K key, Callable<R> operation)
throws ExecutionException
{
FutureTask<R> task = map.get(key);
while (task == null)
{
map.putIfAbsent(key, new FutureTask<R>(operation));
task = map.get(key);
}
R result = runAndGet(task);
map.remove(key);
return result;
}
R runAndGet(FutureTask<R> task) throws ExecutionException
{
try
{
// Concurrent calls to run do not matter as run will be a
// no-op if already running
task.run();
return task.get();
}
catch (VariousExceptions handleTheseNicely)
{
...
}
}
}
Any concurrent calls to runOperation() will get the current FutureTask. If we are unlucky and someone puts one in in between, we fail on the map.putIfAbsent(...) and if we are unlucky and someone removes it in between that failure and the get we loop and put a new FutureTask into the map.
The whole thing effectively operates as a kind of high eviction rate cache. It could potentially slash the number of expensive lookup database queries when they get hot-spotted.
Its pretty simple - but kind of elegant - I like it!
PS. Particular thumbs to Doug Lea et al. for the awesome util.concurrent stuff and the backport that means we can use it in 1.4. Also if you haven't read Java Concurrency in Practice yet, do - its awesome.
Tags: concurrency


4 Comment(s)
You should also check out ReferenceCache. By comparison, it also enables you to keep weak or soft references to keys and/or values, and it doesn't require you to pass in a Callable with every lookup.
By Bob Lee at February 13, 2007 10:19 PM
Cool, I had seen that package before - there's some very nice code in there - but hadn't looked into the ReferenceCache class. Do you have a 1.4 backport?
The code shown here was generified for presentation, we are stuck supporting WebSphere and therefore 1.4...
By Jed Wesley-Smith at February 13, 2007 11:14 PM
No backport. I've been using 1.5 since it came out. I think we use Retroweaver on Struts though.
By Bob Lee at February 14, 2007 10:26 AM
Looked at the ReferenceCache in more detail, the Soft/Weak reference stuff of the ReferenceMap it inherits from is awesomely cool.
That being said, the implementation of the atomic guarantees of the AbstractReferenceCache is similar in function what we achieve to the ConcurrentOperationMap, but quite a bit more complex and IMHO not as nearly as readable as the ConcurrentOperationMap. Partly this is due to the reimplementation of FutureTask that doesn't encapsulate the actual operation (the Callable for us).
So, our ConcurrentOperationMap is kind of equivalent to a small part of the ReferenceCache implementation.
By Jed Wesley-Smith at February 15, 2007 12:06 AM