Tuesday, January 3, 2012

String.intern() and concurrency

Have there been times when you needed some form of locking but could not find an easy way to granularize it? For example lets say you have

Map<Person, Double> salariesMap = Collections.synchronizedMap(new HashMap<Person, Double>());//BigDecimal might be a better value type

void giveRaise(final Person person, double raisePercentage){
    synchronized(salariesMap){
        oldSalary = salariesMap.get(person);
        Double newSalary = (1 + raisePercentage/100.0) * oldSalary;
        salariesMap.put(person, newSalary);
    }
}

Our first attempt at making the locking granular could be:

void giveRaise(final Person person, double raisePercentage){
    synchronized(person){
        Double oldSalary = salariesMap.get(person);
        Double newSalary = (1 + raisePercentage/100.0) * oldSalary;
        salariesMap.put(person, newSalary);
    }
}

Unless there is a guarantee that for any two Person instances p1 and p2, p1.equals(p2) implies p1 == p2, our locking on the Person object (the map key) would be wrong.

Such a guarantee can be obtained if all Person instances are looked up from a pool of unique Persons.

But then again what if an instance was obtained from the pool and a reference to it held long after it was removed from the pool? To avoid such a stale reference, would one now lock the pool itself too?

Map<String, Person> persons = Collections.synchronizedMap(new HashMap<String, Person>());

void giveRaise(String pName, double raisePercentage){
    synchronized(persons){
        final Person p = persons.get(pName);
        synchronized(p) {
            Double oldSalary = salariesMap.get(p);
            Double newSalary = (1 + raisePercentage/100.0) * oldSalary;
            salariesMap.put(person, newSalary);
        }
    }
}

But this beats the whole purpose of locking only on one Person key for granularity!

Luckily such a unique pool is maintained by the JVM itself for strings and the unique instances in the pool can be accessed via String.intern()

Thus we can obtain a granular locking as follows:
void giveRaise(final Person p, double raisePercentage){
    synchronized(p.getName().intern()){
        Double oldSalary = salariesMap.get(p);
        Double newSalary = (1 + raisePercentage/100.0) * oldSalary;
        salariesMap.put(p, newSalary);
    }
}

Note that the use of a ConcurrentMap instead of Collections.synchronizedMap in the above might lead to better parallelism in the calls to put above. Also, we could probably use replace() instead of simple put():
ConcurrentMap<Person, Double> salariesMap = new ConcurrentHashMap<Person, Double>();

void giveRaise(final Person person, double raisePercentage){
    synchronized(person.getName().intern()){
        oldSalary = salariesMap.get(person);
        Double newSalary = (1 + raisePercentage/100.0) * oldSalary;
        if(!salariesMap.replace(person, oldSalary, newSalary)) {
            logger.error("Salary was modified concurrently");
        }
    }
}

Also see comments on http://stackoverflow.com/questions/348985/deadlock-on-synchronized-string-intern

If the "key" is likely to collide with the key of another class of objects then synchronize(key.intern()) will introduce an unnecessary mutual exclusion among completely unrelated code. In such a case one might instead use a WeakHashMap<String, WeakReference<Object>> as a mapping from a string to a lock object. Then by having one such map for each class of objects, the accidental mutual exclusion because of shared keys can be avoided.

1 comment:

  1. I think a more generic mechanism which won't rely on the key strings being random enough to not incur accidental mutex requirement between unrelated threads would be to use lock-striping mechanisms, like the one the java.util.ConcurrentHashMap internally does.

    ReplyDelete