With all the work put into Prism, it is sometimes easy to get complacent about your own code. While flying on autopilot through a section of code, I failed to think before I typed, and the result in my application was far from what I expected.
I had created a service module used by most of my other modules. Within that module I have a process that performs a set of tasks on parallel threads. Because the parallel threads all read a shared collection for the parameters of their
tasks, I lock an object while they run so that no other thread can update the collection until the parallel threads have finished. I do not release the lock until all of the threads are complete.
Each thread publishes an event that is consumed by the other modules in the app. Depending upon the state and the module, the handler calls back into another method on the same service that is running the parallel tasks. That other method must
also lock the object before it can perform its work.
Well, I had subscribed to the event in the modules on the publisher's thread (ThreadOption.PublisherThread) using a strong reference. Without digging into the EventAggregator's source code, I am going to assume that it does not create a new thread for this option (as the
option's name implies). This means that my event handler method is called on the same thread as the parallel task that published the event. As you have probably guessed, I managed to create a deadlock. The call to the service method from the
event handler waits for the lock on the object to be released, so the parallel task thread that published the event never finishes. Because that parallel task thread never finishes,
the parent thread, which created the parallel task threads and holds the lock, waits forever to join them and therefore never releases the lock.
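To make the cycle easier to see, here is a minimal sketch of the same shape in Python rather than C# and Prism. Everything in it is a hypothetical stand-in: `EventBus` only imitates an aggregator that calls handlers synchronously on the publisher's thread, and `Service`, `run_parallel_tasks`, and `update` are invented names, not my actual code. I also assume a timeout on the lock so the sketch terminates; the real code blocked forever.

```python
import threading


class EventBus:
    """Hypothetical stand-in for an event aggregator that invokes
    handlers synchronously on the publisher's thread."""

    def __init__(self):
        self._handlers = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def publish(self, payload):
        # No new thread here: the handler runs on the caller's thread.
        for handler in list(self._handlers):
            handler(payload)


class Service:
    """Hypothetical service that guards a shared collection with one lock."""

    def __init__(self, bus):
        self._bus = bus
        self._lock = threading.RLock()
        self.blocked = []  # payloads whose callback could not get the lock

    def run_parallel_tasks(self, count):
        # The parent thread holds the lock until every worker has joined.
        with self._lock:
            workers = [
                threading.Thread(target=self._task, args=(i,))
                for i in range(count)
            ]
            for worker in workers:
                worker.start()
            for worker in workers:
                worker.join()

    def _task(self, index):
        # Each worker publishes an event on its own thread.
        self._bus.publish(index)

    def update(self, payload, timeout=0.2):
        # Assumption for the sketch: a timeout stands in for the
        # wait that never ended in the real application.
        if not self._lock.acquire(timeout=timeout):
            self.blocked.append(payload)
            return False
        try:
            return True
        finally:
            self._lock.release()


bus = EventBus()
service = Service(bus)
# A module's handler calls straight back into the service.
bus.subscribe(lambda payload: service.update(payload))
service.run_parallel_tasks(3)
print(sorted(service.blocked))  # [0, 1, 2]
```

Every worker ends up stuck in `update` while the parent sits in `join`, which is the circular wait described above. Without the timeout, neither side would ever move.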
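So instead, I now subscribe to the event with the background thread option (ThreadOption.BackgroundThread), so the handler no longer runs on the parallel task thread and that thread can finish. Since I really did not need the strong reference, I am no longer using it either.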
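The effect of moving the handler off the publisher's thread can be sketched the same way. Again this is Python with hypothetical names, not Prism: `BackgroundEventBus` only imitates background-thread delivery by handing each handler call to a thread pool, and `drain` exists purely so the sketch can wait for the handlers before printing.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BackgroundEventBus:
    """Hypothetical stand-in for background-thread delivery:
    publish() queues the handler call and returns immediately."""

    def __init__(self):
        self._handlers = []
        self._pool = ThreadPoolExecutor(max_workers=4)
        self._pending = []

    def subscribe(self, handler):
        self._handlers.append(handler)

    def publish(self, payload):
        for handler in list(self._handlers):
            self._pending.append(self._pool.submit(handler, payload))

    def drain(self):
        # Wait for every queued handler, then stop the pool.
        for future in list(self._pending):
            future.result()
        self._pool.shutdown()


class Service:
    """Hypothetical service that guards a shared collection with one lock."""

    def __init__(self, bus):
        self._bus = bus
        self._lock = threading.RLock()
        self.updated = []

    def run_parallel_tasks(self, count):
        with self._lock:
            workers = [
                threading.Thread(target=self._bus.publish, args=(i,))
                for i in range(count)
            ]
            for worker in workers:
                worker.start()
            # Workers return as soon as they have published,
            # so this join completes and the lock is released.
            for worker in workers:
                worker.join()

    def update(self, payload):
        # Blocks until the parent releases the lock, then proceeds.
        with self._lock:
            self.updated.append(payload)


bus = BackgroundEventBus()
service = Service(bus)
bus.subscribe(service.update)
service.run_parallel_tasks(3)
bus.drain()
print(sorted(service.updated))  # [0, 1, 2]
```

The handlers still wait for the lock, but they wait on their own threads, so nothing the parent is joining depends on them. The trade-off is that the callbacks now run after the parallel tasks have finished rather than during them, which was acceptable in my case.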
A few factors that helped create this deadlock were:
- The disconnect between the modules and the service made it easy to overlook the logic involved in the processes being performed
- Coding different parts at different times created tunnel vision on only the part being worked on instead of the entire flow of logic
- Failure to document the processes before coding, which would have identified these situations before they became a problem
If this was highlighted in any Prism documentation, I have yet to find it. Hopefully this lesson helps others avoid some frustration, not only in this EventAggregator scenario but in multithreading in general.