Tuesday, August 14, 2012

iOS Multithreading: Thread Safety in iOS Applications

In this post I will illustrate what it means when the Objective-C code of your iOS application is not thread safe. First, I will cover some basics that are helpful for understanding thread safety. With these basics at hand, we will run some experiments to explore the nature of code that is not thread safe. Finally, I will outline a tool that we use at orderbird to analyze our code for potential thread safety issues.

This post does not cover tools or strategies for multi-threaded programming but will point you to sources that cover this topic.

1) Threading Basics


When your app is started, iOS creates a new process and allocates memory for this app process. In simplified terms, the memory of an app process consists of three blocks (a more detailed explanation of the memory layout of C programs can be found here):
  • The program memory stores the machine instructions your Objective-C code has been compiled to. The Instruction Pointer (IP) indicates which instruction is executed next.
  • The heap stores objects that are created with [[… alloc] init].
  • The stack is the memory area that is used for method invocations. Methods store things like their parameters and local variables on the stack.

By default, an app process consists of one thread: the main thread. If your iOS app uses multiple threads, all threads share the program memory and the heap, but each thread has its own instruction pointer and stack. This means that each thread has its own program flow, and if a method is called on one thread, its parameters and local variables cannot be seen by any other thread. Objects that are created on the heap, however, can be seen, accessed, and manipulated by all threads.
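The difference between per-thread locals and shared heap objects can be sketched as follows. This is only an illustration, assuming ARC and Grand Central Dispatch; the function SumOnThreads and the queue label are hypothetical names that do not appear in the experiment below. Each worker keeps its own localSum on its own stack, so the workers cannot disturb each other's sums. The results array lives on the heap and is visible to all of them, which is why access to it is funneled through a serial queue.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper: computes 1 + 2 + ... + upTo on `threadCount` workers
// and returns one NSNumber per worker.
static NSArray *SumOnThreads(NSUInteger threadCount, NSUInteger upTo) {
    // Heap object: shared by every worker, so writes to it are serialized.
    NSMutableArray *results = [NSMutableArray array];
    dispatch_queue_t resultsQueue =
        dispatch_queue_create("sketch.results", DISPATCH_QUEUE_SERIAL);

    dispatch_group_t group = dispatch_group_create();
    dispatch_queue_t workers =
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);

    for (NSUInteger t = 0; t < threadCount; ++t) {
        dispatch_group_async(group, workers, ^{
            // Local variable: lives on the stack of the executing thread
            // and is invisible to the other workers.
            NSUInteger localSum = 0;
            for (NSUInteger i = 1; i <= upTo; ++i) {
                localSum += i;
            }
            // Only one block at a time runs on the serial queue.
            dispatch_sync(resultsQueue, ^{
                [results addObject:@(localSum)];
            });
        });
    }

    // Block the caller until all workers have finished.
    dispatch_group_wait(group, DISPATCH_TIME_FOREVER);
    return results;
}
```

Every entry of the returned array holds the same sum, no matter how the workers were interleaved, because no worker ever touched another worker's localSum.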

2) Experiment


Now, let us start our little experiment. Open Xcode and create a new project (choose the template "Empty Application"). Create a class named "FooClass" as depicted in the following:

 @interface FooClass : NSObject  
 @property (nonatomic, assign) NSUInteger value;  
 - (void)doIt;  
 @end  
   
 @implementation FooClass  
 @synthesize value;  
   
 - (void)doIt {  
      self.value = 0;  
      for (NSUInteger i = 0; i < 100000; ++i) {  
           self.value = i;  
      }  
      // NSUInteger is cast to unsigned long so that %lu is correct on every architecture  
      NSLog(@"after execution: %lu (%@)", (unsigned long)self.value, [NSThread currentThread]);  
 }  
 @end  

This class has an integer property named value that the doIt method first resets to 0 and then sets to every value from 0 to 99999 in a loop. At the end of doIt, self.value is logged to the console together with the thread on which doIt is executed. To execute doIt, create a method called _startExperiment in your project's AppDelegate and call it from application:didFinishLaunchingWithOptions:.

 - (void)_startExperiment {  
      FooClass *foo = [[FooClass alloc] init];  
      [foo doIt];  
      [foo release];  
 }  
   
 - (BOOL)application:(UIApplication *)application didFinishLaunchingWithOptions:(NSDictionary *)launchOptions {  
      // …  
      [self _startExperiment];  
      return YES;       
 }  

If we run our experiment by starting the iOS simulator (cmd + R), the _startExperiment method is called, an instance of FooClass is created on the heap and doIt is called on this instance. As expected, the console log (shift + cmd + c) shows 99999 for self.value. Nothing special so far: doIt is invoked on the main thread and it behaves as expected.

3) Thread Safety


Let us execute doIt in parallel on multiple threads:

 - (void)_startExperiment {  
      FooClass *foo = [[FooClass alloc] init];  
      dispatch_queue_t queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0);  
   
      for (int i = 0; i < 4; ++i) {  
           dispatch_async(queue, ^{  
                [foo doIt];  
           });  
      }  
      [foo release];  
 }  

If you execute this multi-threaded version of _startExperiment, you will get an output like this (the concrete values will differ from the output that I am posting):

 after execution: 19851 (<NSThread: 0x6b29bd0>{name = (null), num = 3})  
 after execution: 91396 (<NSThread: 0x6b298f0>{name = (null), num = 4})  
 after execution: 99999 (<NSThread: 0x6a288a0>{name = (null), num = 5})  
 after execution: 99999 (<NSThread: 0x6b2a6f0>{name = (null), num = 6})  

Ouch … self.value is not 99999 on all threads, as we would have expected. (If your run happens to produce a correct result, with self.value being 99999 on all threads, re-execute it until an incorrect result shows up. With four threads writing to the same property, this usually does not take many attempts.)

Why do not all threads produce a correct result? Well, because our code is not thread safe.

Your code is thread safe if it behaves in the same way in a multi-threaded environment as it does in a single-threaded environment.

As we observed above, the method doIt is not thread safe because it does not produce the same result in a multi-threaded environment as it does in a single-threaded environment.

But what is the reason for this behavior? As stated in the beginning, each thread has its own instruction pointer (IP) and stack, but all threads share the complete heap. Since the instance of FooClass is allocated on the heap and thus shared among all threads, the threads interfere with each other while executing doIt. Let's take a closer look at this interference. We consider the execution of the doIt method on two threads, Thread1 and Thread2:

The instruction pointer (IP) of Thread1 points to the logging of self.value, but the logging has not been executed yet. At this point self.value is set to 99999. Now Thread2 continues executing doIt. Its IP points to the assignment inside the for loop. Let us assume that Thread2 sets self.value to 91396 in the for loop. Oops. When Thread1 continues execution, self.value is no longer 99999 but 91396. Thread1 logs self.value and 91396 is printed. Since doIt does not prevent threads from interfering with each other while executing it, its implementation is not thread safe.

One possible way to make doIt thread safe is to synchronize its body using the @synchronized compiler directive:

 - (void)doIt {  
   @synchronized(self) {  
     self.value = 0;  
     for (NSUInteger i = 0; i < 100000; ++i) {  
       self.value = i;  
     }  
     NSLog(@"after execution: %lu (%@)", (unsigned long)self.value, [NSThread currentThread]);  
   }  
 }  

Using the @synchronized directive, each thread gets exclusive access to self in doIt. Note that the threads cannot run in parallel anymore while executing doIt because the @synchronized directive covers the complete method body.

Another way of synchronizing access to shared state is to use Grand Central Dispatch (GCD).
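A minimal sketch of the GCD approach, assuming ARC: instead of locking self, all work that touches the shared value is submitted to one private serial queue, which runs only one block at a time. The class name SerialFoo and the queue label are hypothetical, and unlike the original doIt this variant returns the final value instead of logging it, so that the result is easy to check.

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical variant of FooClass that guards its state with a serial queue.
@interface SerialFoo : NSObject
- (NSUInteger)doIt;
@end

@implementation SerialFoo {
    NSUInteger _value;        // shared state, only touched on _queue
    dispatch_queue_t _queue;  // private serial queue that acts as the lock
}

- (instancetype)init {
    if ((self = [super init])) {
        _queue = dispatch_queue_create("sketch.SerialFoo", DISPATCH_QUEUE_SERIAL);
    }
    return self;
}

- (NSUInteger)doIt {
    __block NSUInteger result = 0;
    // dispatch_sync blocks the calling thread until the block has run.
    // Blocks on a serial queue never overlap, so no other caller can
    // change _value between the loop and the read below.
    dispatch_sync(_queue, ^{
        self->_value = 0;
        for (NSUInteger i = 0; i < 100000; ++i) {
            self->_value = i;
        }
        result = self->_value;
    });
    return result;
}

@end
```

As with @synchronized, the callers no longer run the body of doIt in parallel. One caveat of this design: doIt must not be called from a block that is itself running on the same queue, because dispatch_sync would then wait for itself and deadlock.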

4) How to identify code that is not thread safe


The experiment that I used to explain thread safety is an oversimplification of reality. In reality, you have already written your code, made some pieces run on background threads and from time to time your app is not behaving as expected. It freezes. It crashes. And you are not able to reproduce these issues.

The main cause of threading issues is shared or global state. Multiple objects access a global variable, share the same object on the heap, or write to the same persistent store. In our little experiment the state that is shared among multiple threads is self, or more precisely self.value. Identifying shared state is quite simple in our experiment, but in a real-world scenario it is quite hard to go through all your classes and identify the methods that manipulate shared or global state.

To make things easier, I have written a small tool that identifies methods that are called from multiple threads. Once I know which methods are called from multiple threads, I take a closer look at them. If such a method manipulates shared or global state, I devise a synchronization strategy for the state that is manipulated by this and other methods. In the following I will outline the core idea of this tool.

The tool consists of four classes: the instances of MultiThreadingAnalysis record calls to methods on a specific thread, the classes ThreadingTrace and MethodExecution represent the result of conducting an analysis with MultiThreadingAnalysis, and the class MultiThreadingAnalysisHook is used to hook into an object and trace all method calls to this object.

The class MultiThreadingAnalysis offers two methods:
  • recordCallToMethod:ofClass:onThread: which records on which thread a method has been called.
  • threadingTraceOfLastApplicationRun which returns the recorded trace and should be called after the analysis has finished.

 @interface MultiThreadingAnalysis : NSObject  
   
      - (void)recordCallToMethod:(NSString*)methodName  
                ofClass:(NSString*)className  
               onThread:(NSString*)threadID;  
            
      - (ThreadingTrace*) threadingTraceOfLastApplicationRun;  
            
 @end  

The result of a multi-threading analysis is an instance of ThreadingTrace. It consists of a set of MethodExecution instances each of which represents the execution of a method on a specific thread:

 /*  
  * An instance of this class captures  
  * which methods of which classes have been  
  * called on which threads.  
  */  
 @interface ThreadingTrace : NSObject  
      /*  
       * Set of MethodExecution  
       */  
      @property (nonatomic, readonly) NSSet *methodExecutions;  
      - (void)addMethodExecution:(MethodExecution*)methodExec;  
 @end  
   
 /*  
  * An instance of this class represents a call  
  * to a method of a specific class on a thread  
  * with a specific threadID.  
  */  
 @interface MethodExecution : NSObject  
      @property (nonatomic, copy) NSString *methodName;  
      @property (nonatomic, copy) NSString *className;  
      @property (nonatomic, copy) NSString *threadID;  
 @end  
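The headers above leave the implementations open. The following is a minimal sketch of how they could be filled in, assuming ARC; it is not the original orderbird code. Three choices in it are my assumptions: two MethodExecution instances count as equal when class, method, and thread ID match (so that the set holds one entry per combination), recordCallToMethod:ofClass:onThread: guards the trace with @synchronized because it is called from several threads by design, and threadingTraceOfLastApplicationRun simply returns the trace held in memory.

```objective-c
#import <Foundation/Foundation.h>

@interface MethodExecution : NSObject
@property (nonatomic, copy) NSString *methodName;
@property (nonatomic, copy) NSString *className;
@property (nonatomic, copy) NSString *threadID;
@end

@implementation MethodExecution
// Assumption: equality by (class, method, thread), so an NSSet keeps one
// entry per combination. Do not mutate an instance after adding it to a set.
- (BOOL)isEqual:(id)other {
    if (![other isKindOfClass:[MethodExecution class]]) return NO;
    MethodExecution *o = other;
    return [self.methodName isEqualToString:o.methodName]
        && [self.className isEqualToString:o.className]
        && [self.threadID isEqualToString:o.threadID];
}
- (NSUInteger)hash {
    return self.methodName.hash ^ self.className.hash ^ self.threadID.hash;
}
@end

@interface ThreadingTrace : NSObject
@property (nonatomic, readonly) NSSet *methodExecutions;
- (void)addMethodExecution:(MethodExecution *)methodExec;
@end

@implementation ThreadingTrace {
    NSMutableSet *_executions;
}
- (instancetype)init {
    if ((self = [super init])) {
        _executions = [NSMutableSet set];
    }
    return self;
}
- (NSSet *)methodExecutions {
    return [_executions copy];  // immutable snapshot for the caller
}
- (void)addMethodExecution:(MethodExecution *)methodExec {
    [_executions addObject:methodExec];
}
@end

@interface MultiThreadingAnalysis : NSObject
- (void)recordCallToMethod:(NSString *)methodName
                   ofClass:(NSString *)className
                  onThread:(NSString *)threadID;
- (ThreadingTrace *)threadingTraceOfLastApplicationRun;
@end

@implementation MultiThreadingAnalysis {
    ThreadingTrace *_trace;
}
- (instancetype)init {
    if ((self = [super init])) {
        _trace = [[ThreadingTrace alloc] init];
    }
    return self;
}
- (void)recordCallToMethod:(NSString *)methodName
                   ofClass:(NSString *)className
                  onThread:(NSString *)threadID {
    MethodExecution *exec = [[MethodExecution alloc] init];
    exec.methodName = methodName;
    exec.className = className;
    exec.threadID = threadID;
    // The recorder is itself shared state, so writes to the trace are locked.
    @synchronized (self) {
        [_trace addMethodExecution:exec];
    }
}
- (ThreadingTrace *)threadingTraceOfLastApplicationRun {
    // Assumption: returns the in-memory trace; call it only after recording
    // has finished, because reads are not synchronized here.
    return _trace;
}
@end
```

Recording the same method twice on the same thread leaves a single entry in the set; a second thread calling that method adds another one.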

In order to make the recording of method calls as convenient as possible I am using NSProxy to hook into method calls of an object. The class MultiThreadingAnalysisHook inherits from NSProxy and intercepts all calls to a target object in its forwardInvocation: method. Before forwarding a method call to the target object it records the call by using an instance of MultiThreadingAnalysis.

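 @interface MultiThreadingAnalysisHook : NSProxy  
      @property (nonatomic, strong) id target;  
      @property (nonatomic, strong) MultiThreadingAnalysis *analysis;  
 @end  
   
 @implementation MultiThreadingAnalysisHook  
   
 // NSProxy needs the target's method signature to build the NSInvocation  
 // that is passed to forwardInvocation:.  
 - (NSMethodSignature *)methodSignatureForSelector:(SEL)aSelector {  
   return [self.target methodSignatureForSelector:aSelector];  
 }  
   
 - (void)forwardInvocation:(NSInvocation *)anInvocation {  
   // Record the call before forwarding it; the address of the NSThread object serves as thread ID.  
   [self.analysis recordCallToMethod:NSStringFromSelector([anInvocation selector])  
                             ofClass:NSStringFromClass([self.target class])  
                            onThread:[NSString stringWithFormat:@"%p", [NSThread currentThread]]];  
     
   [anInvocation invokeWithTarget:self.target];  
 }  
 @end  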

With the MultiThreadingAnalysisHook at hand, you can hook the multi-threading analysis into your code as follows. Create a private method _withThreadingAnalysis in the class that you want to analyze. This method creates an instance of MultiThreadingAnalysisHook and sets its target to self. In your designated initializer, return the result of invoking _withThreadingAnalysis. The instance of MultiThreadingAnalysisHook transparently wraps self and records all calls that arrive from outside the class, without the need to change any code outside of the class you are analyzing.

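 @implementation YourClass  
   
 - (id)init {  
      //... do init stuff here  
      return [self _withThreadingAnalysis];  
 }  
   
 - (id)_withThreadingAnalysis {  
   // NSProxy does not implement init, so alloc alone creates the hook.  
   MultiThreadingAnalysisHook *hook = [MultiThreadingAnalysisHook alloc];  
   hook.target = self;  
   // hook.analysis must also be set to the MultiThreadingAnalysis instance that collects the trace.  
   return hook;  
 }  
 @end  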

After you have installed the MultiThreadingAnalysis via the MultiThreadingAnalysisHook, you can call threadingTraceOfLastApplicationRun on MultiThreadingAnalysis to get the trace and analyze or visualize it. A simple way of visualizing a trace is to produce a text file from it that looks like this:

begin threading analysis for class FooClass
   method doIt (_MultiThreadAccess_)
   method init (_SingleThreadAccess_)  

If a method is accessed from multiple threads (has the tag _MultiThreadAccess_), you can take a closer look at this method to check if it manipulates shared or global state and implement a suitable thread synchronization strategy for the manipulated state.
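One way such a report could be produced is sketched below, assuming ARC. The function ThreadingReport is a hypothetical helper, not part of the tool's published interface, and the MethodExecution class is repeated in a minimal form only to keep the example self-contained. The idea is to collect the distinct thread IDs per class and method, and to tag a method as _MultiThreadAccess_ as soon as more than one thread ID has been seen for it.

```objective-c
#import <Foundation/Foundation.h>

// Minimal stand-in for the MethodExecution class shown earlier.
@interface MethodExecution : NSObject
@property (nonatomic, copy) NSString *methodName;
@property (nonatomic, copy) NSString *className;
@property (nonatomic, copy) NSString *threadID;
@end

@implementation MethodExecution
@end

// Hypothetical helper: renders a set of MethodExecution objects as text.
static NSString *ThreadingReport(NSSet *methodExecutions) {
    // className -> (methodName -> set of thread IDs)
    NSMutableDictionary *byClass = [NSMutableDictionary dictionary];
    for (MethodExecution *exec in methodExecutions) {
        NSMutableDictionary *byMethod = byClass[exec.className];
        if (!byMethod) {
            byMethod = [NSMutableDictionary dictionary];
            byClass[exec.className] = byMethod;
        }
        NSMutableSet *threads = byMethod[exec.methodName];
        if (!threads) {
            threads = [NSMutableSet set];
            byMethod[exec.methodName] = threads;
        }
        [threads addObject:exec.threadID];
    }

    // Sort class and method names so the report is stable between runs.
    NSMutableString *report = [NSMutableString string];
    NSArray *classNames = [[byClass allKeys] sortedArrayUsingSelector:@selector(compare:)];
    for (NSString *className in classNames) {
        [report appendFormat:@"begin threading analysis for class %@\n", className];
        NSDictionary *byMethod = byClass[className];
        NSArray *methodNames = [[byMethod allKeys] sortedArrayUsingSelector:@selector(compare:)];
        for (NSString *methodName in methodNames) {
            NSSet *threads = byMethod[methodName];
            NSString *tag = threads.count > 1 ? @"_MultiThreadAccess_"
                                              : @"_SingleThreadAccess_";
            [report appendFormat:@"   method %@ (%@)\n", methodName, tag];
        }
    }
    return report;
}
```

Passing the methodExecutions set of a ThreadingTrace to this function yields one block per class, which can then be written to a file or logged.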

5) Wrap up


Your code is thread safe if it has the same behavior in a multi-threaded environment as it has in a single-threaded one. The reason for code not to be thread safe is the manipulation of shared or global state by multiple threads. Shared or global state can be a globally available persistent store, a singleton that is accessed from multiple objects, or a global variable. The identification of methods that are accessed from multiple threads can be helpful to discover unsynchronized access to global or shared state and to devise a suitable synchronization strategy. The identification of such methods can be automated by leveraging the interception facilities of NSProxy, the recording of method calls, and the visualization of the recorded method calls.
