In this tutorial, we will learn how to use threading in Python programs to make them more responsive and to share data between multiple threads.
We also have a YouTube video on creating and sharing data between Python threads for the absolute beginner.
Contents
- Introduction to Python Threading
- How threading works in Python
- Source Codes
- Creating a thread in Python
- Using Thread.join() in Python threading
- Passing Arguments to Python Threads
- Returning Values from Python Thread function
- Using Locks in Python
- Controlling Python Threads with Events
- Using event.wait() in threading
- Setting timeouts in threading.Event
- Using python events to exit from an infinite loop
- Producer-consumer pattern using queue
- Intro to Python Queue data type
- Data sharing between threads using Queue
Introduction to Python Threading
Python threading is different from threading in other languages like Java, C# or Go, where threads can execute simultaneously on a multicore processor. Due to a design limitation of the Python interpreter (CPython), only one thread can execute Python bytecode at a time.
Python threads can be used to speed up operations that are I/O bound with little CPU interaction, for example writing to a file, downloading data from websites, or reading from serial ports.
The threading library can also be used to improve the responsiveness of a GUI, where a long-running task can be spun off into a separate thread, different from the GUI thread.
Python threading will not provide any significant speedup for operations that are CPU bound, for example calculations that tie up the processor for a significant amount of time with little I/O interaction. To speed up CPU-bound operations, it is recommended to use the multiprocessing module.
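The I/O-bound case can be illustrated with a minimal sketch. It assumes that time.sleep() is an acceptable stand-in for a blocking I/O call; the function name fake_io_task and the 0.5 second delay are made up for illustration. Because a sleeping thread does not hold up the others, three waits overlap instead of adding up.

```python
import threading
import time

def fake_io_task(seconds):
    # time.sleep() stands in for a blocking I/O call (file, network, serial port)
    time.sleep(seconds)

start = time.perf_counter()

# Three threads, each "waiting on I/O" for 0.5 seconds
threads = [threading.Thread(target=fake_io_task, args=(0.5,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for all three to finish

elapsed = time.perf_counter() - start
print(f'Three 0.5 s waits finished in about {elapsed:.2f} s')
```

Run one after another, the three waits would take about 1.5 seconds. With threads the total should stay close to 0.5 seconds, though the exact figure depends on the machine.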
How threading works in Python
When you run Python code on the command line by typing
python simple_script.py
on a Linux or Windows system, the CPython interpreter will read the Python code, convert it into bytecode and execute it inside the CPython process as shown below.
The process will have a single thread, known as the main thread, provided your script is not using the threading library to create additional threads.
Now suppose you are using the threading library to create multiple threads.
#For eg
#partial code
import threading
t1 = threading.Thread(target = function_name)
t1.start()
#create two more threads
In that case, the main thread will spawn extra threads inside your CPython process as shown below. All the threads inside the process have access to the same global variables.
Please note that CPython can only execute one thread at a time due to the presence of the GIL (Global Interpreter Lock).
Source Codes
All the Python Threading codes can be downloaded as zip from here
Browse Python Threading codes Github Repo
Creating a thread in Python
To create a thread, we need to import the threading module, which is part of the standard library. The code below shows the minimum needed to create a thread in Python.
import threading #import the threading module
def do_something():
    #some statements here
    pass
t1 = threading.Thread(target = do_something) #create the thread t1
t1.start() #start the thread
Here we first import the threading module.
Then we define a function which we will run in the thread. Here pass is used as a placeholder.
def do_something():
    #some statements here
    pass
Then we create a thread t1 using the threading.Thread() constructor and pass it the function which we want to run as a thread, here do_something (without parentheses).
t1 = threading.Thread(target = do_something) #create the thread t1
Then we call the start() method of the thread t1 to run it, as shown below.
t1.start() #start the thread
This will run the function on a separate thread and execute what is inside do_something().
Using Thread.join() in Python threading
In the previous example we learned how to create a simple thread and how to start it. Here we will give the thread function something to do and see how it affects the output.
We have modified the do_something() function with some print statements and a small delay to simulate doing some work.
# _1_create_python_threads_no_join.py
import time
import threading #required for threading
def do_something():
    print(f'\nEntered do_something() ')
    time.sleep(2)
    print('\nDone Sleeping in thread\n')
print(f'\nStart of Main Thread ')
t1 = threading.Thread(target = do_something) #create the thread t1
t1.start() #start the thread
print('\nEnd of Main Thread\n+---------------------------+')
We start the thread t1 and print the message "End of Main Thread".
The output of running the above program is shown below.
As you can see from the above, the main thread printed its last message before thread t1 finished.
The execution of the above code looked like this.
The t1.start() statement starts the new thread and returns immediately, as it is non-blocking.
After that, the main thread runs the print statement and displays the text "End of Main Thread" on the terminal. At the same time, thread t1 is executing concurrently in the background.
After 2 seconds, thread t1 prints "Done Sleeping in thread" to the terminal.
Here the two messages are displayed out of order. We want the message from thread t1 to be displayed first and then the message from the main thread. For that, the main thread should wait until thread t1 has finished executing.
You can use the Thread.join() method to make the main thread wait until a thread has completed its work. To wait for several threads, call join() on each of them.
# partial code,use the full code from github repo
# _2_create_python_threads.py
...
...
print(f'\nStart of Main Thread ')
t1 = threading.Thread(target = do_something) #create the thread t1
t1.start() #start the threads
t1.join()
print('\nEnd of Main Thread\n+---------------------------+')
Here we have added t1.join() to our code so that the main thread will wait for t1 to finish executing.
Here is the same code running after adding the join() statement.
In our case (above), it doesn't matter much whether the text was printed before or after the main thread's message.
Now consider a hypothetical program where join() is essential. We have a program that downloads three webpages from the internet and joins them together into a single page.
Here we can create three threads that download the three webpages independently and concurrently, each taking a varying amount of time depending on network conditions.
The main thread will wait until all three threads have finished downloading, since it calls the join() method of each thread.
Once all three downloads are finished, the data is given to the create_single_webpage() function to create a single webpage.
If the join() method is not used, the main thread will immediately run the create_single_webpage() function before the downloads are finished, resulting in an error.
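The hypothetical program described above could be sketched roughly as follows. No real network access is performed: download_page() is an invented helper that sleeps to simulate a download, and the page names and delays are placeholders. Only create_single_webpage() takes its name from the description above, and its body here is an assumption.

```python
import threading
import time

pages = {}  # shared dictionary: page name -> downloaded text

def download_page(name, delay):
    # Hypothetical stand-in for a real download; sleep simulates network delay
    time.sleep(delay)
    pages[name] = f'<p>content of {name}</p>'

def create_single_webpage(parts):
    # Join the downloaded pieces in a fixed (sorted by name) order
    return '\n'.join(parts[name] for name in sorted(parts))

threads = [
    threading.Thread(target=download_page, args=('page1', 0.3)),
    threading.Thread(target=download_page, args=('page2', 0.1)),
    threading.Thread(target=download_page, args=('page3', 0.2)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()  # main thread waits here until every download is done

combined = create_single_webpage(pages)
print(combined)
```

If the join() loop is removed, create_single_webpage() would most likely run while pages is still empty or only partly filled.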
Passing Arguments to Python Threads
import time
import threading #required for threading
def do_something(myarg1):
    print(myarg1)
    time.sleep(2)
t1 = threading.Thread(target = do_something,args = (1,)) #create the thread t1,
                                                         #pass a single argument
t1.start() #start the thread
t1.join()
In the above code we pass a single argument to the function do_something(). For that we use the args argument of the threading.Thread() constructor. Here we are passing the integer 1 to do_something().
t1 = threading.Thread(target = do_something,args = (1,)) #note the comma after 1
Do not forget to add a comma after 1. args is expected to be a tuple (or list), and (1) without the comma is just the integer 1, not a one-element tuple.
You can also pass multiple arguments to your function.
def do_something(myarg1,myarg2,myarg3):
    print(myarg1,myarg2,myarg3)
    time.sleep(2)
t1 = threading.Thread(target = do_something,args = (1,2,3)) #three arguments are passed here 1,2,3
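Besides args, threading.Thread() also accepts a kwargs dictionary for keyword arguments. The sketch below is only an illustration: the label parameter and the results list are invented for this example and are not part of the original code.

```python
import threading

results = []  # collects what the thread function received

def do_something(myarg1, myarg2, label='default'):
    results.append((myarg1, myarg2, label))

# Positional arguments go in args (a tuple), keyword arguments in kwargs (a dict)
t1 = threading.Thread(target=do_something,
                      args=(1, 2),
                      kwargs={'label': 'from kwargs'})
t1.start()
t1.join()
print(results)
```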
Returning Values from Python Thread
After you have done some work inside the thread, like calculating values or downloading data from the internet, you may need to transfer the data from the worker thread to the main thread or between threads.
The threading.Thread class does not provide any method to return data from a thread, so we have to resort to indirect ways to get useful data out of it.
The two main ways of doing it are
- Extend the threading.Thread class and store the data in an instance variable
- Use Global Variables
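The first approach in the list above can be sketched as follows. This is a minimal illustration and not code from the repository: the class name AddThread and its result attribute are assumptions made for this example.

```python
import threading

class AddThread(threading.Thread):
    """Thread subclass that keeps its result in an instance variable."""

    def __init__(self, no1, no2):
        super().__init__()   # always initialise the base Thread class
        self.no1 = no1
        self.no2 = no2
        self.result = None   # filled in by run()

    def run(self):
        # run() is what start() executes in the new thread
        self.result = self.no1 + self.no2

t1 = AddThread(5, 10)
t1.start()
t1.join()                    # read the result only after the thread has finished
print(t1.result)
```

Reading t1.result before join() returns may still give None, so the join() call is part of the pattern.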
Using Global Values
Using a global variable to return data from a thread is the simplest and most straightforward way to do it, provided your application is quite simple.
In the example below, we will add two numbers in a thread and return the sum to the main thread using a global variable.
import time
import threading #required for threading
def add_two_numbers(no1,no2):
    global global_sum
    global_sum = no1 + no2
    #time.sleep(2)
global_sum = 0 # Global variable used to return data from thread
print(f'Sum Initial Value -> {global_sum}')
t1 = threading.Thread(target = add_two_numbers,args=(5,10)) # create the thread t1,add 5 and 10
t1.start() # start the thread
t1.join()
print(f'Sum After Calculation-> {global_sum}')
Output of the above code
Sum Initial Value -> 0
Sum After Calculation-> 15
Here we declare a global variable named global_sum to store the sum.
The function add_two_numbers(no1,no2) calculates the sum (5+10=15) inside the thread and stores it in the global variable global_sum.
The main thread then prints the global variable global_sum (15).
You can also use a list or dictionary to send information between two threads. Here we append a string to a list inside the thread, accessing the global variable directly from the thread.
#Returning Values from Thread functions using a List
#Here we are directly accessing the global variable from the thread.
import time
import threading
def append_list_thread():
    global_list.append('Appended in append_list_thread()')
global_list = [] #Create empty global list
global_list.append('Appended in Main Thread')
print(global_list) #before calling the thread
t1 = threading.Thread(target = append_list_thread)
t1.start()
t1.join()
print(global_list) #after calling the thread
Output of the above code
['Appended in Main Thread']
['Appended in Main Thread', 'Appended in append_list_thread()']
In the above code we are accessing the shared variable directly. You can also pass global_list as an argument through threading.Thread(target, args) as shown below.
#partial code
def append_list_thread(list_to_be_appended):
    list_to_be_appended.append('Appended in append_list_thread()')
global_list = [] # Create empty global list
t1 = threading.Thread(target = append_list_thread,args = (global_list,) ) #global_list as an argument, instead of accessing it directly
Synchronizing shared variables using Locks in Python
In the above example, only one thread was accessing the shared variable at a time, in a clearly defined manner. Now consider a situation in which a single shared variable is accessed by two threads at the same time. One thread tries to increment the shared variable while the other tries to decrement it. This may corrupt the data in the variable.
The section of code that accesses a shared resource used by two or more threads is called a critical section.
To prevent the threads from accessing the shared variable at the same time, we can use a mutual exclusion lock, or mutex.
Mutexes (locks) in Python are available through the threading.Lock class.
A lock is in one of two states, “locked” or “unlocked”. It is created in the unlocked state. It has two basic methods, acquire() and release().
mylock = threading.Lock()
mylock.acquire()
<critical section which we want to protect>
mylock.release()
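The increment/decrement scenario described earlier can be sketched with acquire() and release() as follows. The names counter, counter_lock and change_counter are invented for this illustration. The try/finally is a common precaution, assumed here, so that the lock is released even if the critical section raises an exception.

```python
import threading

counter = 0                      # shared variable modified by both threads
counter_lock = threading.Lock()  # protects every access to counter

def change_counter(delta, times):
    global counter
    for _ in range(times):
        counter_lock.acquire()
        try:
            counter += delta     # critical section: read, modify, write
        finally:
            counter_lock.release()

t1 = threading.Thread(target=change_counter, args=(1, 50_000))   # increments
t2 = threading.Thread(target=change_counter, args=(-1, 50_000))  # decrements
t1.start()
t2.start()
t1.join()
t2.join()
print(counter)  # equal numbers of increments and decrements cancel out
```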
You can also use a threading.Lock with a context manager (the with statement) as shown below. Here the lock is released automatically when the block exits, even if an exception is raised.
mylock = threading.Lock()
with mylock:
    <critical section which we want to protect>
Now we will write a Python script that appends characters to a shared list from two separate threads, t1 and t2. The code is shown below.
The function update_list_A(var_list) will append A's to the list.
The function update_list_B(var_list) will append B's to the list.
The two functions are run in separate threads to populate the list with A's and B's in the following order: [A,A,A,A,B,B,B].
We create a lock instance using
lock = threading.Lock() #create a lock
which is used to protect the critical sections using lock.acquire() and lock.release().
#partial code,check github for full code
import ...
def update_list_A(var_list): #function to write A's to the List
    print('update_list_A thread called ')
    lock.acquire()
    for _ in range(10):
        var_list.append('A')
        time.sleep(0.10)
    lock.release()
def update_list_B(var_list): #function to write B's to the List
    print('update_list_B thread called ')
    lock.acquire()
    for _ in range(10):
        var_list.append('B')
        time.sleep(0.10)
    lock.release()
lock = threading.Lock() #create a lock
shared_list =[] #Shared variable to be modified in threads
t1 = threading.Thread(target = update_list_A, args = (shared_list,))
t2 = threading.Thread(target = update_list_B, args = (shared_list,))
t1.start()
t2.start()
...
If we run the above code we will get an output as shown below
update_list_A thread called
update_list_B thread called
['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B', 'B']
End of Main Thread
The above code uses locks to protect the critical sections of the append process.
Now let's comment out the locks (lock.acquire() and lock.release()) and run the code.
update_list_A thread called
update_list_B thread called
['A', 'B', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'A', 'B']
End of Main Thread
Another run:
update_list_A thread called
update_list_B thread called
['A', 'B', 'A', 'B', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'A']
End of Main Thread
Every time you run the code without locks, the output may differ.
Running the Python code without threading.Lock
After you have started the Python script, the main thread will create two threads, t1 (update_list_A) and t2 (update_list_B), that will update the shared_list.
The OS scheduler will usually run thread t1 first, which appends A's to the shared_list variable in a loop.
for _ in range(10):
    var_list.append('A')
    time.sleep(0.10)
After the first 'A' is appended to the list, the thread will block (go to sleep) for 100 milliseconds. When that happens, the scheduler will try to run other threads that are ready, for example thread t2, since t1 is blocked.
If thread t2 is run by the OS scheduler, it will append a 'B' to the shared_list variable and call time.sleep(), which blocks thread t2. The OS may then decide to run another thread that is ready, probably thread t1.
This repeats again and again until the loops are finished.
Running the Python code with threading.Lock
Here we are running the Python code with the lock.
The main thread will create two threads, t1 and t2, as usual.
The OS will typically run thread t1 first, which will acquire the lock using lock.acquire() to protect the for loop section. The thread will append an 'A' to the shared_list and then go to sleep for some time.
lock.acquire()
for _ in range(10):
    var_list.append('A')
    time.sleep(0.10)
lock.release()
Since t1 is blocked due to the sleep function, the OS may decide to run another thread, for example t2.
t2 will try to acquire the lock, which is still held by t1, so t2 blocks inside lock.acquire() and cannot write to shared_list. Since t2 cannot make progress, the OS switches to another thread, for example t1.
t1 will complete another iteration of the for loop without releasing the lock.
Once the for loop is completed, t1 releases the lock using lock.release(), and t2 will be able to acquire the lock and write 'B' to the shared_list variable inside its loop.
Starting and Stopping Python Threads with Events
In Python threading, an "event" is a synchronization primitive used to coordinate between multiple threads. It allows one or more threads to wait until a certain condition is set or cleared by other threads.
The threading.Event class in Python provides a simple way to communicate between threads using a flag that can be set or cleared. Here's a brief overview of its main methods:
Event() - This creates a new event object with an initial flag value of False.
set() - Sets the flag of the event to True, allowing threads waiting for this event to proceed.
clear() - Resets the flag of the event to False.
is_set() - Returns True if the flag is set, False otherwise.
wait(timeout=None) - Blocks the current thread until the event is set or, if a timeout is given, until the timeout expires.
An event can be created using the following code
my_event = threading.Event() # create an Event object
By default my_event is not set. You can set the event by calling
my_event.set() #set the event
This can be used to alert other threads that the status of my_event has changed, so that the threads can take action depending on the change, for example start or stop some activity.
You can also unset the event using
my_event.clear()
The status of the event can be checked using the is_set() method, which returns True if the flag is set and False otherwise.
Using event.wait() in Python threading
Here we will use the event.wait() method to wait in a thread (t1) until an event flag is set in the main thread, as shown in the image below.
In this case thread t1 is waiting for the main thread to complete some action, like waiting for data from an I/O device, which thread t1 requires. Here thread t1 is started, but it waits for the data to be available.
Once the main thread has completed its operation, i.e. it has received the data, it calls my_event.set(), which wakes up the my_event.wait() call in thread t1.
The my_event.wait() call then returns and allows the rest of the code in t1 to run.
The code is shown below.
#Using event.wait() in Python threading
import ...
def function_t():
    print('Entered t1 thread func')
    my_event.wait(timeout = 10) # wait for the event to be set in main thread
                                # timeout of 10 seconds
    print('Event is set,so this line gets printed')
my_event = threading.Event() # create an Event object
t1 = threading.Thread(target = function_t) # create t1 thread
t1.start() # start t1 thread
print('will set the event in 5 seconds')
time.sleep(5) #wait 5 seconds
my_event.set() #set the event after 5 seconds
t1.join()
Here thread t1 is started by the main thread, but after its first print statement nothing happens in t1 because it is blocked in the wait() call.
After 5 seconds, the main thread executes the my_event.set() line and the flag is set.
This unblocks my_event.wait(timeout = 10) in thread t1, and the print statement after it executes.
Output of the above code
Entered t1 thread func
will set the event in 5 seconds
Event is set,so this line gets printed
[Finished in 5.4s]
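Here my_event.wait(timeout = 10) specifies a timeout of 10 seconds, so wait() returns after at most 10 seconds even if the flag was never set. wait() returns True if the flag was set and False if it timed out, so check the return value if the two cases need to be handled differently.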
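The effect of the timeout can be seen in a small sketch. The short delays and the variable names first_result and second_result are chosen only for this illustration. threading.Timer, also part of the standard threading module, is used here merely as a convenient way to call my_event.set() a little later from another thread.

```python
import threading

my_event = threading.Event()

# Nobody sets the event, so wait() gives up after the timeout and returns False
first_result = my_event.wait(timeout=0.2)
print(f'First wait returned {first_result}')

# Call my_event.set() from another thread after 0.1 seconds
setter = threading.Timer(0.1, my_event.set)
setter.start()

# This wait() returns True as soon as set() is called, well before the 5 s timeout
second_result = my_event.wait(timeout=5)
setter.join()
print(f'Second wait returned {second_result}')
```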
Using python events to exit from an infinite loop thread
Here we have a thread t1 running an infinite loop (infinite_loop_func()) that reads from a serial port every second.
You can check my_event.is_set() inside the loop to make thread t1 quit when the flag is set in the main thread.
#partial code
#
def infinite_loop_func():
    print('Thread-t1:Start the loop')
    while True:
        if my_event.is_set(): #exit the loop once the event is set
            break
        print('Thread-t1:Read from Serial Port')
        time.sleep(1)
    print(f'Thread-t1: my_event.is_set() = {my_event.is_set()}')
my_event = threading.Event() # create an Event object
t1 = threading.Thread(target = infinite_loop_func) # create t1 thread
t1.start()
time.sleep(5) #wait 5 seconds
my_event.set() #set the event after 5 seconds
Here my_event.set() happens after 5 seconds, which makes the thread t1 quit.
my_event.set() can also be triggered by a button event if you are using a GUI library.
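A variation on the loop above, offered only as a sketch, replaces time.sleep() with the event's own wait(timeout). The thread then wakes up as soon as the event is set instead of finishing its sleep first. The names stop_event, polling_loop and readings are invented here, and the list append stands in for the serial port read.

```python
import threading
import time

stop_event = threading.Event()
readings = []  # stand-in for data read from the serial port

def polling_loop():
    # wait() returns False when the timeout expires and True once the event is set,
    # so the loop body runs once per timeout period until stop_event.set() is called
    while not stop_event.wait(timeout=0.05):
        readings.append('read')

t1 = threading.Thread(target=polling_loop)
t1.start()
time.sleep(0.3)    # let the loop run for a short while
stop_event.set()   # ask the thread to stop
t1.join()
print(f'Loop ran {len(readings)} times before stopping')
```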
Output of the code
Thread-t1:Start the loop
Thread-t1:Read from Serial Port
Thread-t1:Read from Serial Port
Thread-t1:Read from Serial Port
Thread-t1:Read from Serial Port
Thread-t1:Read from Serial Port
[Event Set in Main Thread]
Thread-t1: my_event.is_set() = True
End of the Main Thread
[Finished in 5.2s]
Implementing producer-consumer pattern using queue in Python
The producer-consumer problem is a classic synchronization problem which involves two types of processes, producers and consumers, that share a common, fixed-size buffer or queue.
Producers generate data items and place them into the buffer, while consumers remove items from the buffer and process them.
There may be one or more producers and one or more consumer tasks operating on the same shared buffer concurrently. Here we will be using a single producer, single consumer pattern.
A producer thread will generate a range of values from 0 to 9 that are then sent to the consumer thread using a thread-safe queue.
Introduction to queue data structure in Python
A queue is a First In First Out (FIFO) data structure that can be used to exchange data between threads in a thread-safe manner. The Queue class can be found in the queue module of the Python standard library.
import queue
q = queue.Queue()
Please note that the queue module also implements other types of queues, like
- queue.LifoQueue - Last In First Out queue
- queue.PriorityQueue - Priority queue, the lowest-valued entry is retrieved first
Here we will be using only queue.Queue()
You can add objects to the queue using put() method
q.put(1)
q.put(2)
q.put(3)
and remove objects using get() method.
The objects are removed in the same order as they are added, First in First Out (FIFO).
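The FIFO behaviour of put() and get() can be checked with a few lines. This is a single-threaded sketch, and the variable name removed is made up for the illustration.

```python
import queue

q = queue.Queue()

# Add three objects to the queue
q.put(1)
q.put(2)
q.put(3)

# Remove them again; get() hands them back in insertion order
removed = [q.get(), q.get(), q.get()]
print(removed)     # [1, 2, 3]
print(q.empty())   # True, nothing is left in the queue
```

Note that get() blocks by default when the queue is empty, which is what lets a consumer thread simply wait for the next item.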
Data sharing between producer and consumer threads using a queue
Here we have two threads named producer and consumer.
The producer thread, producer(shared_buffer), will generate a series of numbers that are sent to the consumer thread, consumer(shared_buffer), through a queue named shared_buffer.
Creation of the queue
shared_buffer = queue.Queue() #create a thread safe queue
Producer function
def producer(shared_buffer):
    for i in range(10):
        shared_buffer.put(i)
        time.sleep(1)
    shared_buffer.put(None)
A series of numbers 0-9 is generated by the for loop and added to the FIFO queue using shared_buffer.put(i). After that, a None is sent to indicate the end of transmission.
Consumer function
def consumer(shared_buffer):
    while True:
        rxed_data = shared_buffer.get()
        if rxed_data is None:
            break
        print(rxed_data)
The consumer function takes data from shared_buffer using shared_buffer.get(), in the same order in which it was sent. A check for the reception of None is done inside the infinite loop using an if condition.
Partial code can be found below .
import ...
def producer(shared_buffer):
    for i in range(10):
        shared_buffer.put(i)
        #time.sleep(1)
    shared_buffer.put(None) #None marks the end of transmission
def consumer(shared_buffer):
    while True:
        rxed_data = shared_buffer.get()
        if rxed_data is None:
            break
        print(rxed_data)
shared_buffer = queue.Queue() #create a thread safe queue
t1 = threading.Thread(target = producer,args = (shared_buffer,))
t2 = threading.Thread(target = consumer,args = (shared_buffer,))
t1.start()
t2.start()
t1.join() #wait for the producer to finish
t2.join() #wait for the consumer to finish
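The producer-consumer description earlier mentions a fixed-size buffer, while queue.Queue() without arguments is unbounded. A bounded variant can be sketched with the maxsize argument, as below. The buffer size of 3, the received list and the small processing delay are assumptions made for this illustration.

```python
import queue
import threading
import time

shared_buffer = queue.Queue(maxsize=3)  # at most 3 items may wait in the buffer
received = []                           # items collected by the consumer

def producer(buf):
    for i in range(10):
        buf.put(i)        # blocks while the buffer is full
    buf.put(None)         # sentinel: end of transmission

def consumer(buf):
    while True:
        item = buf.get()  # blocks while the buffer is empty
        if item is None:
            break
        received.append(item)
        time.sleep(0.01)  # simulate slow processing so the buffer fills up

t1 = threading.Thread(target=producer, args=(shared_buffer,))
t2 = threading.Thread(target=consumer, args=(shared_buffer,))
t1.start()
t2.start()
t1.join()
t2.join()
print(received)
```

With a bounded queue, a fast producer is slowed down to the pace of the consumer instead of filling memory with unprocessed items.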