- By making a reference variable point to None.
# Python - Making an object eligible for garbage collection
# Creating a class
class A:
    pass
# Creating an object of A class, referenced by ob
ob = A()
# Setting the reference variable ob to None, which removes the only reference to the object it was referring to.
ob = None
In the program above, we first created an object of class A, referenced by the reference variable ob, and then made that same reference variable point to None. Now let's see a pictorial representation of the result when we execute this program.
As you can see in the diagram above, making the reference variable ob point to None leaves the object (small blue circle) in memory (big red circle) without any reference variable referring to it (hence the dotted line), which makes this object eligible for garbage collection.
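The effect can also be observed in code. The sketch below watches the object through a weak reference from the standard weakref module, which does not keep the object alive. It assumes CPython, where an object is reclaimed as soon as its last reference goes away; other interpreters may reclaim it later.

```python
import weakref

class A:
    pass

ob = A()
# A weak reference lets us observe the object without keeping it alive.
watcher = weakref.ref(ob)
print(watcher() is ob)    # True: the object is still referenced by ob

# Rebinding ob to None removes the only strong reference to the object.
ob = None
print(watcher() is None)  # True on CPython: the object has been reclaimed
```

Once the object is gone, calling the weak reference returns None, which is why the second check succeeds.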
Now, let us see the other way to make an object eligible for garbage collection.
- By reassigning a reference variable to another object.
# Creating a class
class A:
    pass
# Creating an object of A class, referenced by ob1
ob1 = A()
# Creating another object of A class, referenced by ob2
ob2 = A()
# Making ob1 refer to the same object referred to by ob2
ob1 = ob2
- In the example, we have made two objects of class A, each referenced by a separate reference variable.
- Next, we have made the first reference variable ob1 point to the second object (also referenced by ob2).
- Doing this makes the first object, which was earlier referenced by ob1, eligible for garbage collection, because no reference variable is pointing to it anymore. Please refer to the diagram below for ease of understanding.
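The same reassignment can be checked in code. This is a minimal sketch, assuming CPython's immediate reclamation; the names first and second are illustrative weak references from the standard weakref module and are not part of the original program.

```python
import weakref

class A:
    pass

ob1 = A()
ob2 = A()
first = weakref.ref(ob1)    # observes the first object without keeping it alive
second = weakref.ref(ob2)   # observes the second object

# ob1 now refers to the second object; the first object loses its only reference.
ob1 = ob2

print(first() is None)      # True on CPython: the first object has been reclaimed
print(second() is ob1)      # True: the second object is still alive
```

The second object now has two reference variables pointing to it, so it stays in memory until both are rebound or go out of scope.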
Automatic Garbage Collection
The garbage collector is fully controlled by the Python interpreter, which
uses the following two strategies to decide when to run garbage collection automatically:
- By reference counting
An object can be referenced by multiple reference variables. In this strategy, the interpreter keeps a count of the references to each object,
and when an object is no longer referenced by any reference variable, i.e. its reference count drops to zero, the object is reclaimed automatically.
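Reference counts can be inspected with sys.getrefcount() from the standard library. The sketch below is CPython-specific and compares counts against a baseline, because the absolute number includes temporary references made by the call itself; the names base, with_alias and after_drop are illustrative.

```python
import sys

class A:
    pass

ob = A()
# Take a baseline; the absolute value includes the call's own temporary reference.
base = sys.getrefcount(ob)

alias = ob                                  # a second reference variable for the same object
with_alias = sys.getrefcount(ob) - base
print(with_alias)                           # 1: one more reference than the baseline

alias = None                                # drop the extra reference
after_drop = sys.getrefcount(ob) - base
print(after_drop)                           # 0: back to the baseline
```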
However, relying solely on this strategy can be a problem.
Consider a scenario in which two list objects reference each other and their reference variables are then set to None
(in order to make them eligible for garbage collection). This leads to a problem called a reference cycle.
# Reference cycle problem
# Creating first list
list1 = [1 , 2019]
# Creating second list
list2 = [2, 2019]
# Appending list2 to list1
list1.append(list2)
# Appending list1 to list2
list2.append(list1)
# Setting the reference variable, list1, to None, which will dereference the list object it was referring to
list1 = None
# Setting the reference variable, list2, to None, which will dereference the list object it was referring to
list2 = None
As you can see in the program above, the two list objects are no longer referenced by any reference variable (as their reference variables now point to None),
but because the objects still reference each other, their reference counts never drop to zero. Reference counting alone therefore cannot reclaim either object,
and they keep holding their space in memory.
Reference cycles of this kind are quite common among container objects such as
lists, tuples and dictionaries. Note: Because reference counting cannot handle reference cycles,
the interpreter also follows the strategy explained next to decide when to run the garbage collector.
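To see that a cycle really does survive reference counting, the sketch below uses a hypothetical Node class (lists cannot be weakly referenced) and temporarily disables automatic collection so the result is repeatable. It assumes CPython and calls gc.collect() from the standard gc module to run the cycle detector by hand.

```python
import gc
import weakref

class Node:
    """Hypothetical container used only for this illustration."""
    def __init__(self):
        self.other = None

gc.disable()  # switch off automatic collection for a repeatable demo

a = Node()
b = Node()
a.other = b   # a refers to b
b.other = a   # b refers to a, which closes the reference cycle
watcher = weakref.ref(a)

a = b = None
alive_after_unbinding = watcher() is not None
print(alive_after_unbinding)   # True: each object still holds a reference to the other

gc.collect()                   # run the cycle detector explicitly
alive_after_collect = watcher() is not None
print(alive_after_collect)     # False: the unreachable cycle has been reclaimed

gc.enable()   # restore automatic collection
```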
- By comparing against a threshold value.
There are three generations of objects maintained by the garbage collector. When a new object is allocated memory, it enters the first (youngest) generation.
When an object survives a garbage collection of its generation, it is moved to the next older generation.
Every generation has a threshold value associated with it. For the youngest generation, when the number of object allocations minus the number of deallocations
exceeds the threshold value, the garbage collector runs automatically.
We can get the threshold value of each generation by calling the get_threshold() function of the gc module, and we can
get the current count for each generation by calling its get_count() function. Let us see an example.
# Python - Getting the threshold value
import gc
# Getting the threshold value
print('Garbage collection threshold value :', gc.get_threshold())
print('Number of objects in each generation : ', gc.get_count())
Output
Garbage collection threshold value : (700, 10, 10)
Number of objects in each generation : (651, 4, 3)
- Here, the default threshold value of the first (youngest) generation is 700, and the default threshold value
of each of the two older generations is 10. It means that when the number of object allocations minus deallocations in the first generation
exceeds 700, the garbage collector runs automatically. These defaults may differ between Python versions.
- Next, we have printed the current count for each generation: 651 for the youngest generation, 4 for the next older generation
and 3 for the oldest generation. In CPython, only the first value counts objects; the other two count how many times the next younger generation has been collected since that generation was last collected.
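The relationship between allocations and the youngest-generation count can be sketched as follows. This assumes a standard CPython build; automatic collection is disabled during the measurement so a collection cannot reset the counter, and the names before, after and kept are illustrative.

```python
import gc

gc.disable()  # keep an automatic collection from resetting the counter mid-demo

threshold0, threshold1, threshold2 = gc.get_threshold()
before = gc.get_count()[0]

# Allocate container objects and keep them alive so none are deallocated.
kept = [[] for _ in range(1000)]

after = gc.get_count()[0]
print('Thresholds :', (threshold0, threshold1, threshold2))
print('Youngest-generation count grew :', after > before)

gc.enable()   # restore automatic collection
```

With automatic collection enabled, a collection of the youngest generation would be triggered once this count passed the first threshold value.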
Manual Garbage Collection
We can see manual garbage collection at work by creating a reference cycle in memory and then running the collector ourselves;
otherwise, the time at which garbage collection is performed is decided by the interpreter.
Note: We can manually execute the garbage collector by calling the collect() function of the gc module.
This function looks for unreachable objects in memory,
reclaims them, and returns the number of unreachable objects it found.
- Let us see how to manually execute the garbage collector
in the scenario of a reference cycle.
# Reference cycle problem
import gc
# Creating first list
list1 = [1 , 2019]
# Creating second list
list2 = [2, 2019]
# Appending list2 to list1, to create the reference cycle problem
list1.append(list2)
# Appending list1 to list2, to create the reference cycle problem
list2.append(list1)
# Setting two lists to None, in order to make them eligible for garbage collection
list1 = list2 = None
# Getting the threshold value
print('Garbage collection threshold value :', gc.get_threshold())
# Getting the total number of objects in each generation
print('Number of objects in each generation before garbage collection : ', gc.get_count())
# Show the effect of garbage collection
n = gc.collect()
print('Garbage collector has collected %d objects ' % (n))
# Getting the total number of objects in each generation
print('Number of objects in each generation after garbage collection : ', gc.get_count())
Output
Garbage collection threshold value : (700, 10, 10)
Number of objects in each generation before garbage collection : (649, 4, 3)
Garbage collector has collected 2 objects
Number of objects in each generation after garbage collection : (30, 0, 0)
In the above program, we first created a reference cycle by making two list objects reference each other. Next, we
made their reference variables point to None, so the two objects became unreachable and were removed from memory when the garbage collector ran.
Note: We have also displayed the
count for each generation before and after the execution of garbage collection.
- Manually invoking the garbage collector in the absence of a reference cycle may or may not collect any objects.
Let us see an example.