Data Class in Python
Introduction
Python 3.7 introduced the dataclasses module as a practical tool for creating structured classes designed specifically for storing data. The @dataclass decorator it provides generates the methods such classes usually need, for example for initializing, representing, and comparing their data.
DataClasses in Python 3.6
Although the module first shipped with Python 3.7, it can also be used on Python 3.6 by installing the dataclasses backport library:
pip install dataclasses
A DataClass is an ordinary class with the @dataclass decorator applied to it. Its attributes are declared with type hints, which state the data type of each variable.
Code:
# A basic data class
# Import the dataclass decorator
from dataclasses import dataclass

@dataclass
class JTP_Article:
    """A class for storing article contents"""
    # Declare attributes using type hints
    title: str
    author: str
    language: str
    upvotes: int

# Create an object of the data class
art = JTP_Article("Data Class", "Annu Chaudhary", "Python", 0)
print(art)
Output:
JTP_Article(title='Data Class', author='Annu Chaudhary', language='Python', upvotes=0)
Two points are notable in the code above:
- The class accepted the values and assigned them to the right attributes even though no __init__() constructor was written.
- Printing the object gives a clear representation of the data it holds, even though no function was written for this. That means a __repr__() method was generated as well.
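To make these two points concrete, here is a rough hand-written counterpart of what the decorator provides for this class. It is only a sketch for illustration: ManualArticle is a hypothetical name, and the methods the decorator really generates differ in detail.

```python
class ManualArticle:
    """Hand-written counterpart of the JTP_Article data class (illustrative only)."""

    def __init__(self, title: str, author: str, language: str, upvotes: int):
        # Roughly what the generated __init__() does: one assignment per field
        self.title = title
        self.author = author
        self.language = language
        self.upvotes = upvotes

    def __repr__(self) -> str:
        # Roughly what the generated __repr__() does: class name plus field=value pairs
        return (
            f"{type(self).__name__}(title={self.title!r}, author={self.author!r}, "
            f"language={self.language!r}, upvotes={self.upvotes!r})"
        )


art = ManualArticle("Data Class", "Annu Chaudhary", "Python", 0)
print(art)
# ManualArticle(title='Data Class', author='Annu Chaudhary', language='Python', upvotes=0)
```

With @dataclass, none of this boilerplate has to be written or kept in sync when a field is added.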
The dataclass decorator generates an __init__() constructor that takes one argument per field. Since none of the fields here has a default value, creating an object without arguments fails:
Code:
# A basic data class
# Import the dataclass decorator
from dataclasses import dataclass

@dataclass
class JTP_Article:
    """A class for storing article contents"""
    # Declare attributes using type hints
    title: str
    author: str
    language: str
    upvotes: int

# Create an object of the data class without any arguments
art = JTP_Article()
print(art)
Output:
Traceback (most recent call last):
File "d:\Programming\Python\test.py", line 20, in <module>
art = JTP_Article ()
TypeError: JTP_Article.__init__() missing 4 required positional arguments: 'title', 'author', 'language', and 'upvotes'
Equality of Data Classes
Because these classes hold data, it is often necessary to check whether two objects contain the same data. The == operator is used for this.
The code below defines the same article class twice, once with the dataclass decorator and once as a normal class with a handwritten constructor, and compares two objects of each.
Code:
# Import the dataclass decorator
from dataclasses import dataclass

@dataclass
class JTP_Article:
    """A class for storing article contents"""
    # Declare attributes using type hints
    title: str
    author: str
    language: str
    upvotes: int

class Normal_Article:
    """A class for storing article contents"""
    # Equivalent handwritten constructor
    def __init__(self, title, author, language, upvotes):
        self.title = title
        self.author = author
        self.language = language
        self.upvotes = upvotes

# Two objects of the data class
Article_1 = JTP_Article("DataClass", "Annu_Chaudhary", "Python", 0)
Article_2 = JTP_Article("DataClass", "Annu_Chaudhary", "Python", 0)

# Two objects of the normal class
article_1 = Normal_Article("DataClass", "Annu_Chaudhary", "Python", 0)
article_2 = Normal_Article("DataClass", "Annu_Chaudhary", "Python", 0)

print("Data Class Equal:", Article_1 == Article_2)
print("Normal Class Equal:", article_1 == article_2)
Output:
Data Class Equal: True
Normal Class Equal: False
For a normal class that does not define __eq__(), the == operator falls back to comparing object identity, that is, whether both names refer to the same object in memory. Two separately created objects are always distinct, so the result is False. A DataClass instead gets a generated __eq__() that compares the field values of the two objects, so two DataClass objects of the same class with identical data compare as True.
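The field-by-field comparison can be sketched by hand as follows. This is a simplified illustration, not the exact code the decorator generates, and ManualArticle is a hypothetical name used only here.

```python
class ManualArticle:
    """Normal class with a hand-written __eq__ (illustrative only)."""

    def __init__(self, title, author, language, upvotes):
        self.title = title
        self.author = author
        self.language = language
        self.upvotes = upvotes

    def __eq__(self, other):
        # Compare the fields as tuples, but only for objects of the same class
        if other.__class__ is self.__class__:
            return (self.title, self.author, self.language, self.upvotes) == (
                other.title, other.author, other.language, other.upvotes
            )
        return NotImplemented


a = ManualArticle("DataClass", "Annu_Chaudhary", "Python", 0)
b = ManualArticle("DataClass", "Annu_Chaudhary", "Python", 0)

print(a == b)  # True: same field values
print(a is b)  # False: two distinct objects in memory
```

The last two lines show the difference between equality of data (==) and identity (is).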
The dataclass() decorator
@dataclasses.dataclass(*, init=True, repr=True, eq=True, order=False, unsafe_hash=False, frozen=False)
By changing the values of these parameters, we can control which methods are generated for our DataClasses and how they behave.
init: This parameter indicates whether an __init__() constructor should be generated.
True (default): A constructor is generated.
False: No constructor is generated.
Code:
from dataclasses import dataclass

@dataclass(init=False)
class JTP_Article:
    title: str
    author: str
    language: str
    upvotes: int

# An object of the data class
art = JTP_Article("Data Class", "Annu_Chaudhary", "Python", 0)
Output:
Traceback (most recent call last):
File "d:\Programming\Python\test.py", line 12, in <module>
art = JTP_Article("Data Class", "Annu_Chaudhary", "Python", 0)
TypeError: JTP_Article() takes no arguments
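One reason to pass init=False is to supply a constructor of your own while keeping the other generated methods. The sketch below assumes a hypothetical constructor that always starts upvotes at 0 and defaults language to "Python"; neither choice comes from the original example.

```python
from dataclasses import dataclass


@dataclass(init=False)
class JTP_Article:
    title: str
    author: str
    language: str
    upvotes: int

    # Hypothetical custom constructor: callers cannot set upvotes directly
    def __init__(self, title: str, author: str, language: str = "Python"):
        self.title = title
        self.author = author
        self.language = language
        self.upvotes = 0


art = JTP_Article("Data Class", "Annu_Chaudhary")
print(art)
# JTP_Article(title='Data Class', author='Annu_Chaudhary', language='Python', upvotes=0)
```

The generated __repr__() and __eq__() still work, because only the constructor was switched off.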
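repr: This parameter specifies whether a __repr__() method is generated. With True (the default), printing an object shows the class name followed by its fields and their values. With False, no __repr__() is generated, so the object falls back to the default representation inherited from object, which shows the class name and the object's memory address.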
Code:
from dataclasses import dataclass

@dataclass(repr=False)
class JTP_Article:
    title: str
    author: str
    language: str
    upvotes: int

# An object of the data class
art = JTP_Article("DataClasses", "Annu_Chaudhary", "Python", 0)
print(art)
Output:
<__main__.JTP_Article object at 0x00000216E1D675E0>
eq: This parameter specifies whether an __eq__() method is generated, which determines how two DataClass objects are compared with the == and != operators. It takes a Boolean value and defaults to True.
Code:
from dataclasses import dataclass

@dataclass(repr=False, eq=False)
class JTP_Article:
    title: str
    author: str
    language: str
    upvotes: int

# Two objects of the data class
Article_1 = JTP_Article("DataClasses", "vibhu4agarwal", "Python", 0)
Article_2 = JTP_Article("DataClasses", "vibhu4agarwal", "Python", 0)
equal = Article_1 == Article_2
print('Classes Equal:', equal)
Output:
Classes Equal: False
When eq is False, no __eq__() is generated, so two objects are compared by identity, just like objects of a normal class. The comparison returns False because the two objects are distinct, even though they hold the same data.
order: When order=True is passed, comparison between two DataClass objects is not limited to equality but also supports the <, <=, >, and >= operators. This requires eq to be True as well.
The objects are compared as if they were tuples of their fields, in the order the fields are defined: the first fields are compared, and the next field is consulted only when the earlier ones are equal.
Code:
from dataclasses import dataclass

@dataclass(order=True)
class Fun:
    var_1: int
    var_2: str
    var_3: float

obj_1 = Fun(1, "javaTpoint", 7.0)
obj_2 = Fun(2, "javaTpoint", 7.0)
obj_3 = Fun(1, "JTP", 7.0)
obj_4 = Fun(1, "javaTpoint", 8.0)

print(obj_1 > obj_2)
print(obj_1 < obj_3)
print(obj_1 >= obj_4)
Output:
False
False
False
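The field-by-field ordering can be checked against plain tuple comparison, and it is also what makes sorting work. The following sketch reuses the Fun class from the example above; the items list is added here only for illustration.

```python
from dataclasses import dataclass, astuple


@dataclass(order=True)
class Fun:
    var_1: int
    var_2: str
    var_3: float


obj_1 = Fun(1, "javaTpoint", 7.0)
obj_3 = Fun(1, "JTP", 7.0)

# var_1 is equal, so var_2 decides; uppercase "J" sorts before lowercase "j"
print(astuple(obj_1) < astuple(obj_3))  # False
print(obj_1 < obj_3)                    # False, same result as the tuple comparison

# Because the ordering methods exist, sorted() works without a key function
items = [Fun(2, "javaTpoint", 7.0), obj_1, obj_3]
print(sorted(items))
# [Fun(var_1=1, var_2='JTP', var_3=7.0), Fun(var_1=1, var_2='javaTpoint', var_3=7.0), Fun(var_1=2, var_2='javaTpoint', var_3=7.0)]
```

Without order=True, calling sorted() on these objects would raise a TypeError, because the < operator would not be defined for them.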
frozen: When frozen=True, the fields of a DataClass object can be set only once, at initialization; assigning to a field afterwards raises an exception. This is comparable to the const keyword in C++ and the final keyword in Java.
Code:
from dataclasses import dataclass

@dataclass(frozen=True)
class JTP_Article:
    title: str
    author: str
    language: str
    upvotes: int

Article = JTP_Article("DataClass", "Annu_chaudhary", "Python", 0)
print(Article)
Article.upvotes = 100
print(Article)
Output:
JTP_Article(title='DataClass', author='Annu_chaudhary', language='Python', upvotes=0)
Traceback (most recent call last):
File "d:\Programming\Python\test.py", line 14, in <module>
Article.upvotes = 100
File "<string>", line 4, in __setattr__
dataclasses.FrozenInstanceError: cannot assign to field 'upvotes'
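When a frozen object needs a different value, the usual approach is to build a new object instead of changing the old one. The sketch below uses dataclasses.replace() for that and also shows how the error can be caught; the variable names are chosen only for illustration.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class JTP_Article:
    title: str
    author: str
    language: str
    upvotes: int


article = JTP_Article("DataClass", "Annu_chaudhary", "Python", 0)

try:
    article.upvotes = 100  # not allowed on a frozen data class
except FrozenInstanceError as exc:
    print("Assignment rejected:", exc)

# replace() returns a new object with the given fields changed
updated = replace(article, upvotes=100)
print(article.upvotes, updated.upvotes)  # 0 100
```

The original object stays untouched, which is what makes frozen objects safe to share between different parts of a program.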
unsafe_hash: In Python, mutable objects are usually unhashable, which means that the built-in hash() function cannot produce a hash for them.
DataClass objects are mutable by default, since their field values can be changed after creation. With the default settings (eq=True, frozen=False), the decorator therefore sets __hash__ to None, and calling hash() on an object raises a TypeError.
Code:
from dataclasses import dataclass

@dataclass
class JTP_Article:
    title: str
    author: str
    language: str
    upvotes: int

Article = JTP_Article("DataClass", "Annu_Chaudhary", "Python", 0)
print(Article)
print(hash(Article))
Output:
JTP_Article(title='DataClass', author='Annu_Chaudhary', language='Python', upvotes=0)
Traceback (most recent call last):
File "d:\Programming\Python\test.py", line 13, in <module>
print (hash (Article))
TypeError: unhashable type: 'JTP_Article'
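However, frozen=True makes the object immutable by allowing its fields to be set only once. With eq=True and frozen=True, the decorator generates a __hash__() method based on the fields, so the DataClass object can be hashed safely. The exact hash value printed will differ between runs, because Python randomizes string hashes per process by default.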
Code:
from dataclasses import dataclass

@dataclass(frozen=True)
class JTP_Article:
    title: str
    author: str
    language: str
    upvotes: int

Article = JTP_Article("DataClass", "Annu_Chaudhary", "Python", 0)
print(Article)
print(hash(Article))
Output:
JTP_Article(title='DataClass', author='Annu_Chaudhary', language='Python', upvotes=0)
6601233036894739447
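Being hashable is what allows frozen DataClass objects to serve as dictionary keys and set members. The short sketch below shows this; the views dictionary is a hypothetical example and not part of the original code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JTP_Article:
    title: str
    author: str
    language: str
    upvotes: int


a = JTP_Article("DataClass", "Annu_Chaudhary", "Python", 0)
b = JTP_Article("DataClass", "Annu_Chaudhary", "Python", 0)

# Equal objects produce equal hashes
print(hash(a) == hash(b))  # True

# So an equal object finds the same dictionary entry
views = {a: 10}
print(views[b])  # 10

# And a set keeps only one of two equal objects
print(len({a, b}))  # 1
```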
The unsafe_hash parameter forces a __hash__() method to be generated even for a DataClass that is still mutable.
This is meant for cases where we know logically that the field values will not change after initialization, even though the class is not frozen. With unsafe_hash=True on a class that is not frozen, the hash is computed from the fields as if the class were frozen, but nothing prevents the fields from being reassigned. If a field changes while the object is used as a dictionary key or set member, the stored hash no longer matches, so the programmer must use this option very carefully.
Code:
from dataclasses import dataclass

@dataclass(unsafe_hash=True)
class JTP_Article:
    title: str
    author: str
    language: str
    upvotes: int

Article = JTP_Article("DataClass", "Annu_Chaudhary", "Python", 0)
print(Article)
print(hash(Article))
Output:
JTP_Article(title='DataClass', author='Annu_Chaudhary', language='Python', upvotes=0)
6601233036894739447
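The risk behind the name unsafe_hash can be shown with a small experiment. The sketch below puts an object into a set and then changes one of its fields; the set named seen and the two hash variables are illustrative names.

```python
from dataclasses import dataclass


@dataclass(unsafe_hash=True)
class JTP_Article:
    title: str
    author: str
    language: str
    upvotes: int


article = JTP_Article("DataClass", "Annu_Chaudhary", "Python", 0)
hash_before = hash(article)

seen = {article}
print(article in seen)  # True

# The class is not frozen, so this assignment is allowed
article.upvotes = 100
hash_after = hash(article)

print(hash_before == hash_after)  # False: the hash follows the field values
print(article in seen)            # False: the set still files the object under its old hash
```

The object is still stored in the set, but it can no longer be found there. This is why unsafe_hash=True should be used only when the fields really are never modified after creation.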