PyTorch Detach

  • The detach method is mainly used to create a new tensor whose storage is shared with the original tensor but which is not involved in gradient computation.
  • It returns a new tensor that has no attachment to the current computation graph or its gradients.
  • Since no gradient is needed for the detached tensor, no gradients are computed for it.
  • Operations on the detached output are not tracked, so the result carries no gradients.

Working of detach

  • Let us consider two programs, one where detach is not used and one where it is. First, without detach:
import torch

x = torch.full((20,), 2.0, requires_grad=True)
y = x**4
z = x**6
i = (y + z).sum()
i.backward()
# printing the gradient of x
print(x.grad)
  • Here y equals x to the power of 4 and z equals x to the power of 6, so i equals the sum of x^4 + x^6.
  • The derivative with respect to x is 4x^3 + 6x^5; at x = 2 this is 4*2^3 + 6*2^5, which equals 224.
  • The output is a vector of 20 elements, each with the value 224.

Let us consider another example where we use detach:

x = torch.full((20,), 2.0, requires_grad=True)
y = x**3
z = x.detach()**6
i = (y + z).sum()
i.backward()
# printing the gradient of x
print(x.grad)
  • Here z is detached from the graph, so the x**6 term does not contribute to the gradient.
  • Therefore the derivative is 3x^2, which at x = 2 equals 12.
  • The output is a vector of 20 elements, each with the value 12.

Let us consider another program where detach is used:

a = torch.arange(5., requires_grad=True)
b = a**2
c = a.detach()   # c shares storage with a
c.zero_()        # zeroing c also zeroes a in place
b.sum().backward()
print(a.grad)
  • Running this program raises a RuntimeError, because the backward pass needs the original value of a, which was modified in place through c. If we remove the c.zero_() call, the gradient is printed normally. A sketch of a fix follows this list.
  • detach does not create a copy of the data; the detached tensor shares the original tensor's storage, so in-place updates to the detached tensor also update the original.
  • detach blocks gradients from flowing through the detached tensor, but it does not copy the data.
  • detach is used when a tensor's operations should not be recorded in the computational graph.
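
As a minimal sketch of the fix hinted at above, cloning after detach gives a copy that no longer shares storage with a, so zeroing it does not break the backward pass:

import torch

a = torch.arange(5., requires_grad=True)
b = a**2
c = a.detach().clone()   # clone copies the data, so c no longer shares storage with a
c.zero_()                # zeroing c leaves a untouched
b.sum().backward()       # backward now succeeds
print(a.grad)            # tensor([0., 2., 4., 6., 8.])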

Detach Method in PyTorch

  • PyTorch needs to track all operations on tensors that require gradients so that it can compute those gradients.
  • When gradients are not required, the detach method creates a view of the same data that is excluded from the graph.
  • The graph keeps no record of operations performed on the detached tensor, since tracking for it is removed from the graph.
  • The torchviz package can be used to visualize how the gradient is computed for a given tensor.
import torch
from torchviz import make_dot

x = torch.ones(5, requires_grad=True)
y = x**4
z = x**6
j = (y + z).sum()
make_dot(j).render("attached", format="jpg")
  • In the next snippet, the operation on the detached tensor will not be tracked.
y = x**4
z = x.detach()**6
j = (y + z).sum()
make_dot(j).render("detached", format="jpg")
  • The graph can no longer track the x.detach()**6 operation. This is how the detach method works in PyTorch; the tracking behaviour can also be checked directly, as sketched below.
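
A quick way to confirm this without torchviz is to inspect the grad_fn attribute; a minimal, self-contained check of the same idea:

import torch

x = torch.ones(5, requires_grad=True)
y = x**4                 # tracked: y has a grad_fn node in the graph
z = x.detach()**6        # not tracked: the detached branch records nothing
print(y.grad_fn)         # <PowBackward0 object at ...>
print(z.grad_fn)         # None
print(z.requires_grad)   # False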

Example of detach in PyTorch:

import torch

def samestorage(x, y):
    if x.storage().data_ptr() == y.storage().data_ptr():
        print("it is the same storage space")
    else:
        print("it is different storage space")

a = torch.ones((4, 5), requires_grad=True)
print(a)
b = a
c = a.data
d = a.detach()
e = a.data.clone()
f = a.clone()
g = a.detach().clone()
h = torch.empty_like(a).copy_(a)
k = torch.tensor(a)
  • To copy a tensor, PyTorch recommends sourceTensor.clone().detach() rather than torch.tensor(sourceTensor), which is why the last line above emits a UserWarning; a short sketch of the recommended pattern follows.
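
For illustration, a minimal, self-contained sketch of the recommended copy pattern (the variable names are placeholders, not from the program above):

import torch

src = torch.ones(3, requires_grad=True)
copy = src.clone().detach()   # recommended: an explicit copy that is detached from the graph
# copy = torch.tensor(src)    # also copies, but emits a UserWarning suggesting the line above
print(copy.requires_grad)     # False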

Program:

print("a",end='');samestorage(a,a)
print("b:",end='');samestorage(a,b)
print("c:",end='');samestorage(a,c)
print("d:",end='');samestorage(a,d)
print("e:",end='');samestorage(a,e)
print("f:",end='');samestorage(a,f)
print("g:",end='');samestorage(a,g)
print("h:",end='');samestorage(a,h)
  • The output shows, for each tensor, whether it shares storage with a: b, c, and d share the same storage, while e, f, g, h, and k use different storage.
  • PyTorch offers several ways to construct a copy of a tensor; the perfplot package can be used to compare how they perform.
import torch
import perfplot

perfplot.show(
    setup=lambda n: torch.randn(n),
    kernels=[
        lambda a: a.new_tensor(a),
        lambda a: a.clone().detach(),
        lambda a: torch.empty_like(a).copy_(a),
        lambda a: torch.tensor(a),
        lambda a: a.detach().clone(),
    ],
    labels=["new_tensor()", "clone().detach()", "empty_like().copy()", "tensor()", "detach().clone()"],
    n_range=[2 ** k for k in range(20)],
    xlabel="len(a)",
    logx=False,
    logy=False,
    title="Comparison of timing for copying a PyTorch tensor",
)
  • We should not use the clone method on its own, since gradients flowing through the cloned tensor are still propagated back to the original tensor; it should be combined with detach, as sketched below.
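
A minimal sketch of this behaviour (the tensor values are chosen only for illustration):

import torch

a = torch.ones(3, requires_grad=True)
b = a.clone()            # clone alone keeps the graph connection
b.sum().backward()
print(a.grad)            # tensor([1., 1., 1.]): the gradient reached a through the clone

a.grad = None
c = a.detach().clone()   # detached copy: no path back to a
print(c.requires_grad)   # False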

Conclusion

  • clone should be combined with detach when we want to copy a tensor and detach it from the computational graph.
  • We should understand exactly what detaching from the computational graph does, since problems such as in-place modification of shared storage are not always obvious from the code.