Shallow copy and deep copy in Python

What is the difference between shallow copy, deepcopy and normal assignment operation?

Can somebody explain what exactly makes a difference between the copies? Is it something related to mutable & immutable objects? If so, can you please explain it to me?


Normal assignment operations will simply point the new variable towards the existing object. The docs explain the difference between shallow and deep copies:

  • A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
  • A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.

Here’s a little demonstration:

    import copy

    a = [1, 2, 3]
    b = [4, 5, 6]
    c = [a, b]

Using normal assignment operations to copy:

    d = c

    print(id(c) == id(d))        # True - d is the same object as c
    print(id(c[0]) == id(d[0]))  # True - d[0] is the same object as c[0]

Using a shallow copy:

    d = copy.copy(c)

    print(id(c) == id(d))        # False - d is now a new object
    print(id(c[0]) == id(d[0]))  # True - d[0] is the same object as c[0]

Using a deep copy:

    d = copy.deepcopy(c)

    print(id(c) == id(d))        # False - d is now a new object
    print(id(c[0]) == id(d[0]))  # False - d[0] is now a new object

@Dshank No. A shallow copy constructs a new object, while an assignment will simply point the new variable at the existing object. Any changes to the existing object will affect both variables (with assignment).

@grc "Any changes to the existing object will affect both variables (with assignment)": this statement is true only for mutable objects and not for immutable types like strings, floats and tuples.


@grc But I tried an example (I removed the newlines here): list_ = [[1,2],[3,4]]; newlist = list_.copy(); list_[0] = [7,8]; print(list_); print(newlist). newlist still displays [[1, 2], [3, 4]], but list_[0] is a list, which is mutable.

@Neerav: It’s true for immutables as well. Any changes to an immutable object will show up through both variables, because you can’t change an immutable object — the statement is vacuously true for immutables.

For immutable objects, there is no need for copying because the data will never change, so Python uses the same data; ids are always the same. For mutable objects, since they can potentially change, [shallow] copy creates a new object.
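The claim about immutable objects can be checked directly. The following is a minimal sketch using only the standard copy module; the variable names are illustrative and not from the answer above. Note that it covers a tuple holding only immutable items.

```python
import copy

t = (1, 2, 3)      # immutable, and holds only immutable items
lst = [1, 2, 3]    # mutable

same_shallow = copy.copy(t) is t        # the tuple itself is handed back
same_deep = copy.deepcopy(t) is t       # nothing inside needs copying either
new_list = copy.copy(lst) is not lst    # a list always gets a new object

print(same_shallow, same_deep, new_list)  # True True True
```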

Deep copy is related to nested structures. If you have a list of lists, then deepcopy copies the nested lists as well, so it is a recursive copy. With just copy, you have a new outer list, but the inner lists are references.

Assignment does not copy. It simply sets the reference to the old data. So you need copy to create a new list with the same contents.

"With just copy, you have a new outer list, but inner lists are references." For the inner lists, would the copied one be influenced by the original one? I created a list of lists, list_ = [[1,2],[3,4]], then ran newlist = list_.copy() and list_[0] = [7,8], and newlist remained the same. So are the inner lists really references?

@Stallman You are not changing the referenced list here, just creating a new list and assigning it as the first item of one of the copies. Try list_[0][0] = 7 instead.
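The difference between rebinding a slot and mutating a shared inner list can be seen side by side. This is a small sketch built on the snippet from the comments; the second mutation (on index 1) is added here for illustration.

```python
list_ = [[1, 2], [3, 4]]
newlist = list_.copy()   # shallow copy: new outer list, shared inner lists

list_[0] = [7, 8]        # rebinds slot 0 of list_ only
print(newlist)           # [[1, 2], [3, 4]]

list_[1][0] = 9          # mutates the inner list that both outer lists share
print(newlist)           # [[1, 2], [9, 4]]
```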

Partially false: for that to hold, an immutable object would need all of its sub-objects to be immutable as well. Immutability is not deep.

@Khlorghaal, do you mean to say that you cannot have an immutable object with mutable members? This is incorrect, since for example you can have a tuple of dicts. The tuple is immutable, and the ids of the dicts in the tuple are always the same once it is created. I don't see what you are finding false about these statements.
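The tuple-of-dicts case from this exchange can be tried out as follows. This is a sketch with made-up keys and values; it shows that the tuple keeps holding the same dicts even when they are mutated, and how copy and deepcopy treat such a tuple.

```python
import copy

t = ({"a": 1}, {"b": 2})
ids_before = [id(d) for d in t]

t[0]["a"] = 99                   # allowed: the dict inside is mutable
ids_after = [id(d) for d in t]   # the tuple still holds the same dict objects

shallow = copy.copy(t)           # the same tuple object is returned
deep = copy.deepcopy(t)          # a new tuple holding new dicts

print(ids_before == ids_after)      # True
print(shallow is t)                 # True
print(deep is t, deep[0] is t[0])   # False False
print(deep == t)                    # True
```

So the two comments are both partly right: the tuple never changes which objects it holds, but a deep copy still has to duplicate it because its members can change.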

For immutable objects, creating a copy doesn't make much sense since they are not going to change. For mutable objects, assignment, copy and deepcopy behave differently. Let's talk about each of them with examples.

An assignment operation simply assigns the reference of source to destination, e.g:

    >>> i = [1, 2, 3]
    >>> j = i
    >>> hex(id(i)), hex(id(j))
    ('0x10296f908', '0x10296f908')  # both addresses are identical

Now i and j refer to the same list, so both have the same memory address. Any update made through one of them will be reflected in the other, e.g.:

    >>> i.append(4)
    >>> j
    [1, 2, 3, 4]     # destination is updated
    >>> j.append(5)
    >>> i
    [1, 2, 3, 4, 5]  # source is updated

On the other hand, copy and deepcopy create a new copy of the object. Changes to the original object itself (such as appending to it) will not be reflected in the copy, and vice versa. However, copy (shallow copy) doesn't create copies of nested objects; it just copies the references to them, while deepcopy (deep copy) copies all the nested objects recursively.

Some examples to demonstrate the behaviour of copy and deepcopy:

Flat list example using copy:

    >>> import copy
    >>> i = [1, 2, 3]
    >>> j = copy.copy(i)
    >>> hex(id(i)), hex(id(j))
    ('0x102b9b7c8', '0x102971cc8')  # both addresses are different
    >>> i.append(4)
    >>> j
    [1, 2, 3]  # updating the original list didn't affect the copy

Nested list example using copy:

    >>> import copy
    >>> i = [1, 2, 3, [4, 5]]
    >>> j = copy.copy(i)
    >>> hex(id(i)), hex(id(j))
    ('0x102b9b7c8', '0x102971cc8')  # both addresses are still different
    >>> hex(id(i[3])), hex(id(j[3]))
    ('0x10296f908', '0x10296f908')  # nested lists have the same address
    >>> i[3].append(6)
    >>> j
    [1, 2, 3, [4, 5, 6]]  # updating the original nested list updated the copy as well

Flat list example using deepcopy:

    >>> import copy
    >>> i = [1, 2, 3]
    >>> j = copy.deepcopy(i)
    >>> hex(id(i)), hex(id(j))
    ('0x102b9b7c8', '0x102971cc8')  # both addresses are different
    >>> i.append(4)
    >>> j
    [1, 2, 3]  # updating the original list didn't affect the copy

Nested list example using deepcopy:

    >>> import copy
    >>> i = [1, 2, 3, [4, 5]]
    >>> j = copy.deepcopy(i)
    >>> hex(id(i)), hex(id(j))
    ('0x102b9b7c8', '0x102971cc8')  # both addresses are still different
    >>> hex(id(i[3])), hex(id(j[3]))
    ('0x10296f908', '0x102b9b7c8')  # nested lists have different addresses
    >>> i[3].append(6)
    >>> j
    [1, 2, 3, [4, 5]]  # updating the original nested list didn't affect the copy

Let’s see in a graphical example how the following code is executed:

    import copy

    class Foo(object):
        def __init__(self):
            pass

    a = [Foo(), Foo()]
    shallow = copy.copy(a)
    deep = copy.deepcopy(a)

[Image not available: diagram of the objects referenced by a, shallow and deep]
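The relationships between a, shallow and deep can also be checked with identity tests. This is a sketch that reuses the code from the answer and adds only the checks; it assumes a plain class with no custom copy methods.

```python
import copy

class Foo:
    pass

a = [Foo(), Foo()]
shallow = copy.copy(a)
deep = copy.deepcopy(a)

print(shallow is a)               # False: a new outer list
print(shallow[0] is a[0])         # True: the same Foo instances are shared
print(deep is a)                  # False: a new outer list
print(deep[0] is a[0])            # False: the Foo instances were copied too
print(isinstance(deep[0], Foo))   # True: the copies are still Foo objects
```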

a, b, c, d, a1, b1, c1 and d1 are references to objects in memory, which are uniquely identified by their ids.

An assignment operation takes a reference to the object in memory and assigns that reference to a new name. c=[1,2,3,4] is an assignment that creates a new list object containing those four integers, and assigns the reference to that object to c. c1=c is an assignment that takes the same reference to the same object and assigns that to c1. Since the list is mutable, anything that happens to that list will be visible regardless of whether you access it through c or c1, because they both reference the same object.

c1=copy.copy(c) is a "shallow copy" that creates a new list and assigns the reference to the new list to c1. c still points to the original list. So, if you modify the list at c1, the list that c refers to will not change.

The concept of copying is irrelevant to immutable objects like integers and strings. Since you can’t modify those objects, there is never a need to have two copies of the same value in memory at different locations. So integers and strings, and some other objects to which the concept of copying does not apply, are simply reassigned. This is why your examples with a and b result in identical ids.

c1=copy.deepcopy(c) is a "deep copy", but it functions the same as a shallow copy in this example. Deep copies differ from shallow copies in that shallow copies will make a new copy of the object itself, but any references inside that object will not themselves be copied. In your example, your list has only integers inside it (which are immutable), and as previously discussed there is no need to copy those. So the "deep" part of the deep copy does not apply. However, consider this more complex list:

    e = [[1, 2], [4, 5, 6], [7, 8, 9]]

This is a list that contains other lists (you could also describe it as a two-dimensional array).

If you run a "shallow copy" on e, copying it to e1, you will find that the id of the list changes, but each copy of the list contains references to the same three lists: the lists with integers inside. That means that if you were to do e[0].append(3), then e would be [[1, 2, 3],[4, 5, 6],[7, 8, 9]]. But e1 would also be [[1, 2, 3],[4, 5, 6],[7, 8, 9]]. On the other hand, if you subsequently did e.append([10, 11, 12]), e would be [[1, 2, 3],[4, 5, 6],[7, 8, 9],[10, 11, 12]]. But e1 would still be [[1, 2, 3],[4, 5, 6],[7, 8, 9]]. That's because the outer lists are separate objects that initially each contain three references to three inner lists. If you modify the inner lists, you can see those changes no matter if you are viewing them through one copy or the other. But if you modify one of the outer lists as above, then e contains three references to the original three lists plus one more reference to a new list. And e1 still only contains the original three references.

A ‘deep copy’ would not only duplicate the outer list, but it would also go inside the lists and duplicate the inner lists, so that the two resulting objects do not contain any of the same references (as far as mutable objects are concerned). If the inner lists had further lists (or other objects such as dictionaries) inside of them, they too would be duplicated. That’s the ‘deep’ part of the ‘deep copy’.
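The walk-through above can be run as a short script. This is a sketch: the starting value of e is inferred from the results described in the text, and e2 is a name introduced here for the deep copy.

```python
import copy

e = [[1, 2], [4, 5, 6], [7, 8, 9]]   # assumed starting value
e1 = copy.copy(e)                    # shallow copy
e2 = copy.deepcopy(e)                # deep copy (hypothetical name)

e[0].append(3)            # mutate an inner list shared with e1
e.append([10, 11, 12])    # mutate the outer list, which is not shared

print(e)    # [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
print(e1)   # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(e2)   # [[1, 2], [4, 5, 6], [7, 8, 9]]
```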


copy: Shallow and deep copy operations

Assignment statements in Python do not copy objects, they create bindings between a target and an object. For collections that are mutable or contain mutable items, a copy is sometimes needed so one can change one copy without changing the other. This module provides generic shallow and deep copy operations (explained below).

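Interface summary:

copy.copy(x)
    Return a shallow copy of x.

copy.deepcopy(x[, memo])
    Return a deep copy of x.

exception copy.Error
    Raised for module specific errors.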

The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances):

  • A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
  • A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.

Two problems often exist with deep copy operations that don’t exist with shallow copy operations:

  • Recursive objects (compound objects that, directly or indirectly, contain a reference to themselves) may cause a recursive loop.
  • Because deep copy copies everything it may copy too much, such as data which is intended to be shared between copies.

The deepcopy() function avoids these problems by:

  • keeping a memo dictionary of objects already copied during the current copying pass; and
  • letting user-defined classes override the copying operation or the set of components copied.
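The effect of the memo dictionary can be observed from the outside without touching it. The following sketch uses made-up names and shows two cases: a list that contains itself, and one inner list that is referenced twice.

```python
import copy

# A list that contains itself.
loop = [1, 2]
loop.append(loop)

loop_copy = copy.deepcopy(loop)      # terminates instead of recursing forever
print(loop_copy is not loop)         # True
print(loop_copy[2] is loop_copy)     # True: the cycle now points at the copy

# One inner list referenced from two slots.
shared = [0]
pair = [shared, shared]
pair_copy = copy.deepcopy(pair)
print(pair_copy[0] is pair_copy[1])  # True: copied once, then reused
print(pair_copy[0] is shared)        # False: it is still a copy
```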

This module does not copy types like module, method, stack trace, stack frame, file, socket, window, or any similar types. It does “copy” functions and classes (shallow and deeply), by returning the original object unchanged; this is compatible with the way these are treated by the pickle module.
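For functions and classes, "returning the original object unchanged" can be verified with identity checks. This is a small sketch; greet, Config and holder are hypothetical names used only for illustration.

```python
import copy

def greet():
    return "hello"

class Config:
    pass

print(copy.copy(greet) is greet)         # True
print(copy.deepcopy(greet) is greet)     # True
print(copy.copy(Config) is Config)       # True
print(copy.deepcopy(Config) is Config)   # True

# Inside a container, only the container is duplicated.
holder = [greet, Config]
holder_copy = copy.deepcopy(holder)
print(holder_copy is holder)             # False
print(holder_copy[0] is greet)           # True
```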

Shallow copies of dictionaries can be made using dict.copy(), and of lists by assigning a slice of the entire list, for example, copied_list = original_list[:].
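Both shortcuts produce shallow copies, which a quick check makes visible. This is a minimal sketch with illustrative data; only copied_list and original_list are names taken from the text.

```python
original_list = [[1, 2], [3, 4]]
copied_list = original_list[:]        # slice of the entire list

original_dict = {"k": [1, 2]}
copied_dict = original_dict.copy()

print(copied_list is original_list)            # False: new outer list
print(copied_list[0] is original_list[0])      # True: inner list is shared
print(copied_dict is original_dict)            # False: new dict
print(copied_dict["k"] is original_dict["k"])  # True: value is shared
```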

Classes can use the same interfaces to control copying that they use to control pickling. See the description of module pickle for information on these methods. In fact, the copy module uses the registered pickle functions from the copyreg module.

In order for a class to define its own copy implementation, it can define special methods __copy__() and __deepcopy__(). The former is called to implement the shallow copy operation; no additional arguments are passed. The latter is called to implement the deep copy operation; it is passed one argument, the memo dictionary. If the __deepcopy__() implementation needs to make a deep copy of a component, it should call the deepcopy() function with the component as first argument and the memo dictionary as second argument. The memo dictionary should be treated as an opaque object.
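As an illustration of these two hooks, here is a minimal sketch of a class that deep-copies one attribute but keeps sharing another. Inventory, items and catalog are hypothetical names; the memo dictionary is only passed along to deepcopy() and never inspected.

```python
import copy

class Inventory:
    """Hypothetical class: `items` is per-instance, `catalog` is meant to be shared."""

    def __init__(self, items, catalog):
        self.items = items
        self.catalog = catalog

    def __copy__(self):
        # Shallow copy: a new Inventory that refers to the same attribute objects.
        return type(self)(self.items, self.catalog)

    def __deepcopy__(self, memo):
        # Deep-copy `items`, forwarding memo unchanged; keep sharing `catalog`.
        return type(self)(copy.deepcopy(self.items, memo), self.catalog)

catalog = {"apple": 0.5}
inv = Inventory([["apple", 3]], catalog)
shallow = copy.copy(inv)
deep = copy.deepcopy(inv)

print(shallow.items is inv.items)      # True
print(deep.items is inv.items)         # False
print(deep.items[0] is inv.items[0])   # False
print(deep.catalog is catalog)         # True
```

Keeping catalog shared is one way to address the "may copy too much" problem for data that is intended to be common to all copies.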

See also: Module pickle. Discussion of the special methods used to support object state retrieval and restoration.
