PHP Object Cloning

In our advanced PHP training course we cover object orientation and how it is implemented in PHP. PHP uses the term "magic methods" for class methods that other object-oriented languages, such as Java, would treat either as standard class functionality, like constructors, or as methods of the base class from which all other classes derive, i.e. the Object class in Java. In PHP, magic methods include __toString and __clone.

PHP Object Cloning

The need for object cloning may be a little difficult to grasp at first for a procedural programmer, especially one coming from a PHP background. By default, variables in PHP are assigned by value, i.e. the variable is given a copy of the "contents" of the expression assigned to it. Hence, if one variable is assigned the contents of another, changing the second variable's value does not affect the first.

$a=10;
$b=$a;
$b+=10;
echo $a."\n";
echo $b;

The output of the above is:

10
20

From the above it is clear that $b received a copy of the contents of $a and then operated on its own copy, i.e. $b and $a do not point to the same content.
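The same copy-on-assignment rule holds for arrays, which is worth seeing once because arrays are the structure most easily confused with objects. A minimal sketch; the variable names $original and $copy are only illustrative:

```php
<?php
// Arrays follow the same copy-on-assignment rule as integers and strings.
$original = [1, 2, 3];
$copy = $original;   // $copy behaves as its own copy of the array
$copy[] = 4;         // only $copy changes

echo count($original) . "\n"; // 3
echo count($copy) . "\n";     // 4
```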

PHP Object Assignment

When a variable is assigned the result of an object creation operation, it receives a copy of the "pointer" to the object. ("Pointer" should not be confused with C pointers, although I find it handy to think of the variables as pointing to the same location in memory. Another way to think of this is as an alias to the same object.) This differs from what happens when variables are assigned simple values. So if the second variable changes the state of the object it references, all variables "pointing" to that object will see the updated state.
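The same thing happens when an object is passed to a function: the parameter receives a copy of the "pointer", so changes made inside the function are visible to the caller. A small sketch, assuming a hypothetical Counter class and increment() function invented purely for this illustration:

```php
<?php
// Hypothetical class used only for this illustration.
class Counter {
    public $count = 0;
}

// The parameter receives a copy of the "pointer", so it refers to the caller's object.
function increment(Counter $counter) {
    $counter->count++;
}

$c = new Counter();
increment($c);
increment($c);

echo $c->count . "\n"; // 2
```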

Of course this is natural and what one would expect to happen. It's so natural that many do not fully comprehend it until they need to make a "copy" of an existing object.

class Person {
    public $firstName;
    public $lastName;
}
 
$person1 = new Person();
$person1->firstName="First Name";
$person1->lastName="Last Name";
 
$person2 = $person1;
$person2->firstName="Surprise";
 
var_dump($person1);
var_dump($person2);

Output

object(Person)#1 (2) {
  ["firstName"]=>
  string(8) "Surprise"
  ["lastName"]=>
  string(9) "Last Name"
}
object(Person)#1 (2) {
  ["firstName"]=>
  string(8) "Surprise"
  ["lastName"]=>
  string(9) "Last Name"
}

From the above it can be seen that both variables "point" to the same object. So what happens when you want a copy of an object and not just a copy of the reference to the object? This is where the clone keyword comes in. To create a copy of an object one needs to do the following:

$person3 = clone $person1;
 
$person3->firstName="Total Recall";
var_dump($person1);
var_dump($person3);
 
Output

object(Person)#1 (2) {
  ["firstName"]=>
  string(8) "Surprise"
  ["lastName"]=>
  string(9) "Last Name"
}
object(Person)#2 (2) {
  ["firstName"]=>
  string(12) "Total Recall"
  ["lastName"]=>
  string(9) "Last Name"
}
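One way to check the difference between an assigned variable and a clone without reading var_dump output is to compare the variables directly: === tests whether two variables refer to the same instance, while == tests whether two objects of the same class have equal property values. A short sketch; the names $alias and $copy are illustrative:

```php
<?php
class Person {
    public $firstName;
    public $lastName;
}

$person1 = new Person();
$person1->firstName = "First Name";
$person1->lastName = "Last Name";

$alias = $person1;        // copy of the "pointer": same object
$copy  = clone $person1;  // new object with the same property values

var_dump($alias === $person1); // bool(true)  same instance
var_dump($copy === $person1);  // bool(false) different instance
var_dump($copy == $person1);   // bool(true)  same class, equal property values
```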

So the clone keyword creates an independent copy of the target object. But there is a problem with the cloning process.

PHP Clone Shallow Copy Problem

When the clone process runs it creates a shallow copy of the target object. This means that all non-object values are copied to the member variables of the new object, but for any member variable referencing an object only the object reference ("pointer") is copied. So both objects end up pointing to the same nested objects. The code below illustrates this more clearly.

 

class Person {
    public $firstName;
    public $lastName;
    public $company;
}
 
class Company {
    public $companyName;
}
 
$companyA = new Company();
$companyA->companyName="PHP Training Co";
 
 
$person1 = new Person();
$person1->firstName="First Name";
$person1->lastName="Last Name";
$person1->company= $companyA;
 
$person3 = clone $person1;
$person3->firstName="Total Recall";
$person3->company->companyName="Awesome Co";
 
var_dump($person1);
var_dump($person3);

Output

object(Person)#2 (3) {
  ["firstName"]=>
  string(10) "First Name"
  ["lastName"]=>
  string(9) "Last Name"
  ["company"]=>
  object(Company)#1 (1) {
    ["companyName"]=>
    string(10) "Awesome Co"
  }
}
object(Person)#3 (3) {
  ["firstName"]=>
  string(12) "Total Recall"
  ["lastName"]=>
  string(9) "Last Name"
  ["company"]=>
  object(Company)#1 (1) {
    ["companyName"]=>
    string(10) "Awesome Co"
  }
}

So the above shows that the copy was not complete. Both Person objects now refer to the same instance of the Company object. This may be what you want, but if you need a completely standalone copy of the original object, you need to implement the __clone magic method.
 

PHP __clone Magic Method - Deep Copy

In the __clone magic method you implement your own copying of the nested objects you wish to duplicate. The method runs on the new object, whose member variables still hold the shallow-copied references. In some cases this may be a more involved process of reading those member variables and copying the ones that are required, but in this example we can simply use our newly learnt clone keyword to do the dirty work for us.

class Person {
    public $firstName;
    public $lastName;
    public $company;
    
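    public function __clone(){
        // Runs on the new object after the shallow copy; give it its own Company.
        if (is_object($this->company)) {
            $this->company = clone $this->company;
        }
    }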
    
}
 
class Company {
    public $companyName;
}
 
$companyA = new Company();
$companyA->companyName="PHP Training Co";
 
 
$person1 = new Person();
$person1->firstName="First Name";
$person1->lastName="Last Name";
$person1->company= $companyA;
 
$person3 = clone $person1;
$person3->firstName="Total Recall";
$person3->company->companyName="Awesome Co";
 
var_dump($person1);
var_dump($person3);
 

Output

object(Person)#2 (3) {
  ["firstName"]=>
  string(10) "First Name"
  ["lastName"]=>
  string(9) "Last Name"
  ["company"]=>
  object(Company)#1 (1) {
    ["companyName"]=>
    string(15) "PHP Training Co"
  }
}
object(Person)#3 (3) {
  ["firstName"]=>
  string(12) "Total Recall"
  ["lastName"]=>
  string(9) "Last Name"
  ["company"]=>
  object(Company)#4 (1) {
    ["companyName"]=>
    string(10) "Awesome Co"
  }
}

So the __clone magic method allows developers to implement a deep copy of an object when it is cloned. PHP first performs a shallow clone and then calls the __clone method on the new object, if it exists, to complete the cloning process. Magic methods such as __clone are covered in our advanced PHP training course at Jumping Bean.
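As an example of the "more involved" case, a member variable may hold an array of objects. The shallow clone copies the array, but its elements still refer to the original objects, so __clone has to clone each element. A sketch under that assumption; the Address and Customer classes are hypothetical and not part of the example above:

```php
<?php
// Hypothetical classes for illustration only.
class Address {
    public $city;
    public function __construct($city) { $this->city = $city; }
}

class Customer {
    public $addresses = []; // array of Address objects

    public function __clone() {
        // The array itself was copied by the shallow clone, but its elements
        // still refer to the original Address objects. Clone each one.
        foreach ($this->addresses as $key => $address) {
            $this->addresses[$key] = clone $address;
        }
    }
}

$a = new Customer();
$a->addresses[] = new Address("Johannesburg");

$b = clone $a;
$b->addresses[0]->city = "Cape Town";

echo $a->addresses[0]->city . "\n"; // Johannesburg
echo $b->addresses[0]->city . "\n"; // Cape Town
```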

Comments

IMHO what you're describing has nothing to do with OO, but simply with not understanding the difference between value and reference. Yes, you create a clone of the person. Yes, it holds a reference to the company. Yes, when you're following a pointer to the storage space and you change it, you change the name of the company.

Think of a "field" as an offset. In your case, "firstName" has an offset of 0, "lastName" has an offset of 10, "company" (a pointer BTW) has an offset of 19. Company has one field with an obvious offset of 0.

Let's say the address of Person is #FF00 and the address of Company is #FFFF.
Person -> firstName is translated to #FF00 (offset 0). Person -> lastName is translated to #FF0A (#FF00 + #0A) and Person -> Company to #FF13. If you get the value from #FF13, you get #FFFF. Now, Person -> Company -> companyName is translated to ValueOf(#FF13) + 0 => #FFFF + 0 => #FFFF.

BTW, since you have ONLY ONE Company object, what do you expect?!!

Mark Clarke

Hi Hans,

Not sure what you are disagreeing with? You are explaining the phenomenon with a different conceptual model using memory locations, but with the same results. I don't know how closely this matches up with how PHP actually implements memory allocation.

The point of the article is to highlight that object cloning is a shallow copy and not a deep copy. It also highlights the difference between assigning non-objects to variables, which is by value, and assigning objects, which is by reference.

If you want a complete copy of an object do a deep copy and make sure you clone all the way down the object graph. Of course you may want to do something else with the functionality of the __clone method.

Hope that helps.

I'm quite confused with your "shallow" and "deep" cloning stuff. When you "clone" you allocate the same amount of space as the original object and copy all contents. Here you got two strings (19 chars total) and let's say 4 bytes for the pointer to the object. That's 23 bytes (let's say 1 char = 1 byte). What do you expect that the contents of the pointer to "Company" is? The same!

Your "deep" cloning does a cascade of cloning of objects below. That means IF you change the contents of the original object, it only affects one of the copies! I'm not quite sure THAT is what you want.

Worse, if you do "deep" cloning and there is a circular reference in your objects you're gonna waste a lot of memory!

My point is that at some level this post shows there is little conceptual idea of what is actually modeled and (maybe worse) what is happening under the hood if you use "deep" cloning.

In other words: there is nothing "shallow" in the cloning that is performed. You wanted a copy of an object and you got exactly that. That it doesn't "behave" as "expected" has to do with shallow knowledge, not shallow cloning. And if one doesn't understand what he/she is doing, your "solution" may even be worse.

Mark Clarke

Well, the reason you would want to do a deep copy is probably the same reason you would want to do a shallow copy, except with the added requirement that you want a complete copy of the object. Perhaps you have never had this requirement in your applications.

This is often the case, for example when you are using value objects or data transfer objects, or when you wish to make a defensive copy of an object so some other code doesn't change the object state while you are working on it. (This is only a problem in multi-threaded applications though, so maybe not that big a deal in PHP.)

Alternatively, you may want to make sure that your API does not allow other functions to change the state of objects you hold a reference to, so you return a copy.
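A rough sketch of such a defensive copy, assuming hypothetical Money and Account classes made up for this illustration: the object is cloned on the way in and on the way out, so outside code never holds a reference to the internal state.

```php
<?php
// Hypothetical classes for illustration only.
class Money {
    public $amount;
    public function __construct($amount) { $this->amount = $amount; }
}

class Account {
    private $balance;

    public function __construct(Money $balance) {
        // Copy on the way in, so the caller's object cannot change our state later.
        $this->balance = clone $balance;
    }

    public function getBalance() {
        // Copy on the way out, so callers cannot modify the internal object.
        return clone $this->balance;
    }
}

$account = new Account(new Money(100));
$exposed = $account->getBalance();
$exposed->amount = 0; // changes only the returned copy

echo $account->getBalance()->amount . "\n"; // 100
```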

As for the problem with circular references, that's exactly why you have to implement the __clone method yourself. You would write code to detect any such problem.
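One possible way to handle a simple two-way cycle is sketched below. It assumes, purely for illustration, that a Company has an $owner back-reference (a property not in the article's classes) and that the owner is always the Person holding the company. Only Person defines __clone, so the cloning does not recurse forever:

```php
<?php
class Company {
    public $companyName;
    public $owner; // hypothetical back-reference to the owning Person
}

class Person {
    public $firstName;
    public $company;

    public function __clone() {
        if (is_object($this->company)) {
            // Company has no __clone of its own, so this is a shallow copy
            // and the cycle is not followed any further.
            $this->company = clone $this->company;
            // Assumption: the owner is the Person holding the company,
            // so re-point the back-reference at the new Person.
            $this->company->owner = $this;
        }
    }
}

$p = new Person();
$c = new Company();
$p->company = $c;
$c->owner = $p; // circular reference

$q = clone $p;

var_dump($q->company !== $p->company);  // bool(true)
var_dump($q->company->owner === $q);    // bool(true)
var_dump($p->company->owner === $p);    // bool(true)
```

Larger object graphs would need something more general, such as keeping track of the objects already copied.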

Your view on the irrelevance of the need for deep copying begs the question: "why is there a clone method in PHP?" I am sure that you can happily code in PHP without ever needing to do a deep copy if you can avoid it, but that doesn't mean the need doesn't exist.

Another way to be sure to make a deep copy is to use the "serialize" function.
$new_obj = unserialize(serialize($obj));

I think that it is faster than using "__clone()". But you have less control, and the code is not as beautiful as using "__clone".
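A small sketch of the serialize approach on a graph with a circular reference, using a hypothetical Node class. The copy keeps the cycle but shares nothing with the original:

```php
<?php
// Hypothetical class for illustration only.
class Node {
    public $name;
    public $peer;
    public function __construct($name) { $this->name = $name; }
}

$a = new Node("a");
$b = new Node("b");
$a->peer = $b;
$b->peer = $a; // circular reference

$copy = unserialize(serialize($a));

var_dump($copy !== $a);                 // bool(true) a new object
var_dump($copy->peer !== $b);           // bool(true) the peer was copied too
var_dump($copy->peer->peer === $copy);  // bool(true) the cycle is preserved
```

Note that this only works when everything in the object graph can be serialized; objects holding closures or resources cannot be copied this way.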