17.11.2007

PHP Namespaces Explained

Namespaces are one of the top new features you can expect from the upcoming major PHP versions. There's still a lot of discussion and confusion about how they are going to work in detail, especially because official documentation is scarce and behaviour is subject to change. That being said, in this article I will try to sum up what I could observe while playing around with a fresh checkout from PHP's PHP_5_3 CVS branch.

Namespaces Really Are Packages

PHP 5.3 brings the new "namespace" statement. When present, this statement must be the first one in a file. Thus, you cannot use multiple namespace statements per file :-) or even nest the statements. This is why PHP namespaces are more like Java's packages than like the namespace concept found in, for example, C++. Anyway, I will use the term "namespace" here. So what happens?

test1.php

namespace webfactory::NS1;
class X {}
print get_class(new X()) . "\n";

Basically, the namespace statement declares that all class names you define in that very file belong to the namespace. Namespaces also work for function names, but I won't cover functions any further here. The above example will print "webfactory::NS1::X" as that is what the class X is really called. This is called the fully qualified name of the class.

"Defining" a class name means actually providing a class definition. If you just require() a class definition from another file, that class won't end up in your file's namespace. It will be in the namespace set for the file where the class is defined (if any is set).

In test1.php, "webfactory::NS1" is a compound name. You may use "webfactory" as well as "webfactory::NS2" or even "webfactory::NS1::Sub1" as namespaces in other files. In all these files, you can define classes named X and they will all be distinguishable because their fully qualified names differ. If you declare a class X in a file without a namespace statement, the fully qualified class name will be "::X" (however, get_class() will just return "X" in that case).

For the rest of this article, let's assume we've got the following two files using the two namespaces webfactory::NS1 and webfactory::NS2:

ns1_x.php

namespace webfactory::NS1;
class X {}

ns2_x.php

namespace webfactory::NS2;
class X {}

Attention Attention...

Before we proceed, understand that the entire namespace thing is just about naming and referring to classes in a convenient way. It has nothing to do with packaging or loading classes for you. You still will have to use require() and its friends to load your classes. If __autoload() does that job for you, you can probably benefit from tweaking your handler to make use of the namespace names, but basically that changes nothing.
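To give an idea of what "making use of the namespace names" in an autoload handler could look like, here is a minimal sketch. The function name class_name_to_path() and the one-directory-per-namespace-part layout are assumptions made up for this example; nothing in PHP prescribes them.

```php
<?php
// Hypothetical helper: turn a fully qualified class name such as
// "webfactory::NS1::X" into a relative path like "webfactory/NS1/X.php".
// The directory layout is an assumption of this sketch, not a PHP rule.
function class_name_to_path($fullyQualifiedName)
{
    // Drop a leading "::" (global namespace) before splitting.
    $name = ltrim($fullyQualifiedName, ':');
    $parts = explode('::', $name);

    return implode('/', $parts) . '.php';
}

print class_name_to_path('webfactory::NS1::X') . "\n"; // webfactory/NS1/X.php
```

A handler would then require() the resulting path relative to some base directory, after checking that the file exists.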

Fully Qualified Names

The most straightforward way of telling PHP what you mean is to refer to classes by their fully qualified names. You can always do that regardless of the namespace you're in or the namespaces you use. Example:

test2.php

require("ns1_x.php");
require("ns2_x.php");
print get_class(new webfactory::NS1::X()) . "\n";
print get_class(new webfactory::NS2::X()) . "\n";

No surprises here. This will print "webfactory::NS1::X" and "webfactory::NS2::X", the fully qualified names of the classes.

Prior to having namespaces, this example would have caused an error on the second line as the class name X was already taken. With namespaces, both classes can coexist and by using fully qualified names you are able to tell both apart.

That alone is hardly what so many people have been waiting for all this time... You could always have prefixed your class names, calling them webfactory_NS1_X and webfactory_NS2_X respectively, to avoid this kind of trouble. But that was often too clumsy to type and read, so many people simply ignored the problem. Namespaces to the rescue!

Using Namespaces

Namespace support also brings the new "use" statement. It takes the form of "use A::B::C::D" or "use A::B::C::D as Z". If you omit the "as" part, the last name part will be used instead, so "use A::B::C::D" is equivalent to "use A::B::C::D as D". The use statement may be placed anywhere in the global scope (that is, not inside classes or functions) and takes effect from that point on to the rest of the file – but only in that file!

Think of "use" as a statement to create a per-file shorthand or alias map. When writing "use A::B::C::D as Z", an entry will be made for your current file that maps Z to A::B::C::D. Think of an associative array like array("Z" => "A::B::C::D")!

From that point on, whenever you use Z somewhere in your file to qualify a name, Z will be looked up in the map and A::B::C::D will be used instead, followed by any further qualification that came after Z.
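The alias map idea can be modelled in a few lines of plain PHP operating on strings. This is only an illustration of the behaviour described above, not the engine's actual code; the function names default_alias() and expand_alias() are made up for this sketch.

```php
<?php
// Illustrative model of the per-file alias map; not how the engine does it.

// "use A::B::C::D" without an "as" part takes the last name part as alias.
function default_alias($name)
{
    $parts = explode('::', $name);

    return end($parts);
}

// Replace the first name part if it is a known alias and keep whatever
// qualification followed it.
function expand_alias(array $aliases, $name)
{
    $parts = explode('::', $name);
    if (isset($aliases[$parts[0]])) {
        $parts[0] = $aliases[$parts[0]];
    }

    return implode('::', $parts);
}

$aliases = array(
    default_alias('webfactory::NS1') => 'webfactory::NS1', // use webfactory::NS1;
    'Z' => 'A::B::C::D',                                    // use A::B::C::D as Z;
);

print expand_alias($aliases, 'NS1::X') . "\n"; // webfactory::NS1::X
print expand_alias($aliases, 'Z::Y') . "\n";   // A::B::C::D::Y
print expand_alias($aliases, 'Other') . "\n";  // Other (no alias known)
```

Only the first part of a name is ever looked up; everything after it is carried over unchanged.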

You can use that to refer to classes from other namespaces, to simply shorten qualifications or to resolve name clashes at the point where you bring conflicting names together. Here are some examples for each case:

test3.php

use webfactory::NS1::X;
require("ns1_x.php");
print get_class(new X()) . "\n";

When creating a new X instance, this will first look up "X" in the alias map and find "webfactory::NS1::X" which is the fully qualified name for the class defined in ns1_x.php. Whether or not there is a namespace statement in test3.php is of no relevance here. Cases like this one always have a fully qualified class name on the use statement.

It would also make no difference if you changed the order of the first two lines. Remember, name aliasing is independent of loading class definitions, and it is not until the third line (the "new" statement) that both must have been taken care of.

test4.php

use webfactory::NS1;
use webfactory::NS2;
require("ns1_x.php");
require("ns2_x.php");
print get_class(new NS1::X()) . "\n";
print get_class(new NS2::X()) . "\n";

This will put two mappings into the alias table. "NS1" maps to "webfactory::NS1", so "new NS1::X" is just shorthand for "new webfactory::NS1::X". The same applies for NS2.

If a use statement refers to a namespace name and creates an alias for it, you will always have to use an additional qualification like "alias::classname" to refer to a class. In the last example it was obvious that a qualification through "NS1::" or "NS2::" was necessary to make clear which class is meant.

Please note that "use webfactory::NS1" does not make all classes defined in the webfactory::NS1 namespace available. As said above, PHP has no idea of your codebase and the classes you might have defined in any given namespace. So you cannot import all classes from a given namespace with a single use statement.

test5.php

use webfactory::NS1::X;
use webfactory::NS2::X as AnotherX;
require("ns1_x.php");
require("ns2_x.php");
print get_class(new X()) . "\n";
print get_class(new AnotherX()) . "\n";

This is the pattern to locally resolve name clashes. Assume you need to work with two external libraries you have no control over and both define a class X you want to use. Before namespaces, you were lost. Now, both libraries hopefully will use different namespaces (webfactory::NS1 and webfactory::NS2 in this case). The use statements now declare that X is the class X from the webfactory::NS1 namespace and that AnotherX is X from webfactory::NS2. If you understood the aliasing mechanism, you're probably not much surprised.

The resolution is "local" in that it only applies to the test5.php file. Other files won't be affected and if you just need to use (for example) the webfactory::NS2::X class somewhere else, it is perfectly ok to just "use webfactory::NS2::X" and create a "new X" there (or simply use the fully qualified name, of course).

Clashing Aliases

When you try to put something into the alias table and the key is already taken, you get a name conflict. The lookup key must be unique because otherwise it would not be clear which definition is meant when the key is used. Some examples:

test6.php

use foo::NS1;
use bar::NS1;

You will get a fatal error "Cannot use bar::NS1 as NS1 because the name is already in use". In this case the ambiguity is already in finding the NS1 namespace: "NS1::X" might refer to foo::NS1::X or bar::NS1::X. A similar one:

test7.php

use foo::NS1::X;
use bar::NS2::X;

In this case, X refers to a class but it is unclear to which one. Remember, using namespaces has nothing to do with loading class definitions. You could require() both files that define X in the foo::NS1 and bar::NS2 namespaces and it would be no problem at all because both definitions differ in their fully qualified names. It's the use statement itself that creates the conflict as both statements want to use the same key in the alias map. In test6.php and test7.php, I do not even require the class definitions and the alias map is never looked at for name resolutions.
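Sticking with the associative array picture, the conflict can be sketched as a check made before a new key is written. The add_alias() function is hypothetical, and it throws an exception where the real engine stops with a fatal error.

```php
<?php
// Illustrative model only: register an alias the way "use" appears to,
// refusing a key that is already present in the file's alias map.
// The real engine raises a fatal error; this sketch throws an exception.
function add_alias(array &$aliases, $name, $alias = null)
{
    if ($alias === null) {
        // No "as" part: the last name part becomes the alias.
        $parts = explode('::', $name);
        $alias = end($parts);
    }
    if (isset($aliases[$alias])) {
        throw new Exception(
            "Cannot use $name as $alias because the name is already in use"
        );
    }
    $aliases[$alias] = $name;
}

$aliases = array();
add_alias($aliases, 'foo::NS1::X');           // use foo::NS1::X;
try {
    add_alias($aliases, 'bar::NS2::X');       // use bar::NS2::X;
} catch (Exception $e) {
    print $e->getMessage() . "\n";
}
add_alias($aliases, 'bar::NS2::X', 'OtherX'); // use bar::NS2::X as OtherX;
```

Note that no class definition is involved anywhere: the clash is purely about two entries wanting the same key, and giving one of them an explicit "as" name resolves it.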

Names in Namespaces

Now here comes an extra rule. Maybe it is not what really happens under the hood, but the explanation closely matches what can be observed.

When you put a "namespace A::B" statement on top of your file and define a name Z in it (remember my definition of "defining" from above), an alias "Z" for "A::B::Z" will be put into your file-local alias table. That allows you to use the new name right away – and that is how my example test1.php way up worked: By defining class X in namespace webfactory::NS1, we implicitly ended up with an X-to-webfactory::NS1::X mapping in that file.

Together with the preceding section, you should understand that the following will cause a fatal error on the third line because the "extra rule" tries to add an alias for a name that is taken.

test8.php

namespace webfactory::NS1;
use webfactory::NS2::X;
class X {}

The Edge Cases

Re-read the extra rule and take it to the limits: The alias is made up when you define a name and it is put into the alias table for the file the definition takes place in. So what happens when the definition is moved out to another file? Consider:

test9.php

namespace webfactory::NS1;
require("ns1_x.php");
print get_class(new X()) . "\n";

For the ns1_x.php file, the extra rule will create an alias. But as the alias table is per-file, in test9.php there is no X entry.

Now what happens when a new X is created? Blurring the difference between defining a class and requiring() the definition, the following extra-extra rule kicks in:

First, the engine checks if X is a known alias – here it isn't. It then considers X as part of the current namespace. That is, it checks if a class with the fully qualified name "webfactory::NS1::X" has been defined somewhere. This time, it will find and use the definition from ns1_x.php.
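The lookup order can be written down as a small function. Again, this only models what I could observe and is not the engine's implementation; resolve_class_name() and the $definedClasses list (standing in for "classes that have been defined somewhere") are assumptions of this sketch, and so is the final fallback to the name as written, which the rule above does not cover.

```php
<?php
// Illustrative model of the observed lookup order; not the engine's code.
function resolve_class_name($name, array $aliases, $currentNamespace, array $definedClasses)
{
    $parts = explode('::', $name);

    // 1. A known alias wins.
    if (isset($aliases[$parts[0]])) {
        $parts[0] = $aliases[$parts[0]];

        return implode('::', $parts);
    }

    // 2. Otherwise try the name inside the current namespace.
    if ($currentNamespace !== '') {
        $candidate = $currentNamespace . '::' . $name;
        if (in_array($candidate, $definedClasses, true)) {
            return $candidate;
        }
    }

    // 3. Assumption of this sketch: fall back to the name as written.
    return $name;
}

$defined = array('webfactory::NS1::X', 'webfactory::NS2::X');

// test9.php: no alias for X, so the current namespace is consulted.
print resolve_class_name('X', array(), 'webfactory::NS1', $defined) . "\n";
// webfactory::NS1::X

// If the file had an alias for X, step 1 would answer first.
$aliases = array('X' => 'webfactory::NS2::X');
print resolve_class_name('X', $aliases, 'webfactory::NS1', $defined) . "\n";
// webfactory::NS2::X
```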

A Kind Of Magic*

I hope that the following will be a pathological case in practice, but the result may be surprising enough to be documented.

test10.php

namespace webfactory::NS1;
use webfactory::NS2::X;
require("ns1_x.php");
require("ns2_x.php");
print get_class(new X()) . "\n";
print get_class(new webfactory::NS1::X()) . "\n";

This is basically the same as test9.php, with one subtle difference. This time we "use webfactory::NS2::X" to make up an alias table entry for X. In this case, the extra-extra rule does not trigger and you will get "webfactory::NS2::X" for the first print. Creating aliases with "use" takes precedence over using names from the same namespace when these are defined in another file.

If you had written "class X {}" instead of require("ns1_x.php") – and ns1_x.php does no more than that in the same namespace as test10.php – you'd get a fatal error because then the extra rule would try to create the alias for X and find that name taken by webfactory::NS2::X.

Final Remarks

If you are used to relying on __autoload() or the SPL equivalents to load your class definitions, your handler will be passed the fully qualified names of the classes it needs to load, regardless of the namespaces defined or used in your files. In all the examples I provided here, that is exactly what get_class() returned. If you omitted the require() calls from test10.php and had set up an __autoload() handler instead, it would have been called for webfactory::NS2::X and webfactory::NS1::X on the first and second print line, respectively.

When using variable class names as in "$foo = new $bar()", where $bar holds the name of the class you want, name resolution rules don't apply so you will have to work with fully qualified names. Stanislav Malyshev is unsure if that is a shortcoming. (A short sidenote on this pattern: Interestingly, your __autoload() handler will be passed the value of $bar, no matter if it's a syntactically valid class name. That might be an interesting remote code inclusion vector if your handler implementation is too naïve >:-)).
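One way to make a handler less naïve is to check the name before doing anything with it. The following is a minimal sketch; is_plausible_class_name() is a hypothetical helper, and its pattern is deliberately stricter than what PHP itself accepts (it only allows ASCII letters, digits and underscores).

```php
<?php
// Hypothetical check for an autoload handler: accept only names that look
// like (optionally namespaced) class names before building a file path.
// Stricter than PHP's own rules, which also allow bytes 0x7f-0xff.
function is_plausible_class_name($name)
{
    $part = '[A-Za-z_][A-Za-z0-9_]*';

    // \z anchors at the very end, so a trailing newline is rejected too.
    return preg_match('/^(::)?' . $part . '(::' . $part . ')*\z/', $name) === 1;
}

var_dump(is_plausible_class_name('webfactory::NS1::X')); // bool(true)
var_dump(is_plausible_class_name('../../etc/passwd'));   // bool(false)
```

A handler would simply return without including anything when the check fails.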

Although I always used the "new" operator for demonstration purposes in this article, the same rules for resolving names apply in other cases where you refer to class names – think of type hints in method signatures, for example. The serialize() function will use the fully qualified class name as well to allow for correct de-serialization later on.

I hope this article helped you to understand how or why things work as they work. If you found it helpful or have any remarks, I'd be glad if you leave a comment.

Update 2007-12-06

Currently, there's a lot of discussion on the PHP Internals Mailing List about whether or not the current namespace implementation makes sense and solves the problems people have in practice, or whether it would be wiser not to ship it at all. So before you start updating your codebase, relax and wait to see how things evolve - you still have plenty of time before PHP 5.3 is available and used at a large scale. At last, the first official documentation is available.

(* Courtesy of Queen, 1986)