What is a token?

At its most basic, we want a token to say whether it is an operator or an operand. As we already have the definitions FOO_NUMBER, FOO_VARIABLE, etc., we could just create an array of operators containing FOO_PRINT and FOO_ASSIGNEQUALS and look each token up in it, but it is faster and easier to tell each token explicitly whether it is an operator or an operand.

To do this, we need two more define() lines:

define("IS_OPERATOR", 0);
define("IS_OPERAND", 1);

Again, the numbers assigned to the constants are irrelevant, as long as they are different.
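
For comparison, here is a rough sketch of the array-based alternative mentioned above. The FOO_* values and the is_operator() helper are hypothetical, chosen only for illustration; the real constants come from the earlier chapter. It shows why flagging each token directly is simpler: without the flag, every check means searching a list.

```php
<?php
// Hypothetical values: only their being distinct matters.
define("FOO_NUMBER", 0);
define("FOO_VARIABLE", 1);
define("FOO_PRINT", 2);
define("FOO_ASSIGNEQUALS", 3);

// The alternative approach: a list of every token that counts as an operator.
$operators = array(FOO_PRINT, FOO_ASSIGNEQUALS);

// Hypothetical helper: searches the list each time it is called.
function is_operator($token, $operators) {
    return in_array($token, $operators, true);
}

var_dump(is_operator(FOO_PRINT, $operators));  // bool(true)
var_dump(is_operator(FOO_NUMBER, $operators)); // bool(false)
```

With IS_OPERATOR and IS_OPERAND stored on the token itself, the same question becomes a single comparison and no lookup list has to be kept up to date.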

The two "obvious" properties of each token are the token it actually is, e.g. FOO_NUMBER or FOO_ASSIGNEQUALS, and also its value, e.g. 29 or "abcdefg". Operators do not have a value, as their entire meaning is encapsulated inside their token definition.

With this simple definition of a token we have enough information to create a class, token, that will do all we need of it:

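class token {
    public $type;   // IS_OPERATOR or IS_OPERAND
    public $token;  // which token this is, e.g. FOO_NUMBER
    public $val;    // the value, e.g. 29 (operators have none)

    public function __construct($type, $token, $val) {
        $this->type = $type;
        $this->token = $token;
        $this->val = $val;
    }
}
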
The constructor is there so that we can pass in all the parameters for the token when creating it. As you can see, using an object is not strictly necessary because all the class variables are marked public, so an array might seem easier. However, an object is preferable: it gives you flexibility in the future, and its syntax is arguably easier to read.
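
As a minimal sketch of how the class might be used, the example below creates one operand and one operator, then builds the same operand as a plain array for comparison. The constant values are hypothetical, and passing null as an operator's value is an assumption, since the text only says that operators have no value.

```php
<?php
// Hypothetical values: only their being distinct matters.
define("FOO_NUMBER", 0);
define("FOO_PRINT", 2);
define("IS_OPERATOR", 0);
define("IS_OPERAND", 1);

class token {
    public $type;
    public $token;
    public $val;

    public function __construct($type, $token, $val) {
        $this->type = $type;
        $this->token = $token;
        $this->val = $val;
    }
}

// An operand carries a value.
$number = new token(IS_OPERAND, FOO_NUMBER, 29);

// An operator has no value; null is an assumed placeholder.
$print = new token(IS_OPERATOR, FOO_PRINT, null);

// The same operand as a plain array, for comparison.
$number_array = array("type" => IS_OPERAND, "token" => FOO_NUMBER, "val" => 29);

echo $number->val, "\n";         // 29
echo $number_array["val"], "\n"; // 29
```

Both forms hold the same data, but the object version lets you add methods later without changing the code that creates tokens.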

That is all there is to understanding tokens, so back to the parser: how does parsing work?



Copyright ©2015 Paul Hudson. Follow me: @twostraws.