Thursday, April 2, 2009

Your own PHP framework

This article builds on my previous one, "Perfect PHP Setup". I will explain everything again here, so you don't have to read that one first. This tutorial will show you how you can build your own PHP framework from scratch. You might ask yourself, why do that when there are so many third-party frameworks out there? Well, here are a few reasons I can think of off the top of my head:
  • You understand how everything works; there's no "magic" code involved.
  • It is very light. It doesn't load a ton of code before getting to your code.
  • It's fully flexible. You can do whatever you want with it, even modify it later in your project if you need more functionality.
  • When something doesn't work, it's much easier for you to find the problem because you know all the code involved in your project. You don't have to seek support from whoever made your framework.
  • Attackers target websites that use well-known frameworks, because those have publicly known bugs and exploits. Paradoxically, with your own framework you are somewhat safer (if you code it right)!
Ok, let's get down to business. First, let's get an overview of what we are going to do:
  1. Set up an Apache VirtualHost and redirect all unknown requests to one main script. If you use some other webserver, you will have to adapt this step to your software.
  2. Create our main script that will handle all dynamic requests. Note that this is how other server-side languages work by default, like Python with WSGI.
  3. Create a really basic "Hello World!" application on top of our framework.
1. Setting up Apache - we need to create a VirtualHost, because our framework is designed to work at the root directory of the website. You can change this, but I will not cover it here. A minimal VirtualHost configuration:

    <VirtualHost *:80>
        ServerName mysite.localhost
        DocumentRoot /var/www/mysite
        <Directory />
            Options FollowSymLinks
            AllowOverride None
        </Directory>
        <Directory /var/www/mysite>
            Options Indexes FollowSymLinks MultiViews
            AllowOverride All
            Order allow,deny
            allow from all
        </Directory>
    </VirtualHost>

Obviously, mysite.localhost will not be resolved to your local machine. To fix that, edit your hosts file and add "mysite.localhost" to the line starting with "127.0.0.1". Here's an example hosts file:

    127.0.0.1 localhost mysite.localhost

Restart Apache and enter http://mysite.localhost/ in your browser. You will probably get a 404 Not Found error because there's nothing in the /var/www/mysite directory, or it doesn't even exist (you don't have to use exactly this directory, use whatever you want, this is just an example).

The next thing we need to do is tell Apache to call our main "dispatcher" script for all unknown requests. By unknown request I mean a request for a file that doesn't exist on the hard drive. For example if you have a folder images/ with a file pic.png inside, entering http://mysite.localhost/images/pic.png would give you that image. A request for http://mysite.localhost/images/picX.png would, on the other hand, call your main dispatcher script that will realize that the file doesn't exist and give out a 404 Not Found error. To achieve this, put this text in a file called .htaccess in your /var/www/mysite (or whatever you chose for your project) directory:

    RewriteEngine On
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . index.php

Brief explanation of this rewrite rule: the first condition means "requested path is not a file", the second condition means "requested path is not a directory" and the rule says "rewrite everything that fulfills the above two conditions to index.php". Ok, we're done with Apache -- phew!

2. Creating the main "dispatcher" script - this is the index.php file that Apache will call for all dynamic requests. It is called a dispatcher script because it dispatches requests to other PHP scripts based on the path requested by the user. It will have a configurable modules directory in which it will look for scripts to load. This can be outside the DocumentRoot (for safety), but it has to be in Apache's reach. I will first give an example of how this script works. Let's say you visit http://mysite.localhost/abc/def/ghi. Here's what the script will do (let's say our modules directory is /var/mysite):
  • If /var/mysite/abc is not a directory, look for a file named abc.php in /var/mysite/ and load it if it is found (def and ghi are then left for that script to use as arguments);
  • If /var/mysite/abc is a directory, descend into it and look for def.php in /var/mysite/abc/ in the same way;
  • If /var/mysite/abc/def is a directory as well, look for a file named ghi.php in /var/mysite/abc/def/;
  • If there is a ghi folder in /var/mysite/abc/def/, look for an index.php file inside it instead;
  • If the file it ends up with does not exist, give a 404 Not Found error.
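The lookup can also be sketched as a standalone function. This is only an illustration that mirrors the loop in the dispatcher script: the function name resolve_module and the throwaway directory tree are made up for the demo, and the caller would still have to check that the returned file exists. Note that a directory is descended into before a file of the same name is considered.

```php
<?php
// Hypothetical helper (not part of the framework): map URL segments to a
// candidate module file below $modDir. Existence is left to the caller.
function resolve_module(string $modDir, array $segments, string $default = 'index'): string
{
    $path = $modDir;
    $i = 0;
    // Descend while the current path is a directory and segments remain.
    while (is_dir($path) && $i < count($segments)) {
        $path .= '/' . $segments[$i++];
    }
    // Ended on a directory: fall back to that directory's default module.
    if (is_dir($path)) {
        $path .= '/' . $default;
    }
    return $path . '.php';
}

// Demo against a throwaway tree in the temp directory (assumed layout).
$root = sys_get_temp_dir() . '/mod_demo_' . uniqid();
mkdir($root . '/abc/def', 0777, true);
touch($root . '/abc/def/ghi.php');
touch($root . '/abc/index.php');

echo resolve_module($root, ['abc', 'def', 'ghi']), "\n"; // <root>/abc/def/ghi.php
echo resolve_module($root, ['abc']), "\n";               // <root>/abc/index.php
echo resolve_module($root, ['nope', 'x']), "\n";         // <root>/nope.php
```

In the last call the first segment is not a directory, so the search stops there and 'x' is simply left over as an argument.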
Ok, enough talking, let's see some code! I have tried my best to comment what everything does:

    <?php
    # Main dispatcher script

    # A (tiny) bit of configuration
    # This can be in another file and require()d
    $_C = array(
        # Directory in which to search for modules
        'MOD_DIR' => './mod',
        # The default module for a directory
        'DEF_MOD' => 'index',
        # The module to be loaded if no module fits the request
        'NOT_FOUND' => './mod/not-found.php',
        # The module to be loaded if a possible attack is detected
        'FORBIDDEN' => './mod/forbidden.php'
    );

    # ------
    # Do initializing things here
    # like connect to your database, start a user session etc.
    # ------

    # Get the path part of the requested URI and remove any surrounding
    # dangerous characters, like . and / which could mean importing things
    # from outside the local directory. The cast covers requests that have
    # no path part left after trimming (parse_url() returns null for those).
    $safe_path = (string) parse_url(trim($_SERVER['REQUEST_URI'], './'), PHP_URL_PATH);

    # Get the parts from the requested path
    $_ARG = explode('/', $safe_path);

    # Prepare $_ARG -- urldecode everything
    for ($i = 0; $i < count($_ARG); $i++) {
        $_ARG[$i] = urldecode($_ARG[$i]);
    }

    $mod_path = $_C['MOD_DIR'];

    # Search through the modules directory. We will descend into
    # subdirectories to search for modules, too.
    $i = 0;
    while (is_dir($mod_path) && $i < count($_ARG)) {
        $mod_path .= '/' . $_ARG[$i++];
    }

    # If $mod_path is still a directory, we look for a default module
    # file in that directory.
    if (is_dir($mod_path)) $mod_path .= '/' . $_C['DEF_MOD'];
    $mod_path .= '.php';

    # Fall back to the configured "not found" module
    if (!is_file($mod_path)) $mod_path = $_C['NOT_FOUND'];

    # More safety checks -- basically, check if the final module path
    # is in the modules directory. The trailing separator matters: without
    # it, a sibling directory such as "mod-secret" would also match "mod".
    $mod_path = realpath($mod_path);
    $dir_name = realpath($_C['MOD_DIR']);
    if ($mod_path === false || strpos($mod_path, $dir_name . DIRECTORY_SEPARATOR) !== 0)
        $mod_path = $_C['FORBIDDEN'];

    # Include the file. It will have access to the $_ARG variable
    # to make its life easier.
    require_once $mod_path;
    ?>

Pretty small for a framework, eh? Sure, it's not ready for production... but it's close!

3. Creating a basic "Hello World!" application - this is really basic and contains only three modules (besides not-found and forbidden). It illustrates how dispatching works and how flexible this is (you can do anything you want, you don't have to use any framework-specific classes or function calls). The code pretty much speaks for itself. These are the files and folders that I placed in the modules directory (the two error pages escape the requested path before printing it, since it comes straight from the user):

index.php

    Hello World!<br>
    Let me count from 1 to 10:
    <?php for ($i=1; $i <= 10; $i++) echo $i . ' '; ?><br>
    <a href="/sayhello">Click here</a> if you want me to greet you!

sayhello/index.php

    <form action="/sayhello/say" method="get">
        Your name: <input type="text" name="name">
        <input type="submit">
    </form>

sayhello/say.php

    <?php
    if (!empty($_GET['name'])) {
        # Redirect to the "pretty" URL and stop here
        header("Location: /sayhello/say/" . urlencode($_GET['name']));
        exit;
    } else {
        # The name is the third path segment, if there is one
        $name = isset($_ARG[2]) ? $_ARG[2] : '';
        echo "Hello <strong>" . htmlentities($name) . "</strong>!";
    }
    ?>

not-found.php

    <?php header("HTTP/1.1 404 Not Found") ?>
    <h2>404 Not Found</h2>
    <p>The requested resource was not found<br><code><?php echo htmlentities($safe_path) ?></code></p>

forbidden.php

    <?php header("HTTP/1.1 403 Forbidden") ?>
    <h2>403 Forbidden</h2>
    <p>You do not have access to the requested resource<br><code><?php echo htmlentities($safe_path) ?></code></p>

Was that simple, or what? Here's a 1.7KB archive of the whole "project" (including the framework and the sample application): your-own-framework.tar.gz.

I appreciate any feedback, positive or negative. Please note that English is not my mother tongue, so if you spot any language mistakes, please let me know. What I'm most interested in is whether someone is able to "hack" this framework (i.e. make it load a script outside of its configured module path).

In conclusion: stop using complicated frameworks whose inner workings you don't understand. Make your own! :-)
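On the "hack" question: one thing worth testing in a check of the kind "is the final path inside the modules directory" is a sibling directory whose name merely starts with the same characters. Below is a minimal sketch of such a containment check. The function name is_inside_dir and the demo directories are hypothetical, not part of the framework; the sketch assumes both paths exist so that realpath() can resolve them.

```php
<?php
// Hypothetical helper: true only if $path resolves to something inside $baseDir.
function is_inside_dir(string $path, string $baseDir): bool
{
    $real = realpath($path);
    $base = realpath($baseDir);
    if ($real === false || $base === false) {
        return false; // one of the two does not exist
    }
    // The trailing separator stops ".../mod-secret" from matching ".../mod".
    $base = rtrim($base, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
    return strncmp($real, $base, strlen($base)) === 0;
}

// Demo with a throwaway tree in the temp directory (assumed layout).
$root = sys_get_temp_dir() . '/contain_demo_' . uniqid();
mkdir($root . '/mod', 0777, true);
mkdir($root . '/mod-secret', 0777, true);
touch($root . '/mod/index.php');
touch($root . '/mod-secret/x.php');

var_dump(is_inside_dir($root . '/mod/index.php', $root . '/mod'));           // bool(true)
var_dump(is_inside_dir($root . '/mod/../mod-secret/x.php', $root . '/mod')); // bool(false)
```

A plain prefix comparison without the trailing separator would accept the second path, because the resolved path still begins with the modules directory's name.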
