Mar 8 2005

Output Code

This is the method I use in the comments area to turn everything between code tags into HTML entities: a PHP function that takes a string as a parameter and returns the result, all formatted and ready to go.

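function prep_code($string) {
  $string = htmlspecialchars($string);                  // escape &, <, > and quotes
  $string = str_replace("\t", '  ', $string);           // tab -> two spaces
  $string = str_replace('  ', '&nbsp;&nbsp;', $string); // space pair -> two non-breaking spaces
  $string = nl2br($string);                             // add a break tag at each newline
  return $string;
}

// Normalize line endings, then collapse runs of blank lines.
$text = preg_replace("/(\r\n|\n|\r)/", "\n", $text);
$text = preg_replace("/\n\n+/", "\n\n", $text);
// The e modifier evaluates the replacement as PHP code (removed in PHP 7.0).
$text = preg_replace(
            "^(<code>)\n?([\S\s]*?)\n?(</code>)^ie",
            "'<code>' . prep_code('$2') . '</code>'",
            $text
        );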

The `prep_code` function

First, this function takes its argument and converts special characters into their HTML entities. Next, we replace any tab characters with two spaces. While entering a tab character directly into a form textarea is not too common, handling it here allows for copying code out of a text editor and pasting it into the field. Once the tabs are converted, we convert any pair of consecutive spaces to two consecutive non-breaking spaces. If I were to convert every space to a non-breaking space, any code that was too long for the column would force the column wider, thus breaking my floats. Converting only pairs of spaces preserves the indentation while letting the rest of the code wrap if necessary. Lastly, we convert any newline characters to break tags and return the resulting string. This function is called from within preg_replace in the last statement.
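To make those steps concrete, here is a small sketch of what the function returns for a tab-indented snippet. The function body is repeated so the example runs on its own, and the sample input is made up for illustration.

```php
<?php
// Same steps as prep_code, repeated here so the example is self-contained.
function prep_code($string) {
  $string = htmlspecialchars($string);                  // escape &, <, > and quotes
  $string = str_replace("\t", '  ', $string);           // tab -> two spaces
  $string = str_replace('  ', '&nbsp;&nbsp;', $string); // space pair -> two non-breaking spaces
  $string = nl2br($string);                             // insert <br /> before each newline
  return $string;
}

// Hypothetical sample: a three-line block whose middle line is indented with a tab.
$sample = "if (\$a < \$b) {\n\techo \$a;\n}";

echo prep_code($sample);
// Prints:
// if ($a &lt; $b) {<br />
// &nbsp;&nbsp;echo $a;<br />
// }
```

The single spaces around the `<` are left alone, so a long line can still wrap there, while the indentation survives as non-breaking spaces.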

Making the Conversion

Next, any newline sequence (\r\n, \r, or \n) is converted to a single \n newline. This eliminates any newline discrepancies between browsers or from copying and pasting. Then any group of two or more newline characters is converted to exactly two newline characters, which cleans up any messy user formatting. Once the newlines are taken care of, we run preg_replace to find anything between code tags and pass it through the prep_code function.
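The two newline passes can be tried in isolation. This is a minimal sketch with a made-up input string, assuming the first pattern is meant to match \r\n, \n, or \r.

```php
<?php
// Hypothetical input mixing \r\n, \r and \n line endings, plus a run of blank lines.
$text = "one\r\ntwo\rthree\n\n\n\nfour";

// \r\n is listed first in the alternation, so it becomes one \n rather than two.
$text = preg_replace("/(\r\n|\n|\r)/", "\n", $text);

// Collapse any run of two or more newlines down to exactly two.
$text = preg_replace("/\n\n+/", "\n\n", $text);

var_dump($text === "one\ntwo\nthree\n\nfour"); // bool(true)
```

The order of the alternatives matters: if \n or \r came before \r\n, each half of a \r\n pair would be replaced separately and every such line ending would turn into a blank line.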

The RegEx

I’ll walk you through this regular expression, for those who might not have the best grasp of regex. (I admit that I myself am far from understanding the full power of regular expressions.) We first look for the <code> element, followed optionally by a newline character. This optional newline character is found again later in the expression, and I do this so that we can write our comment code like this…

<code>
<ol>
  <li>List Item One</li>
  <li>List Item Two</li>
  <li>List Item Three</li>
</ol>
</code>

… and not end up with line breaks before and after our converted code. It just helps me visually. Then anything between <code> and </code> (optionally including a single newline character on each end) is replaced with the result of the prep_code function.
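The e modifier that this replacement relies on was removed in PHP 7.0, so on newer versions the same idea has to be written with preg_replace_callback. What follows is only a sketch of one way to do that, not the code this site runs: the wrapper name convert_code_blocks is made up, and the delimiter is switched to ~ purely for readability.

```php
<?php
// Same steps as prep_code, repeated here so the example is self-contained.
function prep_code($string) {
  $string = htmlspecialchars($string);
  $string = str_replace("\t", '  ', $string);
  $string = str_replace('  ', '&nbsp;&nbsp;', $string);
  $string = nl2br($string);
  return $string;
}

// Hypothetical wrapper: runs everything between code tags through prep_code.
function convert_code_blocks($text) {
  return preg_replace_callback(
    '~(<code>)\n?([\S\s]*?)\n?(</code>)~i',
    function ($m) {
      // $m[2] is the text between the tags, minus one optional newline on each end.
      return '<code>' . prep_code($m[2]) . '</code>';
    },
    $text
  );
}

echo convert_code_blocks("Try this:\n<code>\n<b>bold</b>\n</code>\nDone.");
// Prints:
// Try this:
// <code>&lt;b&gt;bold&lt;/b&gt;</code>
// Done.
```

Because the callback receives the matched text as a plain PHP string, nothing is evaluated as code.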

That should do it. If you have a better way of doing it, I’d love to see it. I’m always open to learning/using new code.
