Closing HTML Tags Automatically

0

I've allowed users to use <pre></pre> tags to display code in comments and articles, but I've come across a problem that I'm struggling to solve. When a user fails to close an HTML tag, for example:

    <pre>
        <html>
            <head>

            </head>
    </pre>

the comment appears to be blank. What I'm looking for is some sort of function that will automatically close any HTML tags that the user missed.

Thanks in advance.

2012-04-03 23:45
by NoName
Which would you like, JS or PHP? Be careful with user data, especially when users can write HTML (and/or JavaScript) - funerr 2012-04-03 23:47
PHP please. I don't like the idea of users being able to override the verification by disabling JS - NoName 2012-04-03 23:49
I found this with a quick search: http://www.barattalo.it/html-fixer - uotonyh 2012-04-03 23:54
@uotonyh Thanks a bunch. - NoName 2012-04-03 23:59


2

Well, it's going to get nasty if you don't use a framework, but your courage is admired. Hopefully this will be a nudge in the right direction.

The simplest, non-framework solution I can think of is using a stack to push and pop tags while parsing the string from the user.

Pseudocode:

    userData = getUserData();
    stack = array();
    loop (line in userData) {
       matches = search for "<*>"; // may have multiple on one line
       loop (match in matches) {
          tagName = getTagNameFrom(match);
          if (match is an opening tag, i.e. has no "/") {
             push tagName on stack;
          } else {
             // Closing tag: it is an error if the stack is empty
             // or if the popped name differs from tagName.
             popped = pop from stack;
          }
       }
    }
    // Any names still on the stack belong to tags that were never closed.

This is by no means comprehensive and a framework is really recommended here, but hopefully it can help out a bit.
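
For illustration, here is a minimal runnable sketch of the stack idea. It is written in Python rather than the PHP the asker wants, so treat it as logic to port, not a drop-in solution. The names `TagCloser`, `close_tags` and `VOID_TAGS` are made up for this example. It uses the standard library's `html.parser` to find tags instead of a hand-written `<*>` search. It assumes a stray closing tag can simply be dropped, and it skips doctype declarations for brevity. It only balances tags; it does not sanitise user input.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagCloser(HTMLParser):
    """Re-emits the input and tracks open tags (hypothetical helper)."""

    def __init__(self):
        # Keep entities such as &amp; untouched instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.out = []    # pieces of the repaired markup
        self.stack = []  # names of the tags that are currently open

    def handle_starttag(self, tag, attrs):
        self.out.append(self.get_starttag_text())
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit as-is, nothing to track.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag with no opener: drop it (assumption)
        # Close everything opened after the matching opener, then the tag itself.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")


def close_tags(fragment):
    """Return the fragment with a closing tag appended for every unclosed tag."""
    parser = TagCloser()
    parser.feed(fragment)
    parser.close()
    # Whatever is still on the stack was never closed by the user.
    while parser.stack:
        parser.out.append(f"</{parser.stack.pop()}>")
    return "".join(parser.out)


if __name__ == "__main__":
    print(close_tags("<pre><html><head></head></pre>"))
```

Run on the snippet from the question, the unclosed `<html>` is closed just before `</pre>`. Real-world markup has many more corner cases (optional end tags, `<script>` content, malformed attributes), which is why a dedicated library is still the safer choice.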

2012-04-04 00:08
by cha55son


0

You can use HTML Tidy to solve this problem. It finds and closes unclosed tags automatically.

Project Page

2012-04-03 23:48
by Starx
Thanks for the suggestion - I know I'm being fussy, but I'm not really for frameworks. I like to know exactly what's in my code and where. Your post is appreciated though - NoName 2012-04-03 23:50
@Terry you are going to have a long and tiring time writing EVERYTHING yourself instead of using frameworks that have all of the corner cases already resolved - nathanjosiah 2012-04-03 23:53
@Nathanjosiah I know, but I prefer it that way. I will use a framework, but only as a last resort - NoName 2012-04-03 23:54
@Terry I am all for writing things myself, but I'm not about to write my own HTML parser in JavaScript. - nathanjosiah 2012-04-03 23:55
I agree with nathanjosiah; an anti-framework mentality is really not a good thing. When I meet someone who decides to reinvent the wheel constantly, I can pretty much assume that their code is fickle and unmaintainable - kingcoyote 2012-04-04 00:31