Strip HTML Tags Using JavaScript

I threw together a JavaScript function that strips HTML tags but allows some tags to be kept, similar to how the PHP function strip_tags() works.

As with the PHP version, it comes with the following two caveats:

Because strip_tags() does not actually validate the HTML, partial or broken tags can result in the removal of more text/data than expected.


This function does not modify any attributes on the tags that you allow using allowable_tags, including the style and onmouseover attributes that a mischievous user may abuse when posting text that will be shown to other users.
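
Both caveats are easy to reproduce. The sketch below uses a deliberately simplified stand-in called naiveStrip, a hypothetical helper written only for this illustration (it is neither PHP's implementation nor the function in this post). It removes every run of text between < and > unless the tag name appears in a keep list.

```javascript
// Hypothetical helper for illustration only: removes every "<...>" run
// unless the tag name is in `keep`. No validation of the HTML is done.
function naiveStrip(html, keep) {
    return html.replace(/<\/?([a-z][a-z0-9]*)?[^>]*>/gi, function (match, name) {
        return name && keep.indexOf(name.toLowerCase()) !== -1 ? match : '';
    });
}

// Caveat 1: a stray "<" is treated as the start of a tag, so the text
// up to the next ">" disappears along with it.
var broken = naiveStrip('1 < 2 and 3 > 2', []);

// Caveat 2: the <b> tag is allowed, so its attributes pass through untouched.
var unsafe = naiveStrip('<b onmouseover="alert(1)">hi</b><i>!</i>', ['b']);

console.log(broken); // "1  2"
console.log(unsafe); // '<b onmouseover="alert(1)">hi</b>!'
```

If the output is going to be shown to other users, the kept tags still need their attributes cleaned by some other means.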

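/**
 * Native JavaScript function to emulate the PHP function strip_tags.
 *
 * @param {string} str The original HTML string to filter.
 * @param {array|string} allowable_tags A tag name or array of tag
 * names to keep. Integers, objects, and strings that don't follow the
 * standard tag format of a letter followed by numbers and letters will
 * be ignored. This means that invalid tags will also be removed.
 * @return {string} The filtered HTML string.
 */
function strip_tags(str, allowable_tags) {
    // Accept a single tag name, an array of names, or nothing at all.
    allowable_tags = [].concat(allowable_tags == null ? [] : allowable_tags);
    var keep = '';
    allowable_tags.forEach(function(tag) {
        // Only plain tag names are kept: a letter, then letters or digits.
        if (('' + tag).match(/^[a-z][a-z0-9]*$/i))
            keep += (keep.length ? '|' : '') + tag;
    } );
    // Match "<" or "</" followed by a tag name that is not in the keep
    // list, then everything up to the next ">".
    var skip = keep.length ? '(?!(?:' + keep + ')\\b)' : '';
    return str.replace(new RegExp('<\\/?' + skip + '[a-z][^>]*>', 'ig'), '');
}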

An additional check has been implemented to avoid removing text that is not really a tag where possible: a match must open with < or </ followed directly by a letter, i.e. a potential tag name. It does not account for greater-than symbols within attribute values, so such a tag is only removed up to the first >. Comments will be retained, but can be removed with a similar regex:

var no_comments = strip_tags('This is not a comment. <!-- This is a comment. -->').replace(/<!--[\s\S]*?-->/g, '');
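
The tag-name check and its blind spot can also be seen in isolation. The pattern below is an assumed, simplified stand-in for the one built inside strip_tags (it keeps no tags at all), and TAG_LIKE is a name made up for this sketch.

```javascript
// Assumed, simplified pattern: "<" or "</" must be followed directly by a
// letter before anything is treated as a tag. No tags are kept.
var TAG_LIKE = /<\/?[a-z][^>]*>/gi;

// "<" followed by a space or a digit is not a potential tag name,
// so ordinary text survives.
var math = 'if a < b and b <3 then <em>ok</em>'.replace(TAG_LIKE, '');

// A ">" inside an attribute value ends the match early, leaving the
// rest of the tag behind as text.
var leak = '<a title="x > y">link</a>'.replace(TAG_LIKE, '');

// Comments start with "<!", not a letter, so they are left in place.
var comment = '<p>text</p><!-- note -->'.replace(TAG_LIKE, '');

console.log(math);    // "if a < b and b <3 then ok"
console.log(leak);    // ' y">link'
console.log(comment); // "text<!-- note -->"
```

Handling the second case properly means parsing quoted attribute values, which is beyond what a single regex like this is meant to do.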
