RobLab – A script to remove all HTML tags

I mentioned this in a comment I wrote recently, so I thought it was time to add a script to RobLab that removes all HTML tags from an input string.

Enjoy! 🙂
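For anyone who just wants the gist, here is a minimal sketch of the general idea. It is not the RobLab script itself: the function name `stripTags` is hypothetical, and the sketch assumes a simple regular expression is good enough for the input at hand.

```javascript
// Hypothetical helper, shown for illustration only (not the RobLab script).
// Removes anything that looks like a tag: "<", any run of characters
// other than ">", then ">".
function stripTags(input) {
  return String(input).replace(/<[^>]*>/g, '');
}

var sample = '<p>Hello <strong>Robert</strong></p>';
var plain = stripTags(sample); // "Hello Robert"
```

A pattern like this only handles well-formed tags, and a lone `<` without a closing `>` is left alone. It is meant for pulling plain text out of stored markup, not for sanitizing untrusted input.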

11 Comments

  • Egor Kloos says:

    This always makes me wonder how I store my own content. WYSIWYG editors in CMSs, like FCKEditor, assume that content is always stored with its layout formatting, when in many cases this is not desired. Handy script, I'll give it a look-see.

  • Robert Nyman says:

    Hi Egor,

    That was pretty much the reason for me writing this script to begin with. I needed to extract some text without formatting, and it was stored with HTML tags mixed in.

    Hopefully it helps you out when needed! 🙂

  • Lim Chee Aun says:

    If I'm not mistaken, you forgot to put more <code>&</code>s to the &'s, for the code example on that page:

    <code>/*

    This line is optional, it replaces escaped brackets with real ones,

    i.e. is replaced with >

    */</code>

  • Robert Nyman says:

    Lim,

    Thanks for your comment.

    If you mean that the source code of the example page was invalid due to unescaped ampersands, you're completely right. I've fixed that now.

    Thanks!

  • Lim Chee Aun says:

    Uh, no, I mean there's one line there that sounds a bit weird:

    i.e. < is replaced with < and > is replaced with >

    Somehow your commenting system here seems to have problems with HTML entities... 🙂 Sorry

  • Robert Nyman says:

    Lim,

    Ah, now I know what you mean. Yes, that looked a bit misleading, didn't it? 🙂

    It's corrected now, and reads:

    <code>&lt; is replaced with < and &gt; is replaced with ></code>

    And yes, my commenting system can do some weird HTML escaping things sometimes, sorry. 🙂

  • Hakan Bilgin says:

    I prefer extending the string object with a new method:

    String.prototype.stripHTML = function() {return this.replace(//g, '');}

    Example:

    var foo = 'hello robert…';

    alert(foo.stripHTML());

    Less code, more handy…

    /hbi

  • Hakan Bilgin says:

    Hmm…My regular expression was deleted in the submission… a new try?

    String.prototype.stripHTML = function() {return this.replace(/<.*?>/g, '');}

  • Robert Nyman says:

    Hakan,

    Absolutely, that's a nice approach!

    Sorry about the code stripping, good that you made it with your second comment.

    🙂

  • Brendan says:

    How about the following:

    ((/[(|&gt)]/g,”))

    where the second “” is actually “ampersandgt”.

  • Brendan says:

    Gee, my last post really got disfigured by your HTML editor! It’s impossible to see what I wrote in there! Is there some other way for me to send you my code?
