I have seen a lot of conflicting answers about this. Many people love to point out that PHP functions alone will not protect you from XSS.

What XSS exactly can make it through htmlspecialchars and what can make it through htmlentities?

I understand the difference between the functions, but not the different levels of XSS protection each one leaves you with. Could anyone explain?

htmlspecialchars() will NOT protect you against UTF-7 XSS exploits, which still plague Internet Explorer, even IE 9: http://securethoughts.com/2009/05/exploiting-ie8-utf-7-xss-vulnerability-using-local-redirection/

For instance:


$_GET['password'] = 'asdf&ddddd"fancy˝quotes˝';

echo htmlspecialchars($_GET['password'], ENT_COMPAT | ENT_HTML401, 'UTF-8') . "\n";

// Output: asdf&amp;ddddd&quot;fancyË

echo htmlentities($_GET['password'], ENT_COMPAT | ENT_HTML401, 'UTF-8') . "\n";

// Output: asdf&amp;ddddd&quot;fancy&Euml;quotes

You should always use htmlentities and only rarely use htmlspecialchars when sanitizing user input. Also, you should always strip tags first. And for really important and secure sites, you should NEVER trust strip_tags(); use HTMLPurifier for PHP instead.

If PHP’s header command is used to set the charset

header('Content-Type: text/html; charset=utf-8');

then htmlspecialchars and htmlentities should both be safe for output of HTML because XSS cannot then be achieved using UTF-7 encodings.
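A minimal sketch of that setup, assuming a hypothetical helper named e() that is not part of the original answer: the header fixes the response charset, and every value written into HTML goes through one escaping function that uses the same charset.

```php
<?php
// Declare the charset explicitly so the browser never has to guess it.
// header() only works before any output has been sent, hence the guard.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

/**
 * Hypothetical helper: escape a value for an HTML text node or a quoted
 * attribute, using the same charset as the response header.
 */
function e(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

echo '<p>' . e('<script>alert(1)</script>') . '</p>', "\n";
// <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
```

Keeping the flags and charset in one place avoids call sites that silently fall back to version-dependent defaults.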

Please note that these functions should not be used for output of values into JavaScript or CSS, because it would be possible to enter characters that break out of the JavaScript or CSS context and put your site at risk. Please see the XSS Prevention Cheat Sheet on how to handle these situations appropriately.
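For the JavaScript case, one common approach is to emit the value as a JSON literal instead of HTML-escaping it. The sketch below assumes PHP 7.3 or later (for JSON_THROW_ON_ERROR) and uses a hypothetical helper name, js_literal(); it does not cover CSS.

```php
<?php
/**
 * Hypothetical helper: encode a PHP value as a JavaScript literal that can
 * be placed inside a <script> block.
 */
function js_literal($value): string
{
    // The JSON_HEX_* flags turn <, >, &, ' and " into \uXXXX escapes, so the
    // value cannot close the script element or break out of a string.
    return json_encode(
        $value,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_THROW_ON_ERROR
    );
}

$userInput = '</script><script>alert(1)</script>';
echo '<script>var name = ' . js_literal($userInput) . ';</script>', "\n";
```

The browser's JavaScript parser turns the escapes back into the original characters, so the script still receives the exact string the user entered.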

I'm not sure if you have found the answer you were looking for, but I am also looking for an HTML cleaner. I am building an application and want to be able to take HTML code, possibly even JavaScript or other languages, and put it into a MySQL DB without causing problems or allowing XSS. I've found HTML Purifier, and it appears to be the most developed and still maintained tool for cleaning up user-submitted information on a PHP system. The page linked is their comparison page, which explains why theirs or another tool could be useful. Hope this helps!

You can't sanitize all types of XSS with htmlspecialchars. htmlspecialchars may help protect against XSS inside HTML tags or some quoted HTML attributes.

You have to sanitize each type of XSS with its own sanitization method.

  • User input placed inside HTML:

    <p><?php echo $user_entered_variable; ?></p>

    Attack vector: <script>alert(1)</script>

    This type of XSS can be sanitized with the htmlspecialchars function, because the attacker needs < and > to create a new HTML tag.

    <p><?php echo htmlspecialchars($user_entered_variable); ?></p>

  • User input placed inside a single-quoted attribute:

    <img title='<?php echo htmlspecialchars($user_entered_variable);?>'/>

    Attack vector: ' onload='alert(1)' '

    Before PHP 8.1, htmlspecialchars does not encode the single quote ' by default. You must turn it on with the ENT_QUOTES option.

    <img title='<?php echo htmlspecialchars($user_entered_variable, ENT_QUOTES);?>'/>

  • User input placed inside URL attributes: src, href, formaction, …

    <iframe src="<?php echo htmlspecialchars($user_entered_variable); ?>"></iframe>

    <img src="<?php echo htmlspecialchars($user_entered_variable); ?>">

    <a href="<?php echo htmlspecialchars($user_entered_variable); ?>">Link</a>

    <script>function openLink(link) { window.open(link); }</script>

    <button onclick="openLink('<?php echo htmlspecialchars($user_entered_variable); ?>')">JavaScript Window XSS</button>

    Attack vector: javascript:alert(1), javascript://alert(1)

    htmlspecialchars will not prevent those vectors, because they contain no HTML special characters. To prevent such attacks, you need to validate the input as a URL.



    <?php
    $user_entered_variable = htmlspecialchars($user_entered_variable);

    $isValidURL = filter_var($user_entered_variable, FILTER_VALIDATE_URL) !== false;

    // Never output a value that failed validation.
    if (!$isValidURL) { $user_entered_variable = ''; }
    ?>

    <iframe src="<?php echo $user_entered_variable; ?>"></iframe>

    <img src="<?php echo $user_entered_variable; ?>">

    <a href="<?php echo $user_entered_variable; ?>">Link</a>

    <script>function openLink(link) { window.open(link); }</script>

    <button onclick="openLink('<?php echo $user_entered_variable; ?>')">JavaScript Window XSS</button>

  • User input placed inside a JavaScript tag without any quotes:

    <script>
    var inputNumber = <?php echo $user_entered_variable; ?>;
    </script>

    In some cases we can simply quote the input and prevent the attack by sanitizing it with htmlspecialchars, but if we need the input to be an integer, we can prevent XSS with input validation.

    <script>
    var inputNumber = <?php echo intval($user_entered_variable); ?>;
    </script>


    Always quote variables when they are placed inside an HTML attribute, and sanitize them properly.
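The per-context rules above can be sketched as a few small helpers. The names esc_html() and safe_url() are hypothetical, and the http/https allowlist is an assumption added here: FILTER_VALIDATE_URL on its own does not restrict the scheme, so the scheme is checked separately.

```php
<?php
/** HTML text or quoted attribute value: escape &, <, >, " and '. */
function esc_html(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

/**
 * URL attribute (src, href, ...): accept only absolute http(s) URLs.
 * Returns null when the value must not be used.
 */
function safe_url(string $value): ?string
{
    if (filter_var($value, FILTER_VALIDATE_URL) === false) {
        return null;
    }
    // FILTER_VALIDATE_URL does not restrict the scheme, so check it here.
    $scheme = strtolower((string) parse_url($value, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true) ? $value : null;
}

// Single-quoted attribute: the quotes in the attack vector are encoded.
$title = "' onload='alert(1)' '";
echo "<img title='" . esc_html($title) . "'/>\n";

// URL attribute: a javascript: URL is rejected before it is ever escaped.
$url = safe_url('javascript://alert(1)');
echo $url === null ? "rejected\n" : '<a href="' . esc_html($url) . '">Link</a>' . "\n";

// Unquoted numeric JavaScript value: force an integer.
echo '<script>var inputNumber = ' . intval('1; alert(1)') . ";</script>\n";
```

Validation decides whether a URL may be used at all; escaping is still applied afterwards because the accepted URL can contain & or quotes.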

    Convert the predefined characters "<" (less than) and ">" (greater than) to HTML entities:

    $str = "This is some <b>bold</b> text.";
    echo htmlspecialchars($str);

    The HTML output of the code above will be (View Source):

    <!DOCTYPE html>
    <body>
    This is some &lt;b&gt;bold&lt;/b&gt; text.
    </body>

    The browser output of the code above will be:

    This is some <b>bold</b> text.

    Definition and Usage

    The htmlspecialchars() function converts some predefined characters to HTML entities.

    The predefined characters are:

    • & (ampersand) becomes &amp;

    • " (double quote) becomes &quot;

    • ' (single quote) becomes &#039;

    • < (less than) becomes &lt;

    • > (greater than) becomes &gt;

    Tip: To convert special HTML entities back to characters, use the htmlspecialchars_decode() function.
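    As a small illustration of that round trip (the sample string is made up for this sketch), encoding and then decoding with the same quote flag returns the original text:

```php
<?php
$original = 'Tom & "Jerry" <b>';

// Encode the five predefined characters, then reverse the conversion.
$encoded = htmlspecialchars($original, ENT_QUOTES, 'UTF-8');
$decoded = htmlspecialchars_decode($encoded, ENT_QUOTES);

echo $encoded, "\n";               // Tom &amp; &quot;Jerry&quot; &lt;b&gt;
var_dump($decoded === $original);  // bool(true)
```

    Passing the same quote flag to both calls matters: decoding with ENT_NOQUOTES, for example, would leave &quot; in place.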






    string: Required. Specifies the string to convert


    flags: Optional. Specifies how to handle quotes, invalid encoding and the used document type.

    The available quote styles are:

    • ENT_COMPAT – Encodes only double quotes (the default before PHP 8.1)

    • ENT_QUOTES – Encodes double and single quotes (part of the default since PHP 8.1)

    • ENT_NOQUOTES – Does not encode any quotes

    Invalid encoding:

    • ENT_IGNORE – Ignores invalid encoding instead of having the function return an empty string. Should be avoided, as it may have security implications.

    • ENT_SUBSTITUTE – Replaces invalid encoding for a specified character set with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD; instead of returning an empty string.

    • ENT_DISALLOWED – Replaces code points that are invalid in the specified doctype with a Unicode Replacement Character U+FFFD (UTF-8) or &#xFFFD;

    Additional flags for specifying the used doctype:

    • ENT_HTML401 – Default. Handle code as HTML 4.01

    • ENT_HTML5 – Handle code as HTML 5

    • ENT_XML1 – Handle code as XML 1

    • ENT_XHTML – Handle code as XHTML


    character-set: Optional. A string that specifies which character-set to use.

    Allowed values are:

    • UTF-8 – Default. ASCII compatible multi-byte 8-bit Unicode

    • ISO-8859-1 – Western European

    • ISO-8859-15 – Western European (adds the Euro sign + French and Finnish letters missing in ISO-8859-1)

    • cp866 – DOS-specific Cyrillic charset

    • cp1251 – Windows-specific Cyrillic charset

    • cp1252 – Windows specific charset for Western European

    • KOI8-R – Russian

    • BIG5 – Traditional Chinese, mainly used in Taiwan

    • GB2312 – Simplified Chinese, national standard character set

    • BIG5-HKSCS – Big5 with Hong Kong extensions

    • Shift_JIS – Japanese

    • EUC-JP – Japanese

    • MacRoman – Character-set that was used by Mac OS

    Note: Unrecognized character-sets will be ignored and replaced by ISO-8859-1 in versions prior to PHP 5.4. As of PHP 5.4, they will be ignored and replaced by UTF-8.


    double_encode: Optional. A boolean value that specifies whether to encode existing HTML entities or not.

    • TRUE – Default. Will convert everything

    • FALSE – Will not encode existing html entities
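    A short sketch of the double_encode parameter, using a made-up input that already contains one entity:

```php
<?php
$text = 'Fish &amp; chips < tea';

// Default (double_encode = true): the existing &amp; is encoded again.
$twice = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
echo $twice, "\n"; // Fish &amp;amp; chips &lt; tea

// double_encode = false: existing entities are left as they are.
$once = htmlspecialchars($text, ENT_QUOTES, 'UTF-8', false);
echo $once, "\n";  // Fish &amp; chips &lt; tea
```

    Leaving the default on is usually the safer choice for untrusted input, since the text is then displayed exactly as the user typed it.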

    Technical Details

    Return Value: Returns the converted string

    If the string contains invalid encoding, it will return an empty string, unless either the ENT_IGNORE or ENT_SUBSTITUTE flags are set

    PHP Version: 4+

    Changelog:
    PHP 8.1 – Changed the default flags from ENT_COMPAT to ENT_QUOTES | ENT_SUBSTITUTE | ENT_HTML401.
    PHP 5.6 – Changed the default value for the character-set parameter to the value of the default charset (in configuration).
    PHP 5.4 – Changed the default value for the character-set parameter to UTF-8.
    PHP 5.3 – Added ENT_IGNORE constant.
    PHP 5.2.3 – Added the double_encode parameter.
    PHP 4.1 – Added the character-set parameter.
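    The invalid-encoding behaviour described under Return Value can be seen with a deliberately broken UTF-8 string. This sketch passes the flags explicitly so the result does not depend on version-specific defaults, and it assumes PHP 7 or later for the "\u{...}" escape:

```php
<?php
$invalid = "abc\xFFdef"; // the byte 0xFF never occurs in valid UTF-8

// Without ENT_IGNORE or ENT_SUBSTITUTE the whole result is an empty string.
$strict = htmlspecialchars($invalid, ENT_COMPAT, 'UTF-8');

// With ENT_SUBSTITUTE the bad byte becomes U+FFFD and the rest is kept.
$substituted = htmlspecialchars($invalid, ENT_COMPAT | ENT_SUBSTITUTE, 'UTF-8');

var_dump($strict === '');                    // bool(true)
var_dump($substituted === "abc\u{FFFD}def"); // bool(true)
```

    An unexpected empty string from htmlspecialchars is therefore often a sign of an encoding mismatch rather than empty input.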

    More Examples


    Convert some predefined characters to HTML entities:

    $str = "Jane & 'Tarzan'";
    echo htmlspecialchars($str, ENT_COMPAT); // Will only convert double quotes
    echo "<br>";
    echo htmlspecialchars($str, ENT_QUOTES); // Converts double and single quotes
    echo "<br>";
    echo htmlspecialchars($str, ENT_NOQUOTES); // Does not convert any quotes

    The HTML output of the code above will be (View Source):

    <!DOCTYPE html>
    <body>
    Jane &amp; 'Tarzan'<br>
    Jane &amp; &#039;Tarzan&#039;<br>
    Jane &amp; 'Tarzan'
    </body>

    The browser output of the code above will be:

    Jane & 'Tarzan'
    Jane & 'Tarzan'
    Jane & 'Tarzan'

    Convert double quotes to HTML entities:

    $str = 'I love "PHP".';
    echo htmlspecialchars($str, ENT_QUOTES); // Converts double and single quotes

    The HTML output of the code above will be (View Source):

    <!DOCTYPE html>
    <body>
    I love &quot;PHP&quot;.
    </body>

    The browser output of the code above will be:

    I love "PHP".

    What does htmlspecialchars() return?

    The htmlspecialchars() function returns the converted string.

    What's the difference between htmlentities() and htmlspecialchars()?

    The only difference between these functions is that htmlspecialchars() converts only the special characters to HTML entities, whereas htmlentities() converts all applicable characters to HTML entities.
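    A quick side-by-side sketch of that difference, assuming the script file itself is saved as UTF-8 (the sample string is invented for illustration):

```php
<?php
$text = 'Café <b>€5</b>';

// htmlspecialchars touches only &, <, >, " and '; é and € pass through.
$special = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
echo $special, "\n";  // Café &lt;b&gt;€5&lt;/b&gt;

// htmlentities also converts every character that has a named entity.
$entities = htmlentities($text, ENT_QUOTES, 'UTF-8');
echo $entities, "\n"; // Caf&eacute; &lt;b&gt;&euro;5&lt;/b&gt;
```

    Both calls neutralise the <b> tag in the same way; the extra entities from htmlentities change how non-ASCII characters are written, not which markup is escaped.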

    Does htmlspecialchars prevent XSS?

    The htmlspecialchars() function converts special characters to HTML entities. For the majority of web apps this is a suitable method, and it is one of the most popular ways to prevent XSS. This process is also known as HTML escaping.

    What is the use of htmlentities() in PHP?

    The htmlentities() function converts characters to HTML entities. Tip: To convert HTML entities back to characters, use the html_entity_decode() function. Tip: Use the get_html_translation_table() function to return the translation table used by htmlentities().

