Commit 5509dd02 authored by Richard Mansfield

Strip out entities that are not ampersands but begin with 'amp' or '38' (fixes previous fix)

parent 87676291
@@ -178,7 +178,7 @@ class html2text
'/&(bull|#149|#8226);/i', // Bullet
'/&(pound|#163);/i', // Pound sign
'/&(euro|#8364);/i', // Euro sign
-'/&(?!(amp|#38))[^&;]+;/i', // Unknown/unhandled entities
+'/&(?!(amp;|#38;))[^&;]+;/i', // Unknown/unhandled entities
'/&(amp|#38);/i', // Ampersand
'/[ ]{2,}/' // Runs of spaces, post-handling
);
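
To make the effect of the lookahead change concrete, here is a minimal sketch that runs the old and new "unknown entity" patterns over the same sample string. It assumes PHP's `preg_replace` and an empty replacement string; the helper `strip_unknown_entities` and the sample text are hypothetical and not part of this commit, and the surrounding class may pair the pattern with a different replacement.

```php
<?php
// Hypothetical helper for illustration only (not from the commit):
// removes every match of one pattern, using '' as an assumed replacement.
function strip_unknown_entities(string $pattern, string $text): string
{
    return preg_replace($pattern, '', $text);
}

// Pattern before the commit: the lookahead rejects anything that merely
// STARTS with "amp" or "#38", so such entities are never stripped.
$old = '/&(?!(amp|#38))[^&;]+;/i';

// Pattern after the commit: the lookahead rejects only the exact
// entities "&amp;" and "&#38;", since the semicolon is now required.
$new = '/&(?!(amp;|#38;))[^&;]+;/i';

// Made-up sample: a real ampersand entity, two entities that only
// begin with "amp" / "#38", and one unrelated unknown entity.
$sample = 'A &amp; B &ampfoo; C &#381; D &zzz; E';

echo strip_unknown_entities($old, $sample), "\n";
echo strip_unknown_entities($new, $sample), "\n";
```

With the old pattern only `&zzz;` is removed, so `&ampfoo;` and `&#381;` pass through untouched. With the new pattern all three are removed and `&amp;` is left for the later `'/&(amp|#38);/i'` rule to handle.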