Migrating HTML strings to a more secure alternative
Our team has to migrate an old application to a new tech stack. One of its features uses HTML editors that serialize their content as HTML, and I can see valid HTML stored in the database for those fields.
The HTML editor is quite limited (text formatting, links, but no scripts).
If I am not mistaken, this opens the door to HTML injection, which should be fixed for security reasons.
The only way I see to solve this issue is switching from HTML to Markdown.
I would like to understand how to correctly approach this task. I would split it into two parts:
The converter
There seems to be a way to automatically convert HTML to Markdown using a library like Turndown. Since I only use basic HTML elements, I do not expect surprises here.
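Before converting, the "no surprises" assumption could be checked against the stored data. Below is a minimal sketch of such an audit. The allowlist `EXPECTED_TAGS` and the function name `findUnexpectedTags` are hypothetical and not taken from any library. The regex scan is only a rough report of which tag names occur; it is not an HTML parser and should not be used as a sanitizer.

```typescript
// Hypothetical allowlist: adjust it to whatever the old editor can actually emit.
const EXPECTED_TAGS: ReadonlySet<string> = new Set<string>([
  "p", "br", "b", "strong", "i", "em", "u", "a", "ul", "ol", "li",
]);

// Returns the sorted, de-duplicated names of all tags found in `html`
// that are not in `allowed`. Opening and closing tags are both counted.
// For auditing only: a regex scan is not a sanitizer.
function findUnexpectedTags(
  html: string,
  allowed: ReadonlySet<string> = EXPECTED_TAGS,
): string[] {
  const tagPattern = /<\/?([a-zA-Z][a-zA-Z0-9-]*)/g;
  const unexpected = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = tagPattern.exec(html)) !== null) {
    const name = (match[1] ?? "").toLowerCase();
    if (name !== "" && !allowed.has(name)) {
      unexpected.add(name);
    }
  }
  return Array.from(unexpected).sort();
}
```

Running this over every stored field and collecting the results would show whether anything other than the expected formatting and link tags is present. It does not look at attributes, so event handlers or `javascript:` URLs in `href` would need a separate check.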
The editor
Users should be able to use a Markdown editor; one example I could use for an Angular SPA is angular-markdown-editor.
What I do not fully understand is how to tell whether the converter output is compatible with the editor, as there seems to be a plethora of Markdown flavors.
I cannot find anything in the docs for the converter/library about the Markdown flavor they are generating/using. Is this a matter of trial and error or is there an easier way?
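One way to reduce the trial and error might be to check the converted text for syntax that goes beyond core CommonMark, since basic emphasis, links and lists are shared by most flavors, while extensions such as tables, strikethrough and task lists are not. The following is a rough sketch under that assumption. The name `findFlavorSpecificSyntax` and the patterns are illustrative heuristics, not part of Turndown or of any editor, and they can produce false positives or miss cases.

```typescript
// Syntax that is not part of core CommonMark and therefore depends on the
// Markdown flavor (all three are GitHub Flavored Markdown extensions).
// The patterns are heuristics for flagging output, not a Markdown parser.
const FLAVOR_SPECIFIC: ReadonlyArray<{ name: string; pattern: RegExp }> = [
  // ~~struck text~~
  { name: "strikethrough", pattern: /~~[^~\n]+~~/ },
  // A table delimiter row such as "| --- | :-: |" or "---|---"
  { name: "table", pattern: /^[ \t]*\|?[ \t]*:?-+:?[ \t]*(\|[ \t]*:?-+:?[ \t]*)+\|?[ \t]*$/m },
  // A list item starting with "[ ]" or "[x]"
  { name: "task list", pattern: /^[ \t]*[-*+][ \t]+\[[ xX]\][ \t]/m },
];

// Returns the names of flavor-specific constructs that appear in `markdown`.
function findFlavorSpecificSyntax(markdown: string): string[] {
  return FLAVOR_SPECIFIC
    .filter((feature) => feature.pattern.test(markdown))
    .map((feature) => feature.name);
}
```

If the converter output for the existing data never triggers any of these checks, compatibility problems with the editor are less likely, although rendering a sample of converted fields in the editor would still be a sensible final check.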
I am also open to alternatives, if they solve the issue at least as easily as what I have mentioned above.
1 answer
Switching from HTML to Markdown to minimize the risk of HTML injection doesn't make a lot of sense to me, since most Markdown implementations support a subset of inline HTML anyway. The better ones control which subset of HTML to allow by using a sanitizing library, such as (to take just one random example) Bleach. If all you're concerned about is security, you might as well use such a library yourself. You'll get the same level of safety (i.e., you're fine as long as the underlying library isn't exploitable), there will be fewer moving parts to audit, and much less of the rest of the system will need to change or be migrated.