
Comments on Migrating HTML strings to a more secure alternative

Post

Migrating HTML strings to a more secure alternative


Our team has to migrate an old application to a new tech stack. One of its features uses HTML editors that serialize their content as HTML, and I can see valid HTML stored in the database for those fields.

The HTML editor is quite limited (text formatting, links, but no scripts).

If I am not mistaken, this opens the door to HTML injection, and it should be fixed for security reasons.

The only way I see to solve this issue is switching from HTML to Markdown.

I would like to understand how to approach this task correctly. I would split it into two parts:

The converter

There seems to be a way to automatically convert HTML to Markdown using a library like Turndown. Since I only use basic HTML elements, I do not expect surprises here.

The editor

Users should be able to use a Markdown editor; one example I could use for an Angular SPA is angular-markdown-editor.

What I do not fully understand is how to tell whether the converter's output is compatible with the editor, as there seems to be a plethora of Markdown flavors.

I cannot find anything in the converter's docs about which Markdown flavor it generates. Is this a matter of trial and error, or is there an easier way?

I am also open to alternatives, if they solve the issue at least as easily as what I have described above.


1 comment thread

Make malicious HTML unrepresentable (1 comment)
Derek Elkins‭ wrote almost 3 years ago

I probably won't make this into an answer, but if security is a high priority, I'd recommend using an internal representation that is simply incapable of representing malicious code. That is, rather than using a rich but complicated representation like HTML, use a much narrower representation that supports only what you care about, e.g. a massively simplified DOM that you might store internally as JSON or a Protocol Buffer. I sketch a simple example in this answer. Then the only way malicious output could be produced is if your code that serializes this internal representation to HTML produces it, which is much easier to check. Sanitizers have a track record of being circumventable by malicious actors. There is a lot of engineering overhead to the approach I just described, though it has some additional benefits too, so it's often not worth it.
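
To make the idea in this comment concrete, here is a minimal TypeScript sketch of such a narrow representation. It is not the example the comment links to; the type names (`Inline`, `Block`, `Doc`), the helper functions, and the choice of allowed link schemes are all assumptions made for illustration, covering only text formatting and links as described in the question.

```typescript
// Hypothetical narrow document model: only the node kinds the editor needs.
// There is no node kind that could carry a script or an arbitrary attribute.
type Inline =
  | { kind: "text"; value: string }
  | { kind: "bold"; children: Inline[] }
  | { kind: "italic"; children: Inline[] }
  | { kind: "link"; href: string; children: Inline[] };

type Block = { kind: "paragraph"; children: Inline[] };
type Doc = Block[];

// Escape the characters that are significant in HTML text and attribute values.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Assumed policy: allow only http(s) and mailto links.
// Anything else (for example a javascript: URL) is replaced with "#".
function safeHref(href: string): string {
  const trimmed = href.trim();
  return /^(https?:|mailto:)/i.test(trimmed) ? trimmed : "#";
}

function renderChildren(children: Inline[]): string {
  return children.map(renderInline).join("");
}

// The serializer is the only place where HTML is produced,
// so it is the only code that needs a security review.
function renderInline(node: Inline): string {
  switch (node.kind) {
    case "text":
      return escapeHtml(node.value);
    case "bold":
      return `<strong>${renderChildren(node.children)}</strong>`;
    case "italic":
      return `<em>${renderChildren(node.children)}</em>`;
    case "link":
      return `<a href="${escapeHtml(safeHref(node.href))}">${renderChildren(node.children)}</a>`;
  }
}

function renderDoc(doc: Doc): string {
  return doc.map((b) => `<p>${renderChildren(b.children)}</p>`).join("");
}
```

With this model, a text node containing `<script>` is rendered as escaped text, and a link whose `href` does not match the allowed schemes is rendered with `#`. The static types alone do not protect data loaded from the database: JSON parsed at runtime would still need to be validated against this shape before it is rendered.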