this post was submitted on 07 Jul 2024
49 points (81.8% liked)

It must be a pain to make a text box with the ability to add bold, italic, heading, etc. you know? All the bold text, italics, and headings would need to be saved in a database column to be retrieved later in their correct positions.

I don't know, I'm doing an internship learning C# ASP (started 2 months ago), and just got a "Shower Thought" while making an edit post function.

top 27 comments
[–] [email protected] 60 points 4 months ago (2 children)

That is a very unlikely approach.

Rich text in the modern world is almost exclusively solved by using markdown because it's such a trivial solution.

In earlier eras it was usually solved either using range tags (similar to HTML, sometimes literally HTML, more often custom stuff) or embedded boundary markers (something that marked a new boundary and then had a full definition of the styles to follow, sometimes omitting styles that didn't change, oftentimes in some insanely dense binary format for predictable scanning).

Usually, it's more sane to embed formatting in the string itself rather than having styling separately defined (i.e. CSS, kinda). Because otherwise storage would be a huge pain and reading would require a lot of non-consecutive disk scans.

[–] [email protected] 7 points 4 months ago (2 children)

Usually, it’s more sane to embed formatting in the string itself rather than having styling separately defined (i.e. CSS, kinda).

like this: <b>Bold Text</b>?

[–] [email protected] 12 points 4 months ago

Yes, but usually not actual HTML because then there are a lot of security issues to address. BBCode might even be a better choice, i.e. [b]Bold Text[/b]
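To make the BBCode point concrete, here is a minimal sketch in Python. The tag whitelist and the name `render_bbcode` are made up for illustration and do not come from any particular forum engine. The idea is to escape everything the user typed first, and only then turn a small set of known tags into HTML, so raw HTML in the input never reaches the page.

```python
import html
import re

# Hypothetical whitelist: BBCode tag name -> HTML tag name.
ALLOWED_TAGS = {"b": "strong", "i": "em", "u": "u"}

# Matches simple tags such as [b] or [/b]; tags with arguments are not handled.
_TAG_RE = re.compile(r"\[(/?)([a-z]+)\]", re.IGNORECASE)


def render_bbcode(source: str) -> str:
    """Escape all user text, then convert whitelisted BBCode tags to HTML."""
    escaped = html.escape(source)  # neutralises <, >, & and quotes

    def replace(match: re.Match) -> str:
        closing, name = match.group(1), match.group(2).lower()
        if name not in ALLOWED_TAGS:
            return match.group(0)  # unknown tags stay as literal text
        return f"<{closing}{ALLOWED_TAGS[name]}>"

    return _TAG_RE.sub(replace, escaped)


print(render_bbcode("[b]Bold Text[/b]"))
```

This sketch does not check that tags are balanced, so a stray `[b]` would produce an unclosed `<strong>`. A real renderer would also want to handle that.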

[–] [email protected] 6 points 4 months ago (1 children)

Indeed.

Source : I'm a dev.

[–] [email protected] 5 points 4 months ago (1 children)

Is there anyone in this instance who isn't a developer of some stack or another?

[–] [email protected] 6 points 4 months ago (1 children)
[–] [email protected] 5 points 4 months ago

...fuck. You're right.

[–] [email protected] -4 points 4 months ago (1 children)

Rich text in the modern world is almost exclusively solved by using markdown because it’s such a trivial solution.

citation needed

markdown is not a trivial solution: there are many different implementations, it's a barrier for non technical people and it allows you to embed any html, so you need an additional html sanitizer.

my definition of a "rich textbox" is a WYSIWYG field, and markdown does not help you with this?!

yes, you probably would not save the formatted text normalized over multiple database columns, and only use a single field for the text with formatting embedded in html or another format, and another one with the text without formatting for possible full text search. but even if you would solve this using markdown (which limits you to a quite small subset of text formatting and bad extensibility) you would still need a good data format to store the formatted text in memory that allows you to render the text. and markdown does not help you with this either?!
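A rough sketch of the two-field idea, assuming the formatted text is stored as HTML. The column names `body_html` and `body_text` and the helper names are invented for this example. It uses Python's standard `html.parser` to derive the plain-text copy that a full text search could index.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text content of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def to_plain_text(formatted: str) -> str:
    """Drop the tags, decode entities, and collapse runs of whitespace."""
    extractor = _TextExtractor()
    extractor.feed(formatted)
    extractor.close()
    return " ".join("".join(extractor.parts).split())


def make_row(formatted: str) -> dict:
    """Build the two values that would go into the two database fields."""
    return {"body_html": formatted, "body_text": to_plain_text(formatted)}


print(make_row("Hello <strong>World</strong>"))
```

The plain-text field is derived data, so it would have to be regenerated whenever the formatted field changes.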

[–] [email protected] 10 points 4 months ago* (last edited 4 months ago) (1 children)

If I wanted a WYSIWYG field I'd probably still use markdown. I could add buttons that inject the proper markdown symbols and use a JS markdown renderer for the text field. Tbh I'd be amazed if there weren't at least a dozen out-of-the-box packages that included a live rendered text area with a widget array.
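What a "bold" button would do to the underlying text can be sketched in a few lines. This is Python for illustration only (in a browser it would be JavaScript working on the textarea's selection), and `wrap_selection` is a hypothetical helper, not part of any editor package.

```python
def wrap_selection(text: str, start: int, end: int, marker: str = "**") -> str:
    """Surround text[start:end] with a markdown marker such as ** or _."""
    if not 0 <= start <= end <= len(text):
        raise ValueError("selection is outside the text")
    return text[:start] + marker + text[start:end] + marker + text[end:]


# Selecting "World" and pressing a hypothetical bold button:
print(wrap_selection("Hello World", 6, 11))
```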

In this instance I'm not advocating for markdown as a user interface but just using it as a quick and dirty markup language. Be aware that if you turn to HTML, you'd be adopting responsibility for a lot of non-trivial security issues. If the customization went beyond markdown (into, for instance, fonts) you'd need a more complex solution so you'd likely want to investigate other tag or boundary marker based markup languages out there. Markdown is just simple and has ten billion implementations out there.

[–] [email protected] 4 points 4 months ago* (last edited 4 months ago)

Probably not. Having actually played with making a WYSIWYG editor as a learning project, I found markdown too simplistic as a serialized storage format for the formatting needs of any non-trivial text editing.

You almost always end up back with your own data structure that you serialize into something like XML for storage. Or you end up supporting HTML or non-spec compliant syntax in your markdown.

And if you care about performance, you're not actually working with XML, HTML, or Markdown in memory. You're working with a data structure that you have to serialize/deserialize from your storage format. This is where markdown becomes a bit more tedious since it's not as easy to work with in this manner, and you end up with a weird parsing layer in-between the markdown and your runtime data structures.
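A minimal sketch of that split between runtime structure and storage format. It assumes a flat list of styled runs as the in-memory model and uses JSON instead of XML purely to keep the example short. The `Run` class and the function names are invented here and not taken from any real editor.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """A stretch of text that shares one set of styles."""
    text: str
    styles: frozenset = frozenset()


def serialize(runs) -> str:
    """Turn the in-memory runs into a string for a single database field."""
    return json.dumps([{"text": r.text, "styles": sorted(r.styles)} for r in runs])


def deserialize(payload: str):
    """Rebuild the in-memory runs from the stored string."""
    return [Run(item["text"], frozenset(item["styles"])) for item in json.loads(payload)]


def plain_text(runs) -> str:
    """Operations like this work on the structure, not on the markup."""
    return "".join(r.text for r in runs)


doc = [Run("Hello "), Run("World", frozenset({"bold"}))]
stored = serialize(doc)
print(stored)
print(deserialize(stored) == doc)
```

The editor only touches `Run` objects while the user types. The storage format matters only at load and save time, which is why a format that maps cleanly onto the structure is easier to live with.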

The commenter that's downvoted is more correct than not IMHO (Also why are we downvoting discussions??). Markdown is ill suited for "most WYSIWYG needs". It tends to get augmented with XML or custom non-spec compliant syntax. The spec poorly supports layout (columns, image & media positioning, sizing...etc) and styling (font color, size, family, backgrounds...etc)

[–] [email protected] 20 points 4 months ago* (last edited 4 months ago) (1 children)

There are markup languages for this purpose. And you store the rich text as normal text in that markup language. For the most part.

It's typically an XML or XML-like language, or bb-codes. MS Word for example uses XML to store the markup data for the rich text.

Simpler and more limited text needs tend to use markdown these days, like Lemmy, or most text fields on GitHub.

There's no need to include complex technology stacks into it!

Now the real hard part is the rendering engine for WYSIWYG. That's a nightmare.

[–] montar 1 points 4 months ago (1 children)

Markdown has one huge advantage: if you remember a bit of the syntax you can type it straight from your fingers, which is a great speedup for me. I personally prefer orgmode, but no one uses that in the 21st century.

[–] [email protected] 2 points 4 months ago

Yeah, but that's not what we're talking about here.

RTF has many more features than markdown can reasonably support, even with your personal, custom, syntaxes that no one else knows :/

I use markdown for everything, as much as possible, but in the context of creating a RTF WYSIWYG editor with non-trivial layout & styling needs it's a no go.

[–] [email protected] 18 points 4 months ago* (last edited 4 months ago)

All the bold text, italics, and headings would need to be saved in a database column to be retrieved later in their correct positions.

Nobody does that. People simply store HTML, Markdown or BB code. Check out TinyMCE, Milkdown, tui-editor, stackedit... all of them have a "see source" button and you'll see the text with the formatting code right there.

[–] [email protected] 12 points 4 months ago* (last edited 4 months ago) (2 children)

You'd save it to the database in the same field as the rest of the text. You don't store the positions or anything like that - you'd store the text with HTML and have the front end render it as expected.

For instance, the database could have the following text:

Hello <strong>World</strong>

And the front end just renders HTML.

Alternatively, you could store Markdown syntax if you're hesitant to allow HTML.

EDIT: as always, if you store raw HTML, don't forget to sanitize it.

[–] [email protected] 10 points 4 months ago

So long as you have robust data sanitization on the backend to prevent XSS and HTML injection attacks...

If you can get away with just using Markdown, you should definitely use that instead of full HTML.

[–] [email protected] 7 points 4 months ago (1 children)

Fuck me, I hope you don't just render whatever HTML the user gave you!

[–] [email protected] 5 points 4 months ago

Of course not lol. The CMS I usually use stores it as HTML in the database, so I have a go-to HTML sanitization plugin with a tag whitelist. I wish it used markdown or something similar under the hood instead, but it is what it is.
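For anyone wondering what a tag whitelist looks like in practice, here is a minimal sketch using only Python's standard library. The whitelist contents and the `sanitize` name are assumptions for the example, and this is not the CMS plugin mentioned above. For real user input a maintained sanitizer library is the safer choice.

```python
import html
from html.parser import HTMLParser

# Hypothetical whitelist; every attribute is dropped, even on allowed tags.
ALLOWED_TAGS = {"b", "strong", "i", "em", "p", "ul", "ol", "li"}


class _Sanitizer(HTMLParser):
    """Rebuilds the fragment, keeping only whitelisted tags and escaped text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.out.append(f"<{tag}>")  # attrs such as onclick are discarded

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(html.escape(data))  # text is always re-escaped


def sanitize(fragment: str) -> str:
    parser = _Sanitizer()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


print(sanitize('Hello <strong onclick="x()">World</strong>'))
```

Tags outside the whitelist are dropped but their text content is kept as inert, escaped text. Comments are dropped entirely.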

[–] [email protected] 8 points 4 months ago (1 children)

You mean like the comment fields we're using right here on lemmy?

As others have pointed out, it's usually some markdown that's embedded within the text. Lemmy is using a format that's actually called "markdown" if I'm not mistaken, or a slight variation/subset thereof.

I've gotten used to the double-star for bold and what not to the point that it annoys me when some message client or whatever doesn't support it. I share code snippets with people fairly often, and the code markdown is particularly useful to maintain its legibility.

[–] [email protected] 9 points 4 months ago* (last edited 4 months ago)

If you're looking for the general word, it's "markup". See also Hypertext Markup Language. But yes, Lemmy uses Markdown specifically.

And yeah, at this point Markdown is just the standard for rich text. I think it's a pretty solid subset of functionality to use everywhere.

[–] [email protected] 8 points 4 months ago

Yes, it is a huge pain, especially if you want to have round-trip interoperability with humans using markup. Wikipedia had a major challenge with this when they decided to add a rich text editor alongside wiki markup.

[–] [email protected] 8 points 4 months ago (1 children)

I came across something like that in a proprietary "epub" format. Not because of formatting/styling, but because of cross-referencing and footnotes, it stored every word in a database with its position.

[–] [email protected] 3 points 4 months ago

Gosh, if you're able to share that I'd love to see that train wreck.

[–] [email protected] 8 points 4 months ago

Just make a normal text box and enter a wallet address containing a bitcoin. Tadaaa

[–] [email protected] 6 points 4 months ago

*Vietnam flashback from dealing with RTF*

[–] [email protected] 4 points 4 months ago

Now also write the font rendering engine and you're all set!

[–] [email protected] 2 points 4 months ago

You can always take a look at how, for example, Windows 3.11 and earlier did it for their *.rtf file format and their "write.exe" editor / viewer / renderer (if you want to call it that).