It is 2025 and it has been 0 days since I wasted way too much time due to f"¿Quéucked up character encoding.
30 years in, UTF-8 remains a mystery.
Every time I read about how we encode characters as bit-patterns on computers, I feel impressed all over again.
#UTF8 #Ethiopia
https://www.unicode.org/charts/PDF/U1200.pdf
We're in 2025, and #utf8 is still hard for some people...
Turns out #sqlite3 does not have a built-in function to validate #utf8 - it just does GIGO (garbage in, garbage out): https://sqlite.org/invalidutf.html
So, one quick hack of a loadable utf8 validation extension later, I ran `select * from oc_filecache where isutf8(name) is null` and there was exactly one (!!!) bad name among the 3.5M entries (it was an external file).
`delete from oc_filecache where fileid=287791` and I'm ready to go again. I set `cpupower frequency-set -g performance` and now there are just 20 min left.
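For anyone who wants to try the same check without writing a loadable extension, here is a rough sketch using Python's built-in `sqlite3` module. The `isutf8` function below is only a stand-in for the extension mentioned above, not its actual code. It assumes the same convention as the query in the post (NULL means "not valid UTF-8"), and `find_invalid_names` is a hypothetical helper name.

```python
import sqlite3


def isutf8(raw):
    """Return 1 if raw is strictly valid UTF-8, else None (SQL NULL)."""
    if raw is None:
        return None
    if not isinstance(raw, bytes):
        # str, int, float: already decoded by the driver, or not text at all.
        return 1
    try:
        raw.decode("utf-8")  # strict error handling is the default
    except UnicodeDecodeError:
        return None
    return 1


def find_invalid_names(conn):
    """List fileids in oc_filecache whose name is not valid UTF-8."""
    conn.create_function("isutf8", 1, isutf8)
    # CAST(... AS BLOB) hands the raw bytes to Python, so the driver never
    # tries to decode a broken TEXT value itself.
    rows = conn.execute(
        "SELECT fileid FROM oc_filecache "
        "WHERE name IS NOT NULL AND isutf8(CAST(name AS BLOB)) IS NULL"
    )
    return [fileid for (fileid,) in rows]
```

Selecting only `fileid` is deliberate: fetching the broken `name` column as text would make Python try to decode it.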
Jameson type body.
Edit: /etc/nginx/mime.types
Add: "text/markdown; charset=utf-8" md;
(note the quote marks around the string)
Don't forget: service nginx reload
In case anyone else is left wondering how to have #nginx serve markdown .md files in utf-8, instead of basic ascii.
AH! Dovecot finally has SMTPUTF8 support (internationalized email addresses) if you compile it with the right option (--enable-experimental-mail-utf8).
That is the case in Debian (it is probably Dovecot's default).
The option ("mail_utf8_extensions") is then not enabled by default.
Let's turn it on, and unusable katakana email addresses are ours \o/
The grumpy serialisation format
Debugging #Perl #UTF8 #JSON #API issues on an Easter Sunday morning. Why do I do this to myself? #programming
A reusavolutionary "Digital Display Platform".
#Unicode is one of those little things in life that I can't help but smile about.
Is it perfect? No, of course not. Is it better than the alternative? Yes, so much so that every time I'm confronted with a long list of character encodings I can choose from, I feel a sense of relief when I find #UTF8 among them.
I wouldn't have thought it possible to standardize a single character encoding for everyone, and yet, somehow, there is just such a standard.
I decided on Codeberg. I still hate UTF8.
(why couldn't there be a size prefix?)
https://codeberg.org/Loganer/Sauce/src/branch/Base/src/Sauce/Function/UTF8/ToPoint.c
#Programming #UTF8 That was kind of annoying to implement, but I can now store a UTF8 string and print it out.
...at least a subset of utf8, I don't have checks for all the possible utf8 characters yet.
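For reference, the checks a full decoder needs could look roughly like this. It is a minimal Python sketch, not the code from the linked repository, and `decode_code_point` and `is_valid_utf8` are made-up names. The lead byte does act as a small size prefix: its high bits say how many bytes the sequence has. The remaining checks reject stray continuation bytes, overlong forms, surrogates, and anything above U+10FFFF.

```python
def decode_code_point(data, i=0):
    """Decode one code point starting at data[i].

    Returns (code_point, byte_length); raises ValueError on invalid input.
    """
    lead = data[i]
    if lead < 0x80:                      # 0xxxxxxx: plain ASCII
        return lead, 1
    if 0xC2 <= lead <= 0xDF:             # 110xxxxx (0xC0/0xC1 are always overlong)
        length, cp, minimum = 2, lead & 0x1F, 0x80
    elif 0xE0 <= lead <= 0xEF:           # 1110xxxx
        length, cp, minimum = 3, lead & 0x0F, 0x800
    elif 0xF0 <= lead <= 0xF4:           # 11110xxx (0xF5 and up exceed U+10FFFF)
        length, cp, minimum = 4, lead & 0x07, 0x10000
    else:
        raise ValueError(f"invalid lead byte 0x{lead:02X} at offset {i}")
    if i + length > len(data):
        raise ValueError(f"truncated sequence at offset {i}")
    for b in data[i + 1:i + length]:
        if b & 0xC0 != 0x80:             # continuation bytes look like 10xxxxxx
            raise ValueError(f"bad continuation byte 0x{b:02X}")
        cp = (cp << 6) | (b & 0x3F)
    if cp < minimum:
        raise ValueError(f"overlong encoding at offset {i}")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"surrogate U+{cp:04X} at offset {i}")
    if cp > 0x10FFFF:
        raise ValueError(f"code point out of range at offset {i}")
    return cp, length


def is_valid_utf8(data):
    """Walk the buffer one code point at a time; False on the first error."""
    i = 0
    while i < len(data):
        try:
            _, length = decode_code_point(data, i)
        except ValueError:
            return False
        i += length
    return True
```

This is the simple byte-at-a-time approach. It makes no attempt at the speed tricks that faster validators use.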
How quickly can you check that a string is valid unicode (UTF-8)? — https://lemire.me/blog/2018/05/09/how-quickly-can-you-check-that-a-string-is-valid-unicode-utf-8/ #utf8 #programming