techhub.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A hub primarily for passionate technologists, but everyone is welcome


#tts


Spotify isn't working with GrapheneOS.

There seem to be fixes, but I think I'm just going to use this as an excuse to cancel Spotify and use other things.

Anyone have a recommendation for #foss #TTS #ereaders ?

If I can't get my audiobooks reliably from Spotify, then I think a decent TTS engine with EPUB support would solve the problem. I was using one from Google Play, but it wasn't FOSS. Pretty nice, though.

I wrote this blueprint for a web app that would make it easier for people to build voices and languages for different TTS engines. It's vague, but it's a start if anyone wants to contribute to it or eventually create the real thing. Boosts appreciated, as always. github.com/lower-elements/Voic #TTS #Accessibility #AI #ML

GitHub - lower-elements/Voice-Creator-Studio: An easy way to create voices for any TTS engine.

Synthetic Poetry

shkspr.mobi/blog/2021/07/synth

I've been experimenting with Amazon's Polly service. It's their fancy text-to-sort-of-human-style-speech system. Think "Alexa" but with a variety of voices, genders, and accents.

Here's "Brian" - their English, male, received pronunciation voice - reading John Betjeman's poem "Slough":

https://shkspr.mobi/blog/wp-content/uploads/2021/07/slough.mp4

The pronunciation of all the words is incredibly lifelike. If you heard it on the radio, it might sound like a half-familiar BBC presenter. It has a calm, even tone which suits the poem splendidly.

The rhythm is also spot on. That's mostly a function of the short lines and helpful punctuation the poem contains. Much like iambic pentameter, or a limerick, the syllables lend themselves to a specific and identifiable cadence.

But the emphasis is all wrong. The poem just... ends. There's no sense of finality in the tone. You'd expect a competent reader to recognise "tinned minds" as being worthy of stressing. Polly does have some capability to mark specific words for emphasis, but it's all very manual.

There's no synthetic emotion. Do you feel the rage, desperation, sadness, hopelessness of the poem? While Polly has some SSML (Speech Synthesis Markup Language) support, the range of emotions it can express is severely limited. And, again, they must be applied manually.
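To give a feel for how manual that marking is, here is a minimal sketch that wraps a chosen phrase in the standard SSML `<emphasis>` element. The helper names (`escapeXml`, `emphasise`) and the sample sentence are hypothetical, and whether a given Polly voice honours `<emphasis>` is an assumption to check against Polly's list of supported tags.

```javascript
// Escape characters that are special in XML so plain text is safe inside SSML.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: wrap every occurrence of `phrase` in an <emphasis> tag
// and return a complete SSML document.
function emphasise(line, phrase, level = "strong") {
  const escapedLine = escapeXml(line);
  if (!phrase) return `<speak>${escapedLine}</speak>`;
  const escapedPhrase = escapeXml(phrase);
  const marked = escapedLine
    .split(escapedPhrase)
    .join(`<emphasis level="${level}">${escapedPhrase}</emphasis>`);
  return `<speak>${marked}</speak>`;
}

// Placeholder sentence, not a quotation from the poem.
const ssml = emphasise("A line that mentions tinned minds.", "tinned minds");
console.log(ssml);
```

Every stressed phrase still has to be chosen by a human, which is exactly the limitation described above.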

"I used to be an adventurer like you, but then i took an arrow in the knee!"

One of the reasons stock phrases pop up so often in video games is that it is expensive to write and record thousands of different lines of dialogue.

We're almost at a stage where a computer can procedurally generate lines for background characters to speak, and then "record" an audio version in an array of styles. No more expensive voice actors, no more memetic references for in-group homophily. Each player of a game will have a completely different dialogue experience.

But the bit that we're still missing is the automation of emphasis and emotion and comic timing and understatement and... all the things which trained actors spend years learning how to do successfully.

The film critic Roger Ebert lost his voice to surgery in 2006. In 2011, he proposed the following "Ebert Test" for synthetic voices:

If the computer can successfully tell a joke, and do the timing and delivery, as well as Henny Youngman, then that’s the voice I want.

We're so close, I can taste it. The Turing Test for realistic voices is whether they can move the audience to tears with poetry.

#AI #Amazon #tts

1KB JS Numbers Station

shkspr.mobi/blog/2025/07/1kb-j

Code Golf is the art/science of creating wonderful little demos in an artificially constrained environment. This year the js1024 competition was looking for entries with the theme of "Creepy".

I am not a serious bit-twiddler. I can't create JS shaders which produce intricate 3D worlds in a scrap of code. But I can use slightly obscure JavaScript APIs!

There's something deliciously creepy about Numbers Stations - the weird radio frequencies which broadcast seemingly random numbers and words. Are they spies communicating? Commands for nuclear missiles? Long range radio propagation tests? Who knows!

So I decided to build one. Play with the demo.

Obviously, even the most extreme opus compression can't fit much audio into 1KB. Luckily, JavaScript has you covered! Most modern browsers have a built-in Text-To-Speech (TTS) API.

Here's the most basic example:

    m = new SpeechSynthesisUtterance;
    m.text = "Hello";
    speechSynthesis.speak(m);

Run that JS and your computer will speak to you!

In order to make it creepy, I played about with the rate (how fast or slow it speaks) and the pitch (how high or low).

    m.rate=Math.random();
    m.pitch=Math.random()*2;
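A slightly less golfed sketch of the same idea follows. The Web Speech API documents `rate` as ranging from 0.1 to 10 and `pitch` from 0 to 2, so this version keeps the random rate at or above 0.1. The helper names `randomBetween` and `creepyProsody` are made up for illustration, and the browser-only part is guarded so the snippet also loads outside a browser.

```javascript
// Map a random number in [0, 1) onto the range [min, max).
function randomBetween(min, max, rng = Math.random) {
  return min + rng() * (max - min);
}

// Hypothetical helper: slow-to-normal speed, any pitch the API allows.
function creepyProsody(rng = Math.random) {
  return {
    rate: randomBetween(0.1, 1, rng), // 0.1 is the documented minimum rate
    pitch: randomBetween(0, 2, rng),  // 0 to 2 is the documented pitch range
  };
}

// Browser only: apply the random settings to an utterance and speak it.
if (typeof SpeechSynthesisUtterance !== "undefined") {
  const m = new SpeechSynthesisUtterance("Hello");
  Object.assign(m, creepyProsody());
  speechSynthesis.speak(m);
}
```

Passing the random source in as a parameter is only there to make the helper easy to test; it is not needed for the 1KB version.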

It worked disturbingly well! High pitched drawls, rumbling gabbling, the languid cadence of a chattering friend. All rather creepy.

But what could I make it say? Getting it to read out numbers is pretty easy - this will generate a random integer:

s = Math.ceil( Math.random()*1000 );
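One small caveat: `Math.ceil(Math.random()*1000)` yields 0 in the rare case that `Math.random()` returns exactly 0. If a strict range of 1 to 1000 matters, a floor-based variant avoids that. The function name below is hypothetical, and it costs more bytes than the golfed original.

```javascript
// Return an integer from 1 to 1000 inclusive.
// Math.random() is in [0, 1), so floor(...) is 0..999 and adding 1 gives 1..1000.
function randomStationNumber(rng = Math.random) {
  return 1 + Math.floor(rng() * 1000);
}

console.log(randomStationNumber());
```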

But a list of words would be tricky. There's not much space in 1,024 bytes for anything complex. The rules say I can't use any external resources; so are there any internal sources of words? Yes!

Object.getOwnPropertyNames( globalThis );

That gets all the properties of the global object which are available to the browser! Depending on your browser, that's over 1,000 words!

But there's a slight problem. Many of them are quite "computery" words like "ReferenceError", "URIError", "Float16Array". I wanted all the single words - that is, anything which only has one capital letter and that's at the start.

    const l = (n) => {
        return ((n.match(/[A-Z]/g) || []).length === 1 && (n.charAt(0).match(/[A-Z]/g) || []).length === 1);
    };
    //   Get a random result from the filter
    s = Object.getOwnPropertyNames( globalThis ).filter( l ).sort( ()=>.5-Math.random() )[0]

Rather pleasingly, that brings back creepy words like "Event", "Atomics", and "Geolocation".
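For readers who are not golfing, the same filter can be sketched with a single regular expression, and the random choice made by index rather than by sorting with a random comparator (which shuffles unevenly and does more work than needed). The names `isSingleWord` and `pickRandom` are illustrative, and this regex is a slightly stricter variant: it also rejects names containing digits or underscores.

```javascript
// One capital letter at the start, followed only by lowercase letters.
const isSingleWord = (name) => /^[A-Z][a-z]+$/.test(name);

// Pick one element uniformly at random.
function pickRandom(items, rng = Math.random) {
  return items[Math.floor(rng() * items.length)];
}

// The available names depend on the runtime; a browser exposes far more than Node.
const words = Object.getOwnPropertyNames(globalThis).filter(isSingleWord);
const word = pickRandom(words);
console.log(words.length, word);
```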

Of course, Numbers Stations don't just broadcast in English. The TTS system can vocalise in multiple languages.

    //   Set the language to Russian
    m.lang = "ru-RU";

OK, but where do we get all those language strings from? Again, they're built in and can be retrieved randomly.

    var e = window.speechSynthesis.getVoices();
    //   Each entry is a SpeechSynthesisVoice; its .lang property holds the language tag
    m.lang = e[ (Math.random()*e.length) |0 ].lang
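Two details are worth spelling out. Each entry returned by `getVoices()` is a `SpeechSynthesisVoice` object whose `lang` property holds the language tag, and in some browsers the list is empty until the `voiceschanged` event has fired. A minimal sketch that handles both, with the hypothetical helper name `randomVoiceLang`:

```javascript
// Return the language tag of a randomly chosen voice,
// or undefined if no voices have been loaded yet.
function randomVoiceLang(voices, rng = Math.random) {
  if (voices.length === 0) return undefined;
  return voices[Math.floor(rng() * voices.length)].lang;
}

// Browser only: wait for the voice list, then log a random language tag.
// Assumption: the browser fires "voiceschanged"; support varies between engines.
if (typeof speechSynthesis !== "undefined") {
  speechSynthesis.addEventListener("voiceschanged", () => {
    const lang = randomVoiceLang(speechSynthesis.getVoices());
    console.log("Random language tag:", lang);
  });
}
```

The returned string can be assigned directly to `m.lang`.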

If you pass the TTS the number 555 and ask it to speak German, it will read out fünfhundertfünfundfünfzig.

And, if you tell the TTS to speak an English word like "Worker" in a foreign language, it will pronounce it with an accent.

Randomly altering the pitch, speed, and voice to read out numbers and dissociated words produces, I think, a rather creepy effect.

If you want to test it out, you can press this button. I find that it works best in browsers with a good TTS engine - let me know how it sounds on your machine.

🅝🅤🅜🅑🅔🅡🅢 🅢🅣🅐🅣🅘🅞🅝

With the remaining few bytes at my disposal, I produced a quick-and-dirty random pattern using Unicode drawing blocks. It isn't very sophisticated, but it does have a little random animation to it.

You can play with all the js1024 entries - I would be delighted if you voted for mine.



Let's talk about TTS. I used to ignore it almost entirely, but lately I see it as a useful tool for writing certain things better, such as the use of accent marks.

There's a Google app that comes on practically every phone (like almost all of them) that does this. My question is: what do people without Google do? It's a bit complicated, at least for me, since I don't like having two apps when a single one could do the job perfectly. To avoid that I use this website, but for those who want something more convenient, here's the tip.

Here's the situation: most alternatives don't include the feature that reads words aloud, even though in theory they are TTS; they only provide the voice. So if you want that feature, there's a separate app...

The one that provides the voice: RHVoice (recommended)

To read the words aloud, with help from the one that provides the voice: TTS Util

Yes, it would be more convenient if RHVoice at least had that feature built in, but hey, it's something...

www.text-to-speech.online: Free Text to Speech Online Converter Tools

Here's a quick demo on how to enable TTS on the Nintendo Switch 2 from the home screen. Hopefully these menus are the same across all devices, though I have no way to know that for certain.

Edit: For other blind Switch/Switch 2 owners, I started a WhatsApp group to discuss the accessibility of the console and its games. DM if you'd like to join.

Download: onj.me/media/Switch2_Accessibi
#Nintendo #Switch2 #Accessibility #TTS #ScreenReader


@thelinuxEXP I really like Speech Note! It's a fantastic tool for quick and local voice transcription in multiple languages, created by @mkiol

It's incredibly handy for capturing thoughts on the go, conducting interviews, or making voice memos without worrying about language barriers. The app uses strictly locally running LLMs, and its ease of use makes it a standout choice for anyone needing offline transcription services.

I primarily use #WhisperAI for transcription and Piper for voice, but many other models are available as well.

It is available as a Flatpak and on GitHub: github.com/mkiol/dsnote

#TTS #transcription #TextToSpeech #translator translation #offline #machinetranslation #sailfishos #SpeechSynthesis #SpeechRecognition #speechtotext #nmt #linux-desktop #stt #asr #flatpak-applications #SpeechNote

Totally underrated:

#SpeechNote is a privacy-friendly Linux app that converts speech to text (#STT), reads text aloud, including files (#TTS), and translates, all locally without an internet connection.
Many languages and open-source models are available to plug in!


A slightly more serious test.

--

Command used

echo "Enfin, que dis-je, enfin, finalement, une synthèse vocale avec une voix française qui prononce les mots de manière intelligible ! Ça change tellement des voix sans prosodie !" | ./piper --model voices/fr_FR-upmc-medium.onnx --output-wav synt.wav

If we're going to use an AI anyway, we might as well use GLaDOS's voice.

Finally, a speech synthesis (TTS) engine with a French voice that works and is intelligible.

--

In other words, just an excuse to test piper-tts

--

Command used

echo "Quitte à utiliser une IA ; autant utiliser la voix de Gla DOSSE." | ./piper --model voices/fr_FR-glados-medium.onnx --output-raw | aplay -r 22050 -f S16_LE -t raw -D pipewire -