InfoQ

Running LLMs requires significant computational power, which scales with model size and context length.

Ye (Charlotte) Qi shares strategies to fit models across hardware types, plus techniques to optimize inference latency & throughput.

🎥 Watch the video: bit.ly/3FCugyK

📄 Full transcript included