#ner

🔠 Panel: More than Chatbots: Multimodal Large Language Models in Humanities Workflows

At #DHd2025, Nina Rastinger explores how well #AI handles abbreviations & NER:

✅ NER works well, even with small, low-cost models
❌ Abbreviations are tricky: costs & resource demands skyrocket
🚀 GPT o1 improves performance, even on abbreviations, but remains resource-intensive
Balancing accuracy & efficiency in text processing remains a challenge! ⚖️

🥁 We are happy to announce that we just published our first preprint on arXiv: "NER4all or Context is All You Need: Using LLMs for low-effort, high-performance NER on historical texts. A humanities informed approach".🎉

👉 arxiv.org/abs/2502.04351 👈

It is also our first foray into collaborative work with such a large number of collaborators & contributors from the Chair of Digital History, NFDI4Memory's Methods Innovation Lab, & AI-Skills.

arXiv.org: NER4all or Context is All You Need: Using LLMs for low-effort, high-performance NER on historical texts. A humanities informed approach

Named entity recognition (NER) is a core task for historical research in automatically establishing all references to people, places, events and the like. Yet, due to the high linguistic and genre diversity of sources, the limited canonisation of spellings, the level of historical domain knowledge required, and the scarcity of annotated training data, established approaches to natural language processing (NLP) have been both extremely expensive and have yielded unsatisfactory results in terms of recall and precision. Our paper introduces a new approach. We demonstrate how readily available, state-of-the-art LLMs significantly outperform two leading NLP frameworks, spaCy and flair, for NER in historical documents, with seven to twenty-two percent higher F1 scores. Our ablation study shows how providing historical context for the task and a bit of persona modelling that shifts the focus away from a purely linguistic approach are core to a successful prompting strategy. We also demonstrate that, contrary to our expectations, providing increasing numbers of examples in few-shot approaches does not improve recall or precision below a threshold of 16 shots. In consequence, our approach democratises access to NER for all historians by removing the barrier of scripting languages and computational skills required for established NLP tools, instead leveraging natural-language prompts and consumer-grade tools and frontends.
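The prompting strategy the abstract describes, providing historical context plus a persona rather than a purely linguistic task framing, can be sketched as a simple prompt builder. The persona wording, the context wording, and the inline entity tags below are illustrative assumptions, not the paper's actual prompts:

```python
# Sketch of a context- and persona-based NER prompt for an LLM.
# The phrasing and the inline-tag output format are assumptions made
# for illustration; the paper's exact prompts may differ.

def build_ner_prompt(text: str, year: int, genre: str) -> str:
    # Persona modelling: address the model as a domain expert.
    persona = (
        "You are a historian with deep knowledge of "
        f"{genre}s from around {year}."
    )
    # Historical context: tell the model what kind of source this is.
    context = (
        f"The following passage comes from a {genre} published in {year}. "
        "Spellings may be non-standardised."
    )
    # Task framing: ask for inline entity tags instead of linguistic labels.
    task = (
        "Mark every named entity inline using [PER: ...], [LOC: ...] and "
        "[ORG: ...] tags and return the annotated passage only."
    )
    return "\n\n".join([persona, context, task, text])

prompt = build_ner_prompt(
    "Von Basel fährt man über Freiburg nach Karlsruhe.",
    year=1921,
    genre="travel guide",
)
print(prompt)
```

The resulting string would then be sent to any chat-style LLM endpoint; the point of the sketch is only that context and persona live in the prompt, so no training data or scripting pipeline is needed.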

The ReadMe2KG: GitHub README to Knowledge Graph #Challenge has been published as part of the Natural Scientific Language Processing and Research Knowledge Graphs #NSLP2025 workshop, co-located with #eswc2025. This #NER task aims to complement the NFDI4DataScience KG via information extraction from GitHub README files.

task description: nfdi4ds.github.io/nslp2025/doc
website: codabench.org/competitions/539

@eswc_conf @GenAsefa @shufan @NFDI4DS #NFDIrocks #knowledgegraphs #semanticweb #nlp #informationextraction

As part of a project, we digitised the papers of Joseph von #Laßberg, produced full texts with #eScriptorium, and also ran #NER with spaCy (published as research data) and Google's NL. An exciting project; we often found that the open-source alternatives are not there yet and require many conversion steps. Still, it was completed successfully and is now publicly available.

digital.blb-karlsruhe.de/lassb

digital.blb-karlsruhe.de: Joseph von Laßberg [1770–1855]
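Publishing NER output "as research data", as in the project above, usually means serialising entity spans in a tool-independent format rather than keeping them locked inside one tool. A minimal sketch of standoff annotations as JSON; the sentence and the spans are made-up examples, not taken from the Laßberg corpus:

```python
import json

# Standoff annotation: each entity is recorded as character offsets into
# the source text plus a label, so the annotations survive independently
# of the NER tool (spaCy, a cloud API, ...) that produced them.
text = "Joseph von Lassberg lebte lange in Meersburg."
entities = [
    {"start": 0, "end": 19, "label": "PER"},
    {"start": 35, "end": 44, "label": "LOC"},
]

# Sanity check: the offsets must reproduce the surface strings.
for ent in entities:
    ent["surface"] = text[ent["start"]:ent["end"]]

record = {"text": text, "entities": entities}
print(json.dumps(record, ensure_ascii=False, indent=2))
```

Storing the surface string alongside the offsets makes corrupted or shifted offsets easy to detect later, which matters when OCR full texts are re-exported.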

Named Entity Recognition is a computer-assisted method for detecting and classifying proper names in texts. Historical texts pose particular challenges for NER, for example due to non-standardised spellings.

Selina Galka has tried to train her own #NER models for the memoirs of the Countess of Schwerin. The results are mixed:

memoiren.hypotheses.org/609

#NER, but prompto! 🤖

At tomorrow's #DigitalHistoryOFK, Torsten Hiltmann, Martin Dröge & Nicole Dresselhaus (HU Berlin, #4Memory) will use the 1921 Baedeker travel guide to demonstrate the potential of #LargeLanguageModels & prompt-based approaches for #NamedEntityRecognition in historical text sources.

Open to everyone!

🔜 When? Wed, 26.06., 4–6 pm, Zoom
ℹ️ Abstract: dhistory.hypotheses.org/7870
____
#DigitalHistory #promptoNER #LLM #genAI @nfdi4memory @histodons