Evaluating Word Expansion for Multilingual Sentiment Analysis of Parliamentary Speech

Publication: Contribution to book/anthology/report › Conference contribution in proceedings › Research › peer-reviewed

Standard

Evaluating Word Expansion for Multilingual Sentiment Analysis of Parliamentary Speech. / Nikolova, Yana; Navarretta, Costanza.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). ACL Anthology: European Language Resources Association, 2024. pp. 6557–6563.


Harvard

Nikolova, Y & Navarretta, C 2024, Evaluating Word Expansion for Multilingual Sentiment Analysis of Parliamentary Speech. in Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). European Language Resources Association, ACL Anthology, pp. 6557–6563. <https://aclanthology.org/2024.lrec-main.581.pdf>

APA

Nikolova, Y., & Navarretta, C. (2024). Evaluating Word Expansion for Multilingual Sentiment Analysis of Parliamentary Speech. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024) (pp. 6557–6563). European Language Resources Association. https://aclanthology.org/2024.lrec-main.581.pdf

Vancouver

Nikolova Y, Navarretta C. Evaluating Word Expansion for Multilingual Sentiment Analysis of Parliamentary Speech. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). ACL Anthology: European Language Resources Association. 2024. p. 6557–6563.

Author

Nikolova, Yana; Navarretta, Costanza. / Evaluating Word Expansion for Multilingual Sentiment Analysis of Parliamentary Speech. Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024). ACL Anthology: European Language Resources Association, 2024. pp. 6557–6563.

BibTeX

@inproceedings{47049f9df1c847898ed4cae74c87922c,
  title = "Evaluating Word Expansion for Multilingual Sentiment Analysis of Parliamentary Speech",
  abstract = "This paper replicates and evaluates the word expansion (WE) method for sentiment lexicon generation from Rheault et al. (2016), applying it to two novel corpora of parliamentary speech from Denmark and Bulgaria. GloVe embeddings and vector similarity are leveraged to expand synonym seed lists with domain-specific terms from the speech corpora. The resulting Danish and Bulgarian lexica are compared to other multilingual lexica by analyzing a gold standard of speech excerpts annotated for sentiment. WE correlates best with hand-coded annotations for Danish, while a machine-translated Lexicoder dictionary does best for Bulgarian. WE performance is also found to be very sensitive to processing and scoring techniques, though this is also an issue with the other lexica. Overall, automatic lexicon translation best balances computational complexity and accuracy across both languages, but robust language-agnosticism remains elusive. Theoretical and practical problems of WE are discussed.",
  author = "Yana Nikolova and Costanza Navarretta",
  year = "2024",
  language = "English",
  pages = "6557--6563",
  booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
  publisher = "European Language Resources Association",
}

RIS

TY  - GEN
T1  - Evaluating Word Expansion for Multilingual Sentiment Analysis of Parliamentary Speech
AU  - Nikolova, Yana
AU  - Navarretta, Costanza
PY  - 2024
Y1  - 2024
N2  - This paper replicates and evaluates the word expansion (WE) method for sentiment lexicon generation from Rheault et al. (2016), applying it to two novel corpora of parliamentary speech from Denmark and Bulgaria. GloVe embeddings and vector similarity are leveraged to expand synonym seed lists with domain-specific terms from the speech corpora. The resulting Danish and Bulgarian lexica are compared to other multilingual lexica by analyzing a gold standard of speech excerpts annotated for sentiment. WE correlates best with hand-coded annotations for Danish, while a machine-translated Lexicoder dictionary does best for Bulgarian. WE performance is also found to be very sensitive to processing and scoring techniques, though this is also an issue with the other lexica. Overall, automatic lexicon translation best balances computational complexity and accuracy across both languages, but robust language-agnosticism remains elusive. Theoretical and practical problems of WE are discussed.
M3  - Article in proceedings
SP  - 6557
EP  - 6563
BT  - Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
PB  - European Language Resources Association
CY  - ACL Anthology
ER  -

ID: 392922170
