Domain Adaptation for Text Classification with Weird Embeddings
p. 37-43
Abstract
Pre-trained word embeddings are often used to initialize deep learning models for text classification, as a way to inject precomputed lexical knowledge and boost the learning process. However, such embeddings are usually trained on generic corpora, while text classification tasks are often domain-specific. We propose a fully automated method to adapt pre-trained word embeddings to any given classification task, which requires no additional resources beyond the original training set. The method is based on the concept of word weirdness, extended to score the words in the training set according to how characteristic they are with respect to the labels of a text classification dataset. The polarized weirdness scores are then used to update the word embeddings to reflect task-specific semantic shifts. Our experiments show that this method improves performance on several text classification tasks in different languages.
1. Introduction
In recent years, the Natural Language Processing community has directed a great deal of effort towards text classification, in its various forms. The list of shared tasks proposed at the recent editions (2016–2019) of the International Workshop on Semantic Evaluation (SemEval) shows an increasing number of tasks that can be cast as text classification problems: given a text and a set of labels, choose the correct label to associate with the text. If the cardinality of the set of labels is two, we speak of binary classification, as opposed to multiclass classification. Furthermore, not all binary classification tasks are the same: when the labels indicate the presence or absence of a given phenomenon, we speak of a detection task.
Classification tasks are mainly approached in a supervised fashion, where a labeled dataset is employed to train a classifier to map certain features of the input text to the probability of a certain label. Arguably, the most useful features in an NLP problem are the words that compose the text. However, in order to be processed by a machine learning algorithm, words need to be represented in a dense, machine-readable format. Word embeddings address this issue by providing vectorial representations of words, where vectors that are close in the geometric space represent words that often occur in the same contexts. Among their applications, pre-trained word embeddings are a powerful source of knowledge to boost the performance of supervised models that learn from textual instances.
Several deep learning models compute word embeddings at training time. However, they can be initialized with pre-trained word embeddings, typically computed on the basis of word co-occurrences in large corpora. This kind of initialization not only boosts the training of the model, but it also represents a way of injecting precomputed world knowledge into a model otherwise trained on a (sometimes very specific) dataset.
An issue with word embedding models, including recent contextual embeddings such as ELMo (Peters et al. 2018), is that they are typically trained on general-purpose corpora. Therefore, they may fail to capture semantic shifts that occur in specific domains. For instance, in a dataset of online hate speech, negatively charged words such as insults often co-occur with words that would normally be considered neutral, but that carry a negative signal in that particular context. More concretely, in a dataset of hate speech towards immigrants in the post-Trump U.S., an otherwise neutral word such as wall carries a definite negative connotation.
In this work, we try to capture this intuition computationally and model this phenomenon in a word embedding space. We employ an automatic measure to score words in a labeled corpus according to their association with a given label (Section 3.1) and use this score in a fully automated method to adapt generic pre-trained word embeddings (Section 3.2). We test our method on existing benchmarks for hate speech detection (Section 4.1) and gender prediction (Section 4.2), reporting improvements in precision and recall.
2. Related Work
Kameswara Sarma et al. (2018) propose a method to adapt generic word embeddings by computing domain-specific word embeddings on a corpus of text from the target domain and aligning the two vector spaces, obtaining a performance boost on sentiment classification. Another recent approach is based on projecting the vector representations from two domain-specific spaces into a joint word embedding model (Barnes, Klinger, and Schulte im Walde 2018b), building on a similar method applied to cross-lingual word embedding projection (Barnes, Klinger, and Schulte im Walde 2018a). With respect to these works, the approach proposed in this paper is significantly more lightweight, acting directly on a generic word embedding model without the need to train a domain-specific one.
The word-level measure introduced in the next section is reminiscent of similar metrics from Information Theory, e.g., Information Content (Pedersen 2010), and of measures of frequency distribution similarity such as the Kullback-Leibler divergence (Kullback and Leibler 1951). However, in this paper we aim to keep the complexity of the computation low, in order to manually explore its effect on the word embeddings.
In the domain of hate speech, several approaches mix word embeddings and supervised learning with domain-specific lexicons (e.g., dictionaries of hateful terms), as highlighted by the descriptions of participant systems at recent evaluation campaigns (Fersini, Rosso, and Anzovino 2018; Bosco et al. 2018). These methods are computationally inexpensive, but they require curated resources that are not always available for under-represented languages.
3. Weirdness-based Embedding Adaptation
In this section, we present our method for the automatic domain adaptation of pre-trained word embeddings. The input to the procedure is a set of pre-trained word embeddings and a corpus of texts paired with labels.
3.1 Polarized Weirdness
The Weirdness index was introduced by Ahmad, Gillam, and Tostevin (1999) as an automatic metric to retrieve words characteristic of a special language with respect to their typical usage. According to this metric, a word is highly weird in a specific collection of documents if it occurs significantly more often in that context than in a general corpus. In practice, given a specialist text corpus and a general text corpus, the weirdness index of a word is the ratio of its relative frequencies in the respective corpora. Calling w_s the frequency of the word w in the specialist language corpus, w_g its frequency in the general language corpus, and t_s and t_g the total word counts of the specialist and general language corpora respectively, the weirdness index of w is computed as:

Weirdness(w) = (w_s / t_s) / (w_g / t_g)
The weirdness index is used to retrieve words that are highly typical of a particular domain. For instance, in Ahmad, Gillam, and Tostevin (1999), words such as dollar, government and market are extracted from the TREC-8 corpus, a collection of documents from the governmental and financial domains, by comparing their frequencies to those in the general-domain British National Corpus.
In this work, we propose a new application of the weirdness index to the task of text classification. Rather than comparing the frequencies of words from corpora of different domains, we compute the weirdness index based on the frequency of words occurring in labeled datasets. The mechanism is straightforward: instead of comparing the relative frequencies of a word in a special language corpus against a general language corpus, we compare the relative frequencies of a word as it occurs in the subset of a labeled dataset identified by one value of the label against its complement. Consider a labeled corpus of pairs (e_i, l_i), where e_i is an instance of text (e.g., an online comment) and l_i is the label associated with e_i, belonging to a fixed set L (e.g., {positive, negative}).
The polarized weirdness (Florio et al. 2020) of w with respect to a specific label l ∈ L is the ratio of the relative frequency of w in the subset of instances labeled l over its relative frequency in the complementary subset:

PW_l(w) = (w_l / t_l) / (w_¬l / t_¬l)

where w_l and t_l are the frequency of w and the total word count in the instances labeled l, and w_¬l and t_¬l are the corresponding counts in the remaining instances.
Here is an example of how polarized weirdness is computed. Consider a corpus of 100 instances, 50 of which labeled positive and 50 labeled negative. The total number of words in instances labeled positive is 3,000, while the total number of words in instances labeled negative is 2,000. The word good occurs 50 times in positive instances and 5 times in negative instances. Therefore its polarized weirdness with respect to the positive label is:

PW_positive(good) = (50 / 3,000) / (5 / 2,000) ≈ 6.67

However, the polarized weirdness of good with respect to the negative label is:

PW_negative(good) = (5 / 2,000) / (50 / 3,000) = 0.15

indicating that good is much more indicative of positiveness than negativeness.
Polarized weirdness can be computed at a low computational cost on any dataset labeled with categorical values, with just tokenization as preprocessing. The outcome of the calculation of the polarized weirdness index is a set of rankings over the vocabulary, one for each label, where the top words in the ranking relative to a given label l are the most characteristic for that label.
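As an illustration, the following Python sketch shows one way the polarized weirdness ranking described above could be computed from a tokenized, labeled dataset. The function name, the smoothing term used to avoid division by zero, and the toy data are assumptions made here for clarity, not part of the original implementation.

```python
from collections import Counter

def polarized_weirdness_ranking(texts, labels, target_label, smoothing=1e-9):
    """Rank vocabulary items by how characteristic they are of target_label.

    texts  : list of token lists (tokenization is the only preprocessing)
    labels : parallel list of categorical labels
    """
    in_counts, out_counts = Counter(), Counter()
    for tokens, label in zip(texts, labels):
        (in_counts if label == target_label else out_counts).update(tokens)

    t_in = sum(in_counts.values())    # total tokens in the target-label subset
    t_out = sum(out_counts.values())  # total tokens in the complementary subset

    scores = {}
    for word, count_in in in_counts.items():
        rel_in = count_in / t_in
        rel_out = out_counts[word] / t_out if t_out else 0.0
        # polarized weirdness: ratio of the two relative frequencies
        scores[word] = rel_in / (rel_out + smoothing)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Toy usage: one ranking is produced per label value.
toy_texts = [["good", "movie"], ["good", "fun"], ["bad", "movie"]]
toy_labels = ["positive", "positive", "negative"]
print(polarized_weirdness_ranking(toy_texts, toy_labels, "positive")[:3])
```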
3.2 Word Embedding Adaptation
In Section 3.1, we introduced an automatic metric that quantifies how characteristic a word is of a certain label. We use this information to move the vectors representing words highly typical of a label closer to each other in the vector space. Formally, once a label has been chosen and the polarized weirdness computed with respect to it, for each pair of vectors in a word embedding model, representing words with polarized weirdness pw1 and pw2 respectively, we compute new representations that are shifted towards each other, where α is a parameter controlling the extent of the adaptation. The result of applying this algorithm is a new word embedding model over the same vocabulary as the original one, in which pairs of word vectors are closer in the space to an extent proportional to their respective polarized weirdness scores.
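The exact pairwise update rule is not reproduced here. Below is a minimal sketch of one plausible instantiation, in which each vector of a pair moves towards the other by a step controlled by α and by the (normalized) polarized weirdness of the partner word; this particular weighting, the normalization constant, and the toy data are assumptions, since the text only states that vectors end up closer to an extent proportional to their scores. In practice, following footnote 3, the update can be restricted to the top-ranked weird words to keep the number of pairs manageable.

```python
import numpy as np

def adapt_pair(v1, v2, pw1, pw2, alpha=0.5, max_pw=10.0):
    """Shift a pair of word vectors towards each other.

    The step applied to each vector is controlled by alpha and grows with
    the polarized weirdness of the *other* word of the pair; dividing by
    max_pw keeps each step in [0, alpha]. Both choices are illustrative
    assumptions, not the exact update rule used in the paper.
    """
    step1 = alpha * min(pw2, max_pw) / max_pw   # how far v1 moves towards v2
    step2 = alpha * min(pw1, max_pw) / max_pw   # how far v2 moves towards v1
    return v1 + step1 * (v2 - v1), v2 + step2 * (v1 - v2)

# Toy usage with 64-dimensional vectors (the Polyglot dimensionality).
rng = np.random.default_rng(0)
v_wall, v_insult = rng.normal(size=64), rng.normal(size=64)
v_wall_new, v_insult_new = adapt_pair(v_wall, v_insult, pw1=3.2, pw2=8.7)
```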
4. Experimental Evaluation
We test the word embedding adaptation introduced in Section 3 by adapting pre-trained multilingual word embeddings to three different tasks. For each task, the polarized weirdness index is computed on the labeled training set as described in Section 3.1, and the generic word embeddings are adapted to the particular task domain by applying the algorithm described in Section 3.2.
Our baseline model is a convolutional neural network (CNN) with a 64×8 convolutional hidden layer with Rectified Linear Unit (ReLU) activation, followed by a max pooling layer of size 4. We use the implementation from the Keras Python library, with Adam optimization (Kingma and Ba 2014), leaving the hyperparameters at their default values, except for the learning rate (set between 10^-2 and 10^-3 depending on the dataset) and the number of epochs (between 10 and 25).
We use the multilingual word embeddings provided by Polyglot (Al-Rfou, Perozzi, and Skiena 2013). These are distributed word representations for over 100 languages, trained on Wikipedia. The vector representations of words in Polyglot are 64-dimensional. The choice of this model is motivated by the need for word embedding models for different languages created with the same method, in order to measure improvements introduced solely by our adaptation method. In these experiments, we set α = 0.5.
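For concreteness, below is a minimal Keras sketch approximating the baseline described above: an embedding layer initialized with (adapted) 64-dimensional Polyglot vectors, a convolutional layer with 64 filters of width 8 and ReLU activation, max pooling of size 4, and a sigmoid output for binary prediction. The vocabulary size, sequence length, and the final flatten/dense layers are assumptions not specified in the text, and the zero matrix stands in for the actual pre-trained vectors.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

vocab_size, embed_dim, max_len = 20000, 64, 50        # assumed values, not from the paper
embedding_matrix = np.zeros((vocab_size, embed_dim))  # fill with (adapted) Polyglot vectors

model = keras.Sequential([
    keras.Input(shape=(max_len,)),
    layers.Embedding(vocab_size, embed_dim,
                     embeddings_initializer=keras.initializers.Constant(embedding_matrix)),
    layers.Conv1D(64, 8, activation="relu"),   # 64x8 convolution with ReLU
    layers.MaxPooling1D(pool_size=4),          # max pooling of size 4
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),     # binary prediction (e.g., hateful vs. not)
])
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss="binary_crossentropy", metrics=["accuracy"])
```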
4.1 Experiment 1: Multilingual Hate Speech Detection
In the first experiment, the generic word embeddings are adapted to provide a better representation of words used in online messages containing hate speech towards women and immigrants. We use the dataset provided by SemEval-2019 Task 5 (HatEval: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter), a public challenge where participants are invited to submit the predictions of systems for hate speech detection (Basile et al. 2019). In particular, we employ the data of subtask A, where the prediction is binary (hateful vs. not hateful). The shared task website2 provides datasets in Spanish and English, already divided into training, development and test sets. The messages mainly concern two topics, women and immigrants, in a fairly balanced proportion; in fact, the dataset was created by querying the Twitter API with a set of keywords crafted to capture these two topics. The English dataset comprises 13,000 tweets (10,000 for training and 3,000 for testing), with about 42% of the messages labeled as hateful. The Spanish dataset is smaller (6,600 tweets in total, 5,000 for training and 1,600 for testing), and it follows a similar distribution of topics and labels as the English set. Following are two examples of tweets from the English HatEval data, with their hate speech label:
I’d say electrify the water but that would kill wildlife. #SendThemBack
label: yes
Polish Prime Minister Mateusz Morawiecki insisted that Poland would push against any discussion on refugee relocations as part of the EU’s migration politics.
label: no
Similarly, two examples of tweets from the Spanish HatEval data, with translation and label:
@rubenssambueza eres una basura de persona, lo cual no me sorprende porque eres SUDACA, y asi son los tercermundistas
@rubenssambueza you are a garbage person, which does not surprise me because you are a SUDACA, and that's how third-worlders are
label: yes
Yo creía que ese jueguito solo existía para los árabes, jajaja.
I thought that this little game only existed for Arabs, ahahah.
label:
The polarized weirdness of the words in the HatEval datasets (English and Spanish) is computed on the respective training sets as the ratio of their relative frequency in hateful messages over their relative frequency in non-hateful messages. A modified version of the Polyglot embeddings is then computed3, and the performance of the CNN initialized with the adapted embeddings is compared with the performance obtained by initializing the CNN with the generic embeddings.
Table 1: Results of the English and Spanish hate speech detection experiments, for the negative (no-HS) and positive (HS) classes, and their macro-averaged F1.
Model | Acc. | no-HS Pr. | no-HS R. | no-HS F1 | HS Pr. | HS R. | HS F1 | Avg. F1
English | ||||||||
CNN | .468 | .567 | .401 | .470 | .398 | .564 | .466 | .468 |
CNN+W | .482 | .588 | .394 | .472 | .413 | .608 | .492 | .482 |
Spanish | ||||||||
CNN | .528 | .592 | .595 | .594 | .437 | .434 | .436 | .515 |
CNN+W | .527 | .614 | .497 | .549 | .450 | .568 | .502 | .527 |
The results on the English dataset, presented in Table 1, show a clear improvement in the detection of hateful messages, with the macro-averaged F1-score rising from .468 to .482. Recall on the hateful class is particularly impacted by the adapted embeddings, indicating that the modified model successfully helps in correcting false negatives.
The results on the Spanish HatEval dataset, also presented in Table 1, are even better than those on English, with improvements in precision for both classes and in recall for the positive class, and an overall gain in macro-averaged F1-score from .515 to .527. As for English, the largest improvement is measured on the recall of the hateful class.
Table 2: Examples of words from the HatEval datasets, showing how their vector representations move to reflect the semantic shift: words that are generally neutral get closer to offensive words in the hate speech context.
Word embeddings | Generic word | Offensive word | Semantic shift | Cosine distance |
Polyglot EN | wall | fuck | yes | 1.224 |
Polyglot EN + P.W. | wall | fuck | yes | 0.444 |
Polyglot EN | car | fuck | no | 1.279 |
Polyglot EN + P.W. | car | fuck | no | 1.413 |
Polyglot ES | directora (director (F)) | puta (whore) | yes | 1.271 |
Polyglot ES + P.W. | directora (director (F)) | puta (whore) | yes | 1.222 |
Polyglot ES | director (director (M)) | puta (whore) | no | 1.366 |
Polyglot ES + P.W. | director (director (M)) | puta (whore) | no | 1.411 |
One of the advantages of the proposed method is that it is transparent with respect to the semantic shift computed on the pre-trained embeddings. Firstly, the words with the highest polarized weirdness index can be extracted, to gain insights into the specificity of the datasets. The top twenty weird words in the hateful English HatEval set are the following: nodaca, enddaca, kag, womensuck, @hillaryclinton, americafirst, trump2020, taxpayers, buildthewallnow, illegals, @senatemajldr, dreamer, buildthewall, they, @potus, walkawayfromdemocrat, votedemsout, wethepeople, illegalalien, backtheblue. The top twenty weird words in the hateful Spanish HatEval set, with English translations, are the following: mantero (street vendor), turista (tourist), negratas (nigger), caloría (calorie), sanidad (healthcare), drogar (to drug), paises (countries), emigrante (immigrant), Hija (daughter), ZORRA (bitch), impuesto (tax), zorro (bitch (masculine)), totalmente (totally), lleno (full), invasor (invader), costumbre (custom), barrio (neighborhood), PAIS (country), Oye (hey), Españoles (Spaniards).
Secondly, one can extract the word embeddings after the polarized weirdness adaptation is applied and qualitatively inspect their respective positions in the vector space. Table 2 shows how certain pairs of words become more related in the adapted space, while others are unaffected or even pushed slightly apart. The Spanish example is particularly interesting (and worrying): a misogynistic derogatory word (puta) becomes closer to the feminine inflection of “director” but not to the masculine inflection.
4.2 Experiment 2: Gender Prediction
In the second experiment, we test our word embedding adaptation method in a different scenario, namely the prediction of the gender of the author of a message. The assumption is that the words most typically used by each gender will cluster in the vector space, thus helping the model discriminate between the two classes.
We use the dataset distributed for the Cross-Genre Gender Prediction in Italian (GxG) shared task at the 2018 edition of EVALITA, the evaluation campaign of language technologies for Italian (Dell’Orletta and Nissim 2018). Participants in the shared task are invited to submit their system’s predictions of the author’s gender for a set of short and medium-length texts in Italian from different sources, including social media, news articles and personal diaries. The task is therefore a binary classification, evaluated by means of accuracy. We downloaded the data from the task website, comprising 22,874 instances divided into a training set (11,000) and a test set (10,874). The labels of GxG are perfectly balanced between M (male) and F (female).
Following are two examples of instances from the GxG dataset, with their label and translation:
@ElfoBruno no la barba la devo tenere lunga per sembrare folta perchè in realtà è rada...
@ElfoBruno no I have to keep the beard long to make it look thick because it really is patchy...
label: M
Sabato prossimo sono davvero curiosa di scoprire cosa farà @Valerio_Scanu a #BallandoConLeStelle
Next Saturday I am very curious to find out what @Valerio_Scanu will do at #DancingWithTheStars
label: F
Table 3: Results of the Gender Prediction.
Model | Acc. | Female Pr. | Female R. | Female F1 | Male Pr. | Male R. | Male F1 | Avg. F1
CNN | .511 | .507 | .879 | .643 | .543 | .143 | .227 | .435 |
CNN+W | .513 | .508 | .851 | .636 | .539 | .174 | .263 | .450 |
Since this is a classification rather than a detection task, the process is slightly different from the previous experiment, to account for the symmetry between the labels. First, the polarized weirdness is computed on the training set twice: once on the texts written by males (against the texts written by females) and once on the texts written by females (against the texts written by males). Then the generic Polyglot embeddings are adapted by applying the algorithm of Section 3.2 twice, once per direction, using the respective weirdness rankings. The adapted embeddings are used to initialize the CNN, resulting in the classification performance presented in Table 3. The overall performance improves when the adapted embeddings are included in the model. However, the classification of the male label improves while that of the female label does not, due to the difference in recall.
Qualitative analysis reveals interesting patterns, confirming that strong bias is present in some pre-trained word embedding models. The top twenty weird words in the male GxG set are: costituzionale (constitutional), socialisto (socialist), Lecce (name of a city and a football club), DALLA (name of a singer), utente (user), Samp (name of a football team), Sampdoria (name of a football team), Nera (black), allenatore (coach), Orlando (proper name), Bp (acronym), ni (yes and no), maresciallo (marshal), garanzia (guarantee), cerare (to wax), voluto (willing), pilotare (to pilot), disco (disco), caserma (barracks), From (proper name).
The top twenty weird words in the female GxG set are instead the following: qualcuna (someone (feminine)), HEART EMOJI, Qualcuna (someone (feminine)), KISS EMOJI, 83 (number), essi (them), leonessa (lioness), Sarah (proper name), 06 (number), HEART-EYED EMOJI, nervoso (nervous), James (proper name), Dante (proper name), coreografia (choreography), Strada (street), Fra (proper name), Chiama (call), en (en), bravissimi (very good (plural)), Moratti (proper name). Arguably, a stronger topic bias (football) is present in the male subset, possibly explaining the better performance induced by the adaptation on that class.
5. Conclusion and Future Work
In this work, we employed an extension of the weirdness index to score the words in a labeled corpus according to how typical they are of a given label. The polarized weirdness score is used to automatically adapt an existing word embedding space to better reflect task-specific semantic associations of words. We measured a performance boost on hate speech detection in English and Spanish, and on gender prediction in Italian.
On detection tasks, the improvement brought by our method is most notable in terms of recall, indicating the potential of weirdness-adapted word embeddings to correct false negatives. This result is in line with the original motivation for this approach, i.e., to account for the semantic shift occurring in domain-specific corpora of opinionated content. For instance, in the hate speech domain, the adapted embeddings are able to capture that certain otherwise neutral words (e.g., “wall”) assume a polarized (e.g., negatively charged) connotation.
The results of this study are promising, and they encourage us to extend the method to richer representations (e.g., “weird” n-grams) and to languages beyond the European ones, and to integrate it into more sophisticated deep neural models. Recent Transformer models, in particular, compute contextualized embeddings, thereby incorporating transformations similar in spirit to the present method. Although such models are less transparent with respect to these transformations, an experimental comparison is among the next steps planned in this research.
Bibliographie
Khurshid Ahmad, Lee Gillam, and Lena Tostevin. 1999. “University of Surrey Participation in TREC8: Weirdness Indexing for Logical Document Extrapolation and Retrieval (WILDER).” In Proceedings of the Eighth Text REtrieval Conference (TREC-8), Gaithersburg, Maryland, November.
Rami Al-Rfou, Bryan Perozzi, and Steven Skiena. 2013. “Polyglot: Distributed Word Representations for Multilingual Nlp.” In Proceedings of the Seventeenth Conference on Computational Natural Language Learning, 183–92.
Jeremy Barnes, Roman Klinger, and Sabine Schulte im Walde. 2018a. “Bilingual Sentiment Embeddings: Joint Projection of Sentiment Across Languages.” In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2483–93. Melbourne, Australia: Association for Computational Linguistics. http://aclweb.org/anthology/P18-1231.
Jeremy Barnes, Roman Klinger, and Sabine Schulte im Walde. 2018b. “Projecting Embeddings for Domain Adaption: Joint Modeling of Sentiment Analysis in Diverse Domains.” In Proceedings of the 27th International Conference on Computational Linguistics, 818–30. Santa Fe, New Mexico, USA: Association for Computational Linguistics. http://aclweb.org/anthology/C18-1070.
Valerio Basile, Cristina Bosco, Elisabetta Fersini, Debora Nozza, Viviana Patti, Francisco Rangel, Paolo Rosso, and Manuela Sanguinetti. 2019. “SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter.” In Proceedings of the 13th International Workshop on Semantic Evaluation (Semeval-2019). Minneapolis, Minnesota: Association for Computational Linguistics.
Cristina Bosco, Felice Dell’Orletta, Fabio Poletto, Manuela Sanguinetti, and Maurizio Tesconi. 2018. “Overview of the EVALITA 2018 Hate Speech Detection Task.” In Proceedings of the Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2018) Co-Located with the Fifth Italian Conference on Computational Linguistics (Clic-It 2018), Turin, Italy, December 12-13, 2018. http://ceur-ws.org/Vol-2263/paper010.pdf.
Felice Dell’Orletta, and Malvina Nissim. 2018. “Overview of the EVALITA 2018 Cross-Genre Gender Prediction (Gxg) Task.” In Proceedings of the Sixth Evaluation Campaign of Natural Language Processing and Speech Tools for Italian. Final Workshop (EVALITA 2018) Co-Located with the Fifth Italian Conference on Computational Linguistics (Clic-It 2018), Turin, Italy, December 12-13, 2018. http://ceur-ws.org/Vol-2263/paper006.pdf.
Elisabetta Fersini, Paolo Rosso, and Maria Anzovino. 2018. “Overview of the Task on Automatic Misogyny Identification at Ibereval 2018.” In IberEval@SEPLN, 2150:214–28. CEUR Workshop Proceedings. CEUR-WS.org.
Komal Florio, Valerio Basile, Marco Polignano, Pierpaolo Basile, and Viviana Patti. 2020. “Time of Your Hate: The Challenge of Time in Hate Speech Detection on Social Media.” Applied Sciences 10 (12). https://doi.org/10.3390/app10124180.
Diederik P. Kingma, and Jimmy Ba. 2014. “Adam: A Method for Stochastic Optimization.” CoRR abs/1412.6980. http://dblp.uni-trier.de/db/journals/corr/corr1412.html#KingmaB14.
S. Kullback, and R. A. Leibler. 1951. “On Information and Sufficiency.” The Annals of Mathematical Statistics 22 (1): 79–86. http://www.jstor.org/stable/2236703.
Ted Pedersen. 2010. “Information Content Measures of Semantic Similarity Perform Better Without Sense-Tagged Text.” In Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 329–32. Los Angeles, California: Association for Computational Linguistics. https://www.aclweb.org/anthology/N10-1047.
Matthew Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, and Luke Zettlemoyer. 2018. “Deep Contextualized Word Representations.” In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2227–37. New Orleans, Louisiana: Association for Computational Linguistics.
Footnotes
2 https://competitions.codalab.org/competitions/19935
3 To speed up the computation without major loss of information, we consider only the top 2,000 items from the weirdness ranking.
Author
University of Turin – valerio.basile@unito.it