
The concept of unsupervised universal sentence encoders has gained traction recently, wherein pre-trained models generate effective task-agnostic fixed-dimensional representations for phrases, sentences and paragraphs. Such methods are of varying complexity, from simple weighted averages of word vectors to complex language models based on bidirectional transformers. In this work we propose a novel technique to generate sentence embeddings in an unsupervised fashion by projecting the sentences onto a fixed-dimensional manifold with the objective of preserving local neighbourhoods in the original space. To delineate such neighbourhoods we experiment with several set-distance metrics, including the recently proposed Word Mover’s distance, while the fixed-dimensional projection is achieved by employing a scalable and efficient manifold approximation method rooted in topological data analysis. We test our approach, which we term EMAP or Embeddings by Manifold Approximation and Projection, on six publicly available text-classification datasets of varying size and complexity. Empirical results show that our method consistently performs comparably to or better than several alternative state-of-the-art approaches.

Contrastive Multi-document Question Generation

Multi-document question generation focuses on generating a question that covers the common aspect of multiple documents. Such a model is useful in generating clarifying options. However, a naive model trained only on the targeted (‘positive’) document set may generate overly generic questions that cover a larger scope than delineated by the document set. To address this challenge, we introduce a contrastive learning strategy in which, given ‘positive’ and ‘negative’ sets of documents, we generate a question that is closely related to the ‘positive’ set but far from the ‘negative’ set.

Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis 31 papers.
Proceedings of the Sixth Arabic Natural Language Processing Workshop 54 papers.
Proceedings of the Eighth Workshop on NLP for Similar Languages, Varieties and Dialects 17 papers.
Proceedings of the First Workshop on Language Technology for Equality, Diversity and Inclusion 31 papers.
Proceedings of the 12th International Workshop on Health Text Mining and Information Analysis 12 papers.
Proceedings of the Third Workshop on Beyond Vision and LANguage: inTEgrating Real-world kNowledge (LANTERN) 6 papers.
Proceedings of the Workshop on Human Evaluation of NLP Systems (HumEval) 16 papers.

Proceedings of the First Workshop on Bridging Human–Computer Interaction and Natural Language Processing 19 papers.
Proceedings of the EACL Hackashop on News Media Content Analysis and Automated Report Generation 20 papers.
Proceedings of the 11th Global Wordnet Conference 35 papers.
Proceedings of the First Workshop on Speech and Language Technologies for Dravidian Languages 54 papers.
Proceedings of the 8th Workshop on Balto-Slavic Natural Language Processing 16 papers.
Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications 24 papers.
Proceedings of the Second Workshop on Domain Adaptation for NLP 27 papers.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Tutorial Abstracts 6 papers.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Student Research Workshop 27 papers.

Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations 40 papers.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume 327 papers.
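
The EMAP abstract above describes delineating sentence neighbourhoods with set-distance metrics such as Word Mover's distance before projecting onto a fixed-dimensional manifold. As a rough illustration of the first step only, the sketch below computes a pairwise distance matrix with the relaxed Word Mover's Distance, a cheaper lower bound on the full metric. It is not the authors' implementation: the toy word vectors, the helper names, and the uniform word weighting are all assumptions made for this example, and every sentence is assumed to contain at least one known word.

```python
import math

# Toy 2-D word vectors, invented for illustration (not from a trained model).
TOY_VECTORS = {
    "good": (1.0, 0.0),
    "great": (0.9, 0.1),
    "film": (0.0, 1.0),
    "movie": (0.1, 0.9),
    "bad": (-1.0, 0.0),
}


def embed(sentence):
    """Represent a sentence as the list of vectors of its known words."""
    return [TOY_VECTORS[w] for w in sentence.lower().split() if w in TOY_VECTORS]


def directed_cost(a, b):
    """Average Euclidean distance from each vector in a to its nearest vector in b."""
    return sum(min(math.dist(u, v) for v in b) for u in a) / len(a)


def relaxed_wmd(a, b):
    """Relaxed Word Mover's Distance with uniform word weights.

    Each word moves all of its mass to the closest word on the other side;
    the larger of the two directed costs is a lower bound on the full WMD.
    """
    return max(directed_cost(a, b), directed_cost(b, a))


def distance_matrix(sentences):
    """Pairwise set distances between sentences (symmetric, zero diagonal)."""
    sets = [embed(s) for s in sentences]
    return [[relaxed_wmd(x, y) for y in sets] for x in sets]


sentences = ["good film", "great movie", "bad film"]
D = distance_matrix(sentences)
```

A matrix like `D` could then be handed to a manifold approximation method that accepts precomputed distances, for example `umap.UMAP(metric="precomputed")` from the `umap-learn` package. That choice is an assumption here, since the abstract does not name a library.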
