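Congratulations to Dr. Tianyuan Liu of Our Lab on the Acceptance of a Paper at AAAI 2023, a Flagship Conference in Artificial Intelligence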

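The paper "Unsupervised Paraphrasing under Syntax Knowledge", with Dr. Tianyuan Liu of our lab as first author (authors: Tianyuan Liu, Yuqing Sun, Jiaqi Wu, Xi Xu, Yuchen Han, Cheng Li, Bin Gong), has been accepted by AAAI 2023, a flagship conference in artificial intelligence.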

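AAAI 2023 is the 37th AAAI Conference on Artificial Intelligence, scheduled to be held in Washington, D.C., USA, on February 7-14, 2023. AAAI stands for the Association for the Advancement of Artificial Intelligence, a major academic organization in the field that brings together leading artificial intelligence experts and scholars from around the world. The AAAI conference it hosts is a flagship conference in artificial intelligence research and is listed as a Class A conference in artificial intelligence on the China Computer Federation (CCF) recommended list.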

The main content of the paper is as follows:

Unsupervised Paraphrasing under Syntax Knowledge

Abstract—The soundness of syntax is an important consideration for the paraphrase generation task. Most methods control the syntax of paraphrases by embedding the syntax and semantics as latent information in the generation process such that they cannot guarantee the syntactical correctness of results. Different from them, in this paper we investigate the structural patterns of word usages termed as the word composable knowledge and integrate it into the paraphrase generation to control the syntax in an explicit way. This syntax knowledge is learned by pretraining on a large corpus with the dependency relationships and formed as the probabilistic function on word-level syntactical soundness. For sentence-level correctness, we design a hierarchical syntax structure loss function, which quantifies the dependency relations in the whole paraphrase against the given template. Thus, the generation process can select the appropriate words taking into account both the semantics and the syntax. The proposed method is evaluated on a few paraphrase datasets. The experimental results show that the quality of paraphrases by our proposed method outperforms the comparison methods, especially in terms of syntax correctness.

Unsupervised Text Paraphrasing Based on Syntax Knowledge

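Abstract: In paraphrase generation, syntactic soundness is an important consideration. Most existing paraphrasing methods control syntax and semantics through latent vectors and therefore cannot guarantee that the results are syntactically sound. In this paper, we investigate the syntactic structural patterns of word usage, which form word composable knowledge, and integrate this knowledge into the paraphrase generation process to control the syntax of the generated content explicitly. This composable knowledge models the dependency relations between words and yields a function that estimates word-level syntactic soundness. During paraphrasing, a hierarchical syntax structure loss function quantifies whether the generated sentence satisfies the given syntactic structure, which ensures syntactic soundness at the sentence level. Together, these allow the generation process to take both semantics and syntax into account when selecting words. The proposed method was tested on multiple paraphrase datasets. The experimental results show that the paraphrases generated by our method are better than those of the comparison methods, with particularly strong performance in syntactic correctness.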