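Congratulations to Dr. Yi Xie (谢翌) of our lab on the acceptance of a paper at ICDE 2022, a top international database conference
The paper "Subspace Embedding Based New Paper Recommendation" (authors: Yi Xie (谢翌), Wen Li (李稳), Yuqing Sun (孙宇清), Elisa Bertino), with Dr. Yi Xie of our lab as first author, has been accepted by ICDE 2022, a top international database conference.
ICDE 2022 is the 38th IEEE International Conference on Data Engineering. It is scheduled for May 9-12, 2022, in Kuala Lumpur, Malaysia, and will be held as an online conference. ICDE is one of the most authoritative top-tier international academic conferences in the database field and a CCF-recommended Class A conference. Together with SIGMOD and VLDB, it is regarded as one of the three top international conferences in the database field.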
Abstract: As huge numbers of academic papers are published every year, it is critical to be able to recommend high-quality papers. The typical evaluation method for papers is to use citation information, which, however, is not applicable to new papers. To address this shortcoming, in this paper we consider a novel perspective on the association between the content difference of a paper, with respect to other papers, and its innovation. Since innovation often has domain-specific characteristics and forms, we introduce the concept of subspace to describe the commonly recognized aspects of paper contents, namely background, methods, and results. A set of expert rules is formalized to annotate the differences between papers, based on which a twin network is proposed for learning the embeddings of papers in different subspaces. A series of empirical studies show that there are clear correlations between a paper's influence and its difference from other papers in those subspaces. The results also show the characteristics of innovation in different scientific disciplines. To take into account information about academic networks for paper recommendation, we propose a graph convolutional neural method to combine the paper content with other related elements, where user interests and academic influences are modeled asymmetrically. Experimental results on real datasets show that our method is more effective than other baseline methods for new paper recommendation. We also discuss the characteristics of scientific disciplines and authors to show the effectiveness of modeling the asymmetric user interests and influences. Finally, we verify the reusability of our method on a patent dataset. The results show that it is also applicable to academic data with low-resource features.
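To make the twin-network idea in the abstract more concrete, here is a minimal sketch of how two papers might be compared subspace by subspace. It is not the authors' implementation. The linear-plus-tanh encoder, the Euclidean distance, the contrastive loss, and all names (`make_encoder`, `encode`, `subspace_distances`, `contrastive_loss`) and feature values are assumptions made for illustration only. The only elements taken from the abstract are the three subspaces and the use of shared ("twin") encoders trained from rule-annotated differences between papers.

```python
import math
import random

# The three subspaces named in the abstract.
SUBSPACES = ("background", "methods", "results")


def make_encoder(in_dim, out_dim, seed):
    """Create a random weight matrix (hypothetical stand-in for a trained encoder)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)] for _ in range(out_dim)]


def encode(weights, features):
    """Project a feature vector with a linear map followed by tanh."""
    return [math.tanh(sum(w * x for w, x in zip(row, features))) for row in weights]


def distance(u, v):
    """Euclidean distance between two embeddings."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def subspace_distances(encoders, paper_a, paper_b):
    """Embed both papers with the SAME encoder per subspace (the 'twin' part)
    and return the embedding distance in each subspace."""
    return {
        s: distance(encode(encoders[s], paper_a[s]), encode(encoders[s], paper_b[s]))
        for s in SUBSPACES
    }


def contrastive_loss(dist, is_different, margin=1.0):
    """Assumed training signal: pairs annotated as different are pushed at least
    `margin` apart; pairs annotated as similar are pulled together."""
    if is_different:
        return max(0.0, margin - dist) ** 2
    return dist ** 2


# One shared encoder per subspace; weights are random here, not learned.
encoders = {s: make_encoder(4, 2, seed=i) for i, s in enumerate(SUBSPACES)}

# Made-up feature vectors; a real system would derive them from paper text.
paper_a = {
    "background": [1.0, 0.0, 0.5, 0.2],
    "methods": [0.1, 0.9, 0.3, 0.0],
    "results": [0.4, 0.4, 0.1, 0.7],
}
paper_b = {
    "background": [1.0, 0.0, 0.5, 0.2],  # same background as paper_a
    "methods": [0.8, 0.1, 0.0, 0.6],
    "results": [0.0, 0.2, 0.9, 0.3],
}

dists = subspace_distances(encoders, paper_a, paper_b)
for name in SUBSPACES:
    print(name, round(dists[name], 4))
```

Because both papers pass through the same encoder in each subspace, identical inputs always produce a distance of zero, so the per-subspace distances can be read as a rough measure of how much a paper differs from another in its background, methods, or results.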