2022-01-21

Example Sentences:

  • Chengjie Zheng: Designing proper self-supervised-learning principles for GNNs is crucial, as they drive what information of graph-structured data will be captured by GNNs and may heavily impact their performance in downstream tasks. source

  • Yong Zhuang: Given a long time series X and a look-back window of fixed length T, at timestamp t, time series forecasting is to predict $\hat{X}_{t+1:t+\gamma} = \{x_{t+1}, \ldots, x_{t+\gamma}\}$ based on the past T steps $X_{t-T+1:t} = \{x_{t-T+1}, \ldots, x_t\}$. Here, $\gamma$ is the length of the forecast horizon, $x_t \in \mathbb{R}^d$ is the value at time step t, and d is the number of variates. For simplicity, in the following we will omit the subscripts and use $X$ and $\hat{X}$ to represent the historical data and the prediction, respectively. When $\gamma > 1$, we can either directly optimize the multi-step forecasting objective (direct multi-step (DMS) estimation), or learn a single-step forecaster and iteratively apply it to get multi-step predictions (iterated multi-step (IMS) estimation). source (a short DMS/IMS sketch follows this list)

  • Hefei Qiu: Constructed entirely from standard ConvNet modules, ConvNeXts compete favorably with Transformers in terms of accuracy and scalability, achieving 87.8% ImageNet top-1 accuracy and outperforming Swin Transformers on COCO detection and ADE20K segmentation, while maintaining the simplicity and efficiency of standard ConvNets. source

  • Tianyu Kang: To address these challenges, we seek models that perform well on each sub-corpus rather than those that achieve good (average) performance by focusing on the easy examples and domains. Thus, in this paper we develop procedures that control performance over all large enough subpopulations, agnostic to the distribution of each subpopulation. source

  • Zihan Li: Unlike the word tokens that serve as the basic elements of processing in language Transformers, visual elements can vary substantially in scale, a problem that receives attention in tasks such as object detection. source

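A short, purely illustrative sketch of the DMS/IMS distinction in Yong Zhuang's sentence above. The function names and the toy mean-of-window forecaster are made up for this note, not taken from the cited paper; shapes follow the sentence's notation of a length-T look-back window, horizon gamma, and d variates.

    import numpy as np

    # Toy one-step forecaster: predicts the mean of the look-back window.
    # (A hypothetical stand-in for any learned single-step model.)
    def one_step_model(window):
        return window.mean(axis=0)

    def ims_forecast(one_step, x_hist, gamma):
        # Iterated multi-step (IMS): apply a single-step forecaster gamma times,
        # feeding each prediction back into the sliding look-back window.
        window = [row for row in x_hist]          # last T observations, each of shape (d,)
        preds = []
        for _ in range(gamma):
            x_next = one_step(np.array(window))   # one-step-ahead prediction, shape (d,)
            preds.append(x_next)
            window = window[1:] + [x_next]        # slide the window forward by one step
        return np.stack(preds)                    # X_hat, shape (gamma, d)

    def dms_forecast(multi_step, x_hist, gamma):
        # Direct multi-step (DMS): one model maps the look-back window to all
        # gamma future steps at once and is trained on that multi-step objective.
        return multi_step(np.array(x_hist), gamma)  # X_hat, shape (gamma, d)

    # Example: T = 8 past steps of a d = 3 variate series, horizon gamma = 4.
    X = np.random.randn(8, 3)
    X_hat = ims_forecast(one_step_model, X, gamma=4)
    print(X_hat.shape)  # (4, 3)

The practical trade-off: IMS reuses one simple single-step forecaster but compounds its errors across the horizon, while DMS avoids that error accumulation at the cost of learning a harder multi-step mapping.
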
Before & After:

  • Chengjie Zheng
    • before: The purpose of data augmentation is to achieve reasonable data representation without affecting the semantics label.

    • after: Data augmentation aims at creating novel and realistically rational data through applying certain transformations without affecting the semantics label.

  • Tianyu Kang
    • before: We adopt the standard episodic training procedure, and sample a set of N-way K-shot training tasks to mimic the N-way K-shot test setting. In all settings, the query number is set to 15 and the performance is averaged over 10,000 randomly generated episodes from the test set. We construct these episodes by randomly sampling images instead of just dividing images into different episodes. Hence, an image could be processed in the model more than once. Due to the large number of episodes, this should not cause any bias to the model.

    • after: We adopt the standard episodic training procedure, and sample a set of N-way K-shot training tasks to mimic the N-way K-shot test setting. The query number is set to 15 and the performance is averaged over 10,000 randomly generated episodes from the test set. Furthermore, a series of empirical results confirms that constructing episodes by randomly sampling images instead of just dividing them into different episodes, which allows an image to be processed more than once, causes no bias, owing to the large number of episodes. (See the episode-sampling sketch at the end of this page.)

  • Zihan Li
    • In this paper, we seek to expand the applicability of Transformer such that it can serve as a general-purpose backbone for computer vision, as it does for NLP and as CNNs do in vision. source
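
As a companion to Tianyu Kang's episodic-training sentence, here is a minimal sketch of how N-way K-shot episodes with 15 queries might be sampled. The function name sample_episode and the toy images_by_class dictionary are made up for illustration and are not from the cited paper.

    import random

    def sample_episode(images_by_class, n_way=5, k_shot=1, n_query=15, rng=random):
        # One N-way K-shot episode: pick N classes, then K support and
        # n_query query images per class. Images are re-sampled at random for
        # every episode rather than partitioned once, so an image may appear
        # in many episodes; averaged over ~10,000 episodes this is harmless.
        classes = rng.sample(sorted(images_by_class), n_way)
        support, query = [], []
        for label, cls in enumerate(classes):
            picks = rng.sample(images_by_class[cls], k_shot + n_query)
            support += [(img, label) for img in picks[:k_shot]]
            query += [(img, label) for img in picks[k_shot:]]
        return support, query

    # Hypothetical data: 20 classes with 30 image ids each.
    images_by_class = {c: ["img_%d_%d" % (c, i) for i in range(30)] for c in range(20)}
    support, query = sample_episode(images_by_class, n_way=5, k_shot=5)
    print(len(support), len(query))  # 25 75

Within each episode the classes are re-labeled 0..N-1, which is a common convention for few-shot classifiers.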