2021-11-05

Example Sentences:

  • Yong Zhuang: Multiple regression/correlation analysis (MRC) is a highly general and therefore very flexible data-analytic system that may be used whenever a quantitative variable (the dependent variable) is to be studied as a function of, or in relationship to, any factors of interest (expressed as independent variables).
  • Tianyu Kang: Furthermore, we demonstrate through a series of empirical results that our approach allows for a smooth tradeoff between memorization and generalization and exhibits some of the most salient characteristics of neural networks: depth improves performance; random data can be memorized and yet there is generalization on real data; and memorizing random data is harder in a certain sense than memorizing real data.

Before & After:

  • Hefei Qiu
    • before: In the way of constructing positive pairs, different from previous methods of using data augmentation which are usually sophisticated or uninterpretable, we apply semantic similarity or entailment relation of two sentences which is easy to implement and easy to understand.

    • after: To construct positive sentence pairs, we directly use the sentence similarity scores or sentence entailment relations in existing, well-recognized public corpora. This approach differs from previous methods that augment data either by translating a sentence from a source language into a target language and then translating it back to generate a positive instance, or by generating a new sentence embedding that is close to the original one without knowing what sentence it corresponds to in human language.

  • Tianyu Kang
    • before: There must be a distribution shift between training and testing sample sets, otherwise they should have the same performance.

    • after: There is no panacea for generalization: a distribution shift between the training and testing sample sets always exists; otherwise, they would perform the same.

  • Yong Zhuang
    • before: Streaming feature selection has always been considered a superior technique for selecting the relevant subset features from highly dimensional data and thus reducing learning complexity.

    • after: Streaming feature selection has always been a superior technique to interweave the feature generation process with the feature testing process to avoid generating features that are unlikely to be helpful.