  1. Jul 2019
    1. YOLOv2 [35] is proposed to improve YOLO in several aspects, i.e., add batch normalization on all convolution layers, use a high-resolution classifier, use convolution layers with anchor boxes to predict bounding boxes instead of the fully connected layers, etc.
  2. Jun 2019
    1. Similar to SSD [29], we use hard negative mining to mitigate the extreme foreground-background class imbalance
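      The hard negative mining step noted above can be sketched in NumPy; the 3:1 negative-to-positive ratio follows SSD, while the array shapes and function name are illustrative assumptions:

```python
import numpy as np

def hard_negative_mining(cls_loss, labels, neg_pos_ratio=3):
    """Keep all positive anchors and only the hardest negatives.

    cls_loss: per-anchor classification loss, shape (N,)
    labels:   per-anchor labels, 0 = background (negative), >0 = object
    Returns a boolean mask of the anchors to keep for the loss.
    """
    pos_mask = labels > 0
    num_pos = int(pos_mask.sum())
    num_neg = neg_pos_ratio * num_pos          # SSD uses a 3:1 neg:pos ratio

    # Rank negatives by loss (highest first) and keep only the top num_neg.
    neg_loss = np.where(pos_mask, -np.inf, cls_loss)
    order = np.argsort(-neg_loss)
    neg_mask = np.zeros_like(pos_mask)
    neg_mask[order[:num_neg]] = True

    return pos_mask | neg_mask
```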
  3. May 2019
    1. Uses the sliced earth mover's (Wasserstein) distance to regularize the autoencoder, penalizing the distance between the distribution of encoded training samples and a predefined sample distribution.

    1. FaceFeat-GAN: a Two-Stage Approach for Identity-Preserving Face Synthesis

      Stage one learns attribute and expression features from the training data; stage two generates the images.

    1. Attention Augmented Convolutional Networks

      Adds very few parameters but greatly increases computation, so it has a large impact on model speed (over 10x slower).

    1. The overall framework: each word's embedding is transformed by three different W matrices into three vector representations q, k, v, which are stacked into the matrices Q, K, V. Multiplying Q and K gives the pairwise similarity matrix between words; applying softmax to each row yields each word's weights over all other words. Multiplying by V then gives each word's weighted sum over all other words, which serves as its new vector representation.
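      The Q/K/V description above can be sketched as single-head self-attention in NumPy; the scaling by sqrt(d_k) follows the standard Transformer formulation, and all names here are illustrative:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of word vectors.

    X:          (n_words, d_model) input word embeddings
    Wq, Wk, Wv: (d_model, d_k) projection matrices
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # per-word q, k, v stacked into matrices
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # pairwise similarity between words
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                          # weighted sum of values per word
```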

    1. s is the vector representation of the sentence learned by a sequential LSTM.

      s is the vector representation of the current sentence computed by a sequential LSTM.

    2. Describes the differences between Tree-LSTM and the standard sequential LSTM; introduces two Tree-LSTM units and the settings each is suited to.

    3. Incompatibility of standard LSTM and tree-structured data

      The differences between standard LSTM and Tree-LSTM, and their performance on different tasks.

  4. Apr 2019
    1. Feature maps in different levels generated by pyramid pooling were finally flattened and concatenated to be fed into a fully connected layer for classification.
    1. The construction of our pyramid involves a bottom-up pathway, a top-down pathway, and lateral connections, as introduced in the following
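      The top-down pathway and lateral connections quoted above (from FPN) can be sketched as a single merge step; treating the 1x1 conv as a per-pixel matrix projection and using nearest-neighbor 2x upsampling are simplifying assumptions:

```python
import numpy as np

def upsample2x(x):
    """Nearest-neighbor 2x upsampling of a (C, H, W) feature map."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def top_down_merge(top, lateral_feat, W_lateral):
    """One FPN merge step: upsample the coarser top-down map by 2x and
    add the 1x1-projected bottom-up map (the lateral connection).

    top:          (C, H, W) coarser pyramid level
    lateral_feat: (C_in, 2H, 2W) bottom-up feature map at the finer level
    W_lateral:    (C, C_in) weights of the 1x1 conv (a per-pixel projection)
    """
    lateral = np.einsum('oc,chw->ohw', W_lateral, lateral_feat)
    return upsample2x(top) + lateral
```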
  5. Mar 2019
    1. With a labeled dataset that includes gold programs, we could directly optimize their likelihood. Without gold programs, we instead use an iterative procedure similar to the EM algorithm.

    2. Reinforcement-learning training often converges slowly and easily gets stuck in local optima, because the search space is large and the reward is sparse.

    3. A seq2seq semantic parser plus an executable program; trained with reinforcement learning and weak supervision via iterative maximum-likelihood estimation. Dataset: WEBQUESTIONSSP, trained from question-answer pairs.

    1. For the problem of automatic music transcription, the input time-frequency representation forms the input sequence x

      The time-series input is converted into the input sequence x.

  6. Feb 2019
    1. Two stages: first map the utterance to an un-grounded intermediate representation of predicates (non-terminals) and nouns (terminals), then ground it into a domain-specific logical form. The mapping to the intermediate predicate representation uses a transition system that predicts actions.

    2. Our model falls under the second class of approaches where utterances are first mapped to an intermediate representation containing natural language predicates.

      First map to an intermediate representation, which contains natural-language predicate structures.

    1. First, both the source and target trees are binarized. The encoder is a bottom-up Tree-LSTM that produces the encoding at the root. The decoder maintains a queue of nodes awaiting expansion; each node's h and c are computed from its parent's h and c. In the attention mechanism, the parent's attention is also fed to its children rather than being computed independently.

    2. when the decoder expands a non-terminal, it locates the corresponding sub-tree in the source tree using an attention mechanism, and uses the information of the sub-tree to guide the non-terminal expansion.

      When decoding a non-terminal, attention can locate the associated sub-tree in the source tree.

    3. programming languages have rigorous grammars and are not tolerant to typos and grammatical mistakes

      Unlike natural-language translation, programming languages have strict grammars and do not tolerate typos or grammatical errors.

    4. Translation between programming languages, e.g., C++ -> Python, using a tree-to-tree architecture with an attention mechanism.

    1. control the size of rewritings, this paper only rewrite the common nouns in a sentence

      To limit the amount of rewriting, only common nouns in the sentence are replaced.

    2. ranking function

      A ranking function that takes both rewriting features and semantic-parsing features into account to select the best rewritten sentence and the correct LF.

    3. One is a dictionary-based sentence rewriting algorithm, which can resolve the 1-N mismatch problem by rewriting a word using its explanation in a dictionary. The other is a template-based sentence rewriting algorithm, which can resolve the N-1 mismatch problem by rewriting complicated expressions using paraphrase template pairs.

      For 1-N: a dictionary-based sentence rewriting algorithm that replaces a natural-language word with its dictionary definition. For N-1: a template-based sentence rewriting algorithm that rewrites complex expressions using paraphrase template pairs.

    4. In dictionary-based replacement, only common nouns in the sentence are replaced, each with its most common dictionary definition; definitions longer than 5 words are ignored.
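      A toy sketch of the dictionary-based noun rewriting described above; the dictionary, POS tags, and function names are illustrative assumptions, not the paper's implementation:

```python
def rewrite_with_dictionary(tokens, pos_tags, dictionary, max_def_len=5):
    """Rewrite common nouns using their most common dictionary definition.

    tokens:     list of words in the sentence
    pos_tags:   POS tag per token ('NN' marks a common noun here)
    dictionary: word -> list of definitions, most common first (toy stand-in)
    Definitions longer than max_def_len words are skipped, per the paper's note.
    """
    out = []
    for tok, tag in zip(tokens, pos_tags):
        defs = dictionary.get(tok, [])
        if tag == 'NN' and defs and len(defs[0].split()) <= max_def_len:
            out.extend(defs[0].split())   # replace the noun with its definition
        else:
            out.append(tok)               # keep everything else unchanged
    return out
```

For example, with the paper's "daughter" -> "female child" case, the rewritten sentence exposes the structure the parser needs.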

    5. Identifies two types of lexical mismatch:

      1. 1-N: one natural-language word corresponds to a logical form composed of multiple formulas
      2. N-1: multiple natural-language words correspond to a single token in the logical form
    6. The diversity of natural language means multiple sentences can share the same meaning. A parser can only map an utterance to an LF with the same structure, and struggles with structurally different LFs. The paper's approach is therefore to rewrite the sentence, e.g., "daughter" as "female child", so the converted LF is more accurate.

    1. Applying distillation techniques to multi-class object detection, in contrast to image classification, is challenging for several reasons

      Applying knowledge distillation to object detection poses several challenges:

      1. Detection networks are more sensitive to compression;
      2. Knowledge distillation assumes all classes are equally important, whereas in detection the background class is overwhelmingly dominant.
  8. Jan 2019
    1. Proposes a semantic segmentation branch to better exploit the detection supervision, combined with a self-supervised attention mechanism.

    2. The segmentation branch uses the detection supervision to directly learn the model's face-region features.

    3. The network introduces a feature-fusion pyramid and a segmentation branch.

    1. H

      Two contributions: 1) an Agglomeration Connection module that aggregates multi-scale features, handling the scale variance of face detection; 2) a hierarchical loss to guide training.

    2. The receptive-field size has a major impact on detecting small faces.

    3. The core of the framework: explicitly exploit the CNN's multi-scale features by aggregating higher-level semantic features at different scales as context cues to enhance the lower-level features.

    1. paper presents a Neural Aggregation Network (NAN) for video face recognition. The network takes a face video or face image set of a person with a variable number of face images as its input

      Well explained.

  9. Dec 2018
    1. in our case, with a per-pixel sigmoid and a binary loss, they do not. We show by experiments that this formulation is key for good instance segmentation results

      Classes do not compete with each other.
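      The per-pixel sigmoid + binary loss quoted above (Mask R-CNN's mask head) can be sketched as follows; only the ground-truth class's mask is penalized, so classes never compete as they would under a per-pixel softmax over all classes. Shapes and names are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mask_loss(mask_logits, gt_mask, gt_class):
    """Per-pixel sigmoid + binary cross-entropy on one instance's masks.

    mask_logits: (K, H, W) one predicted mask per class
    gt_mask:     (H, W) binary ground-truth mask
    gt_class:    index of the ground-truth class

    Only the gt class's mask contributes to the loss; the other K-1
    masks are untouched, so there is no inter-class competition.
    """
    p = sigmoid(mask_logits[gt_class])
    eps = 1e-9
    bce = -(gt_mask * np.log(p + eps) + (1 - gt_mask) * np.log(1 - p + eps))
    return bce.mean()
```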


    1. visible and unoccluded in the image. Varol et al. explore predicting a voxel representation of the human body [45]. In this work we go beyond these approaches by proposing a method that can predict shape and pose from a single image, as well as how

      good

    1. difficult: perspective ambiguities make the loss function noisy and training data is scarce. In this paper, we propose a novel approach (Neural Body Fitting (NBF)). It integrates a statistical body model within a CNN, leveraging reliable

      good

  10. Nov 2018
    1. Visual entailment as a visually-grounded language learning task.

    2. Visual Entailment Task for Visually-Grounded Language Learning

      Visual entailment as a visually-grounded language learning task.

    1. Deep convolutional neural networks [21, 23] revolutionized computer vision arguably due to the discovery that feature representations learned on a pre-training task can transfer useful info
    2. Detection from scratch. Before the prevalence of th

      good