fastNLP.models.bert

fastNLP provides model code that applies BERT to five downstream tasks; these models can be used directly. The five tasks and their models are listed in the table below.

Every model must be given a parameter named embed of type fastNLP.embeddings.BertEmbedding. This parameter wraps fastNLP.modules.encoder.BertModel and serves as the encoder of the downstream model.

In addition, each model takes one integer argument, whose meaning differs across the downstream tasks:

Downstream task model          Parameter     Meaning
BertForSequenceClassification  num_labels    Number of text classification classes; default 2
BertForSentenceMatching        num_labels    Number of classes in the matching task; default 2
BertForMultipleChoice          num_choices   Number of options in the multiple-choice task; default 2
BertForTokenClassification     num_labels    Number of sequence labeling tags; no default
BertForQuestionAnswering       num_labels    Number of output columns for extractive QA; default 2 (column 0 is start_span, column 1 is end_span)

Finally, a dropout probability can also be passed in; the default is 0.1.
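For orientation, here is a minimal sketch of how the shared embed argument is built; the class-specific examples below continue from it. It assumes the standard fastNLP Vocabulary / Const API and the 'en-base-uncased' pretrained name (downloaded on first use; substitute a local directory if you already have the weights). The to_words helper is defined here purely for these examples.

    import torch
    from fastNLP import Vocabulary, Const
    from fastNLP.embeddings import BertEmbedding

    # Toy vocabulary; in a real project this is built from your DataSet.
    vocab = Vocabulary()
    vocab.add_word_lst("this is a very short example sentence .".split())

    # embed wraps fastNLP.modules.encoder.BertModel and is the encoder
    # passed to every downstream model on this page.
    embed = BertEmbedding(vocab, model_dir_or_name='en-base-uncased')

    # Helper used only in the examples below: turn a whitespace-tokenized
    # sentence into a [1, seq_len] LongTensor of word indices.
    def to_words(sent):
        return torch.LongTensor([[vocab.to_index(w) for w in sent.split()]])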

class fastNLP.models.bert.BertForSequenceClassification(embed: fastNLP.embeddings.bert_embedding.BertEmbedding, num_labels: int = 2, dropout=0.1)[source]

Bases: fastNLP.models.BaseModel

Aliases: fastNLP.models.BertForSequenceClassification, fastNLP.models.bert.BertForSequenceClassification

BERT model for sequence classification.
__init__(embed: fastNLP.embeddings.bert_embedding.BertEmbedding, num_labels: int = 2, dropout=0.1)[source]
Parameters:
  • embed (fastNLP.embeddings.BertEmbedding) -- the encoder of the downstream model.
  • num_labels (int) -- number of text classification classes; default 2.
  • dropout (float) -- dropout probability; default 0.1.
forward(words)[source]
Parameters: words (torch.LongTensor) -- [batch_size, seq_len]
Returns: { fastNLP.Const.OUTPUT : logits }: torch.Tensor [batch_size, num_labels]
predict(words)[source]
Parameters: words (torch.LongTensor) -- [batch_size, seq_len]
Returns: { fastNLP.Const.OUTPUT : preds }: torch.LongTensor [batch_size], the predicted label index of each sample
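Continuing from the setup sketch above, a minimal illustrative usage example; the shapes follow the forward/predict documentation:

    from fastNLP.models import BertForSequenceClassification

    model = BertForSequenceClassification(embed, num_labels=2, dropout=0.1)

    words = to_words("this is a short example")     # [1, seq_len]
    logits = model(words)[Const.OUTPUT]             # torch.Tensor [1, num_labels]
    pred = model.predict(words)[Const.OUTPUT]       # torch.LongTensor [1]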
class fastNLP.models.bert.BertForSentenceMatching(embed: fastNLP.embeddings.bert_embedding.BertEmbedding, num_labels: int = 2, dropout=0.1)[source]

Bases: fastNLP.models.BaseModel

Aliases: fastNLP.models.BertForSentenceMatching, fastNLP.models.bert.BertForSentenceMatching

BERT model for sentence matching.
__init__(embed: fastNLP.embeddings.bert_embedding.BertEmbedding, num_labels: int = 2, dropout=0.1)[source]
Parameters:
  • embed (fastNLP.embeddings.BertEmbedding) -- the encoder of the downstream model.
  • num_labels (int) -- number of classes in the matching task; default 2.
  • dropout (float) -- dropout probability; default 0.1.
forward(words)[source]
Parameters: words (torch.LongTensor) -- [batch_size, seq_len]
Returns: { fastNLP.Const.OUTPUT : logits }: torch.Tensor [batch_size, num_labels]
predict(words)[source]
Parameters: words (torch.LongTensor) -- [batch_size, seq_len]
Returns: { fastNLP.Const.OUTPUT : preds }: torch.LongTensor [batch_size], the predicted label index of each sample
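A sketch continuing from the setup above. forward takes a single [batch_size, seq_len] tensor, so the two sentences are assumed to have already been joined into one token sequence (normally by the matching data pipeline); the ad-hoc joining below is only illustrative.

    from fastNLP.models import BertForSentenceMatching

    model = BertForSentenceMatching(embed, num_labels=2)

    # The sentence pair is fed as one concatenated token sequence.
    words = to_words("this is a example " + "this is a sentence")  # [1, seq_len]
    logits = model(words)[Const.OUTPUT]                            # [1, num_labels]
    pred = model.predict(words)[Const.OUTPUT]                      # [1]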
class fastNLP.models.bert.BertForMultipleChoice(embed: fastNLP.embeddings.bert_embedding.BertEmbedding, num_choices=2, dropout=0.1)[source]

Bases: fastNLP.models.BaseModel

Aliases: fastNLP.models.BertForMultipleChoice, fastNLP.models.bert.BertForMultipleChoice

BERT model for multiple choice.
__init__(embed: fastNLP.embeddings.bert_embedding.BertEmbedding, num_choices=2, dropout=0.1)[source]
Parameters:
  • embed (fastNLP.embeddings.BertEmbedding) -- the encoder of the downstream model.
  • num_choices (int) -- number of options in the multiple-choice task; default 2.
  • dropout (float) -- dropout probability; default 0.1.
forward(words)[source]
Parameters: words (torch.LongTensor) -- [batch_size, num_choices, seq_len]
Returns: { fastNLP.Const.OUTPUT : logits }: torch.Tensor [batch_size, num_choices]
predict(words)[source]
Parameters: words (torch.LongTensor) -- [batch_size, num_choices, seq_len]
Returns: { fastNLP.Const.OUTPUT : preds }: torch.LongTensor [batch_size], the index of the predicted option for each sample
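Multiple choice is the one model whose input is three-dimensional: one token sequence per candidate option. A sketch continuing from the setup above:

    from fastNLP.models import BertForMultipleChoice

    model = BertForMultipleChoice(embed, num_choices=2)

    # words: [batch_size, num_choices, seq_len] -- one sequence per option.
    option_a = to_words("this is a short example")      # [1, seq_len]
    option_b = to_words("this is a short sentence")     # [1, seq_len]
    words = torch.stack([option_a, option_b], dim=1)    # [1, 2, seq_len]
    logits = model(words)[Const.OUTPUT]                 # [1, num_choices]
    pred = model.predict(words)[Const.OUTPUT]           # [1], index of the chosen option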
class fastNLP.models.bert.BertForTokenClassification(embed: fastNLP.embeddings.bert_embedding.BertEmbedding, num_labels, dropout=0.1)[source]

Bases: fastNLP.models.BaseModel

Aliases: fastNLP.models.BertForTokenClassification, fastNLP.models.bert.BertForTokenClassification

BERT model for token classification.
__init__(embed: fastNLP.embeddings.bert_embedding.BertEmbedding, num_labels, dropout=0.1)[source]
Parameters:
  • embed (fastNLP.embeddings.BertEmbedding) -- the encoder of the downstream model.
  • num_labels (int) -- number of sequence labeling tags; no default.
  • dropout (float) -- dropout probability; default 0.1.
forward(words)[source]
Parameters: words (torch.LongTensor) -- [batch_size, seq_len]
Returns: { fastNLP.Const.OUTPUT : logits }: torch.Tensor [batch_size, seq_len, num_labels]
predict(words)[source]
Parameters: words (torch.LongTensor) -- [batch_size, seq_len]
Returns: { fastNLP.Const.OUTPUT : preds }: torch.LongTensor [batch_size, seq_len], one predicted tag index per token
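Token classification predicts one tag per token, so num_labels (the size of the tag set) must always be given. A sketch continuing from the setup above; the value 4 is purely illustrative:

    from fastNLP.models import BertForTokenClassification

    model = BertForTokenClassification(embed, num_labels=4)  # e.g. a small BIO tag set

    words = to_words("this is a short example")    # [1, seq_len]
    logits = model(words)[Const.OUTPUT]            # [1, seq_len, num_labels]
    pred = model.predict(words)[Const.OUTPUT]      # [1, seq_len], one tag index per token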
class fastNLP.models.bert.BertForQuestionAnswering(embed: fastNLP.embeddings.bert_embedding.BertEmbedding)[source]

Bases: fastNLP.models.BaseModel

Aliases: fastNLP.models.BertForQuestionAnswering, fastNLP.models.bert.BertForQuestionAnswering

BERT model for extractive question answering (Q&A). For SQuAD 2.0, set include_cls_sep of the BertEmbedding to True; for SQuAD 1.0 or CMRC, set it to False.
__init__(embed: fastNLP.embeddings.bert_embedding.BertEmbedding)[source]
Parameters:
  • embed (fastNLP.embeddings.BertEmbedding) -- the encoder of the downstream model.
  • num_labels (int) -- number of output columns for extractive QA; default 2 (column 0 is start_span, column 1 is end_span).
forward(words)[source]
Parameters: words (torch.LongTensor) -- [batch_size, seq_len]
Returns: a dict containing num_labels logit tensors, each of shape [batch_size, seq_len + 2]
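A sketch continuing from the vocabulary above. As noted in the class description, include_cls_sep on the BertEmbedding should be True for SQuAD 2.0 and False for SQuAD 1.0 / CMRC. The keys of the returned dict are not spelled out in this documentation and may differ across fastNLP versions, so the sketch simply inspects them:

    from fastNLP.models import BertForQuestionAnswering

    # SQuAD 2.0 style: include_cls_sep=True (see the class description above).
    qa_embed = BertEmbedding(vocab, model_dir_or_name='en-base-uncased', include_cls_sep=True)
    model = BertForQuestionAnswering(qa_embed)

    # The question and passage are encoded together as one [batch_size, seq_len] sequence.
    words = to_words("is this a sentence " + "this is a very short sentence")
    output = model(words)
    # A dict of span logits (start/end position scores); per the documentation
    # above, each tensor has shape [batch_size, seq_len + 2].
    print(list(output.keys()))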