Notes on Using pytorch-crf


Table of Contents
  • The pytorch-crf API
  • Examples
    • Getting started
    • Computing log likelihood
    • Decoding

The pytorch-crf package provides a PyTorch implementation of a CRF layer. When working on NER tasks, we can use this library directly instead of implementing a CRF ourselves.

The pytorch-crf API

class torchcrf.CRF(num_tags, batch_first=False)

This module implements a conditional random field.

  • The forward computation of this class computes the log likelihood of the given sequence of tags and emission score tensor.
  • This class also has a decode method, which finds the best tag sequence given an emission score tensor using the Viterbi algorithm.

  • Parameters:

    • num_tags (int) – Number of tags.
    • batch_first (bool) – Whether the first dimension corresponds to the size of a minibatch.
  • start_transitions: Start transition score tensor of size (num_tags,).
    Type: Parameter

  • end_transitions: End transition score tensor of size (num_tags,).
    Type: Parameter

  • transitions: Transition score tensor of size (num_tags, num_tags).
    Type: Parameter

  • decode(emissions, mask=None): Find the most likely tag sequence using the Viterbi algorithm.

    • Parameters:
      • emissions (Tensor) – Emission score tensor of size (seq_length, batch_size, num_tags) if batch_first is False, (batch_size, seq_length, num_tags) otherwise.
      • mask (ByteTensor) – Mask tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
    • Return type: List[List[int]]
    • Returns: List of lists, each containing the best tag sequence for the corresponding sample in the batch.
  • forward(emissions, tags, mask=None, reduction='sum'): Compute the conditional log likelihood of a sequence of tags given emission scores.

    • Parameters:
      • emissions (Tensor) – Emission score tensor of size (seq_length, batch_size, num_tags) if batch_first is False, (batch_size, seq_length, num_tags) otherwise.
      • tags (LongTensor) – Sequence of tags tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
      • mask (ByteTensor) – Mask tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
      • reduction (str) – Specifies the reduction to apply to the output: none|sum|mean|token_mean. none: no reduction will be applied. sum: the output will be summed over batches. mean: the output will be averaged over batches. token_mean: the output will be averaged over tokens.
    • Returns: The log likelihood. This will have size (batch_size,) if reduction is none, () otherwise.
    • Return type: Tensor
  • reset_parameters(): Initialize the transition parameters. The parameters will be initialized randomly from a uniform distribution between -0.1 and 0.1.

    • Return type: None

Examples

Getting started

The CRF class in pytorch-crf inherits from PyTorch's nn.Module and provides an implementation of a CRF layer.

>>> import torch
>>> from torchcrf import CRF
>>> num_tags = 5  # number of tags is 5
>>> model = CRF(num_tags)

Computing log likelihood

Once the CRF instance has been created, we can compute the log likelihood of a tag sequence given the emission scores.

>>> seq_length = 3  # maximum sequence length in a batch
>>> batch_size = 2  # number of samples in the batch
>>> emissions = torch.randn(seq_length, batch_size, num_tags)
>>> tags = torch.tensor([
...   [0, 1], [2, 4], [3, 1]
... ], dtype=torch.long)  # (seq_length, batch_size)
>>> model(emissions, tags)
tensor(-12.7431, grad_fn=<SumBackward0>)

If your input tensors contain padding, you can pass in a mask tensor.

>>> # mask size is (seq_length, batch_size)
>>> # the last sample has length of 1
>>> mask = torch.tensor([
...   [1, 1], [1, 1], [1, 0]
... ], dtype=torch.uint8)
>>> model(emissions, tags, mask=mask)
tensor(-10.8390, grad_fn=<SumBackward0>)

Note that the returned value is the log likelihood, so you need to negate it when using it as a loss function. By default, the log likelihood is summed over the batch. For other options, see the API documentation of CRF.forward.

Decoding

To obtain the most likely tag sequence for a sentence, use the CRF.decode method.

>>> model.decode(emissions)
[[3, 1, 3], [0, 1, 0]]

This method also accepts a mask tensor; see the documentation of CRF.decode for details.
