Notes on Using pytorch-crf


Table of Contents
  • The pytorch-crf API
  • Examples
    • Getting started
    • Computing log likelihood
    • Decoding

The pytorch-crf package provides a PyTorch implementation of a CRF layer. When working on NER tasks, we can use this library directly instead of implementing a CRF ourselves.

The pytorch-crf API

class torchcrf.CRF(num_tags, batch_first=False)

This module implements a conditional random field.

  • The forward computation of this class computes the log likelihood of the given sequence of tags and emission score tensor.
  • This class also has a decode method, which finds the best tag sequence given an emission score tensor using the Viterbi algorithm.

  • Parameters:

    • num_tags (int) – Number of tags.
    • batch_first (bool) – Whether the first dimension corresponds to the size of a minibatch.
  • start_transitions: Start transition score tensor of size (num_tags,).
    Type: Parameter

  • end_transitions: End transition score tensor of size (num_tags,).
    Type: Parameter

  • transitions: Transition score tensor of size (num_tags, num_tags).
    Type: Parameter

  • decode(emissions, mask=None): Find the most likely tag sequence using the Viterbi algorithm.

    • Parameters:
      • emissions (Tensor) – Emission score tensor of size (seq_length, batch_size, num_tags) if batch_first is False, (batch_size, seq_length, num_tags) otherwise.
      • mask (ByteTensor) – Mask tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
    • Return type: List[List[int]]
    • Returns: List of lists, each containing the best tag sequence for the corresponding sample in the batch.
  • forward(emissions, tags, mask=None, reduction='sum'): Compute the conditional log likelihood of a sequence of tags given emission scores.

    • Parameters:
      • emissions (Tensor) – Emission score tensor of size (seq_length, batch_size, num_tags) if batch_first is False, (batch_size, seq_length, num_tags) otherwise.
      • tags (LongTensor) – Sequence of tags tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
      • mask (ByteTensor) – Mask tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
      • reduction (str) – Specifies the reduction to apply to the output: none|sum|mean|token_mean. none: no reduction will be applied. sum: the output will be summed over batches. mean: the output will be averaged over batches. token_mean: the output will be averaged over tokens.
    • Returns: The log likelihood. This will have size (batch_size,) if reduction is none, () otherwise.
    • Return type: Tensor
  • reset_parameters(): Initialize the transition parameters. The parameters will be initialized randomly from a uniform distribution between -0.1 and 0.1.

    • Return type: None

Examples

Getting started

The CRF class in pytorch-crf inherits from PyTorch's nn.Module and provides an implementation of a CRF layer.

>>> import torch
>>> from torchcrf import CRF
>>> num_tags = 5  # number of tags is 5
>>> model = CRF(num_tags)

Computing log likelihood

Once the CRF instance has been created, we can compute the log likelihood of a tag sequence given the emission scores.

>>> seq_length = 3  # maximum sequence length in a batch
>>> batch_size = 2  # number of samples in the batch
>>> emissions = torch.randn(seq_length, batch_size, num_tags)
>>> tags = torch.tensor([
...   [0, 1], [2, 4], [3, 1]
... ], dtype=torch.long)  # (seq_length, batch_size)
>>> model(emissions, tags)
tensor(-12.7431, grad_fn=<SumBackward0>)

If your input tensors contain padding, you can pass in a mask tensor.

>>> # mask size is (seq_length, batch_size)
>>> # the last sample has length of 1
>>> mask = torch.tensor([
...   [1, 1], [1, 1], [1, 0]
... ], dtype=torch.uint8)
>>> model(emissions, tags, mask=mask)
tensor(-10.8390, grad_fn=<SumBackward0>)

Note that the returned value is the log likelihood, so you need to negate it when using it as a loss function. By default, the log likelihood is summed over the batch. For other options, see the API documentation of CRF.forward.

Decoding

To obtain the most likely tag sequence for a sentence, use the CRF.decode method.

>>> model.decode(emissions)
[[3, 1, 3], [0, 1, 0]]

This method also accepts a mask tensor; see the documentation of CRF.decode for details.
