Notes on using pytorch-crf
- The pytorch-crf API
- Examples
- Getting started
- Computing log likelihood
- Decoding
The pytorch-crf package provides a PyTorch implementation of a CRF layer. When working on NER tasks, we can simply use this library instead of implementing the layer ourselves.
The pytorch-crf API
class torchcrf.CRF(num_tags, batch_first=False)
This module implements a conditional random field.
- The forward computation of this class computes the log likelihood of the given sequence of tags and emission score tensor.
- This class also has a decode method which finds the best tag sequence given an emission score tensor using the Viterbi algorithm.

Parameters:
- num_tags (int) – Number of tags.
- batch_first (bool) – Whether the first dimension corresponds to the size of a minibatch.

Attributes:
- start_transitions (Parameter) – Start transition score tensor of size (num_tags,).
- end_transitions (Parameter) – End transition score tensor of size (num_tags,).
- transitions (Parameter) – Transition score tensor of size (num_tags, num_tags).
decode(emissions, mask=None): Find the most likely tag sequence using the Viterbi algorithm.
- Parameters:
- emissions (Tensor) – Emission score tensor of size (seq_length, batch_size, num_tags) if batch_first is False, (batch_size, seq_length, num_tags) otherwise.
- mask (ByteTensor) – Mask tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
- Returns: List of lists containing the best tag sequence for each batch.
- Return type: List[List[int]]
forward(emissions, tags, mask=None, reduction='sum'): Compute the conditional log likelihood of a sequence of tags given emission scores.
- Parameters:
- emissions (Tensor) – Emission score tensor of size (seq_length, batch_size, num_tags) if batch_first is False, (batch_size, seq_length, num_tags) otherwise.
- tags (LongTensor) – Sequence of tags tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
- mask (ByteTensor) – Mask tensor of size (seq_length, batch_size) if batch_first is False, (batch_size, seq_length) otherwise.
- reduction (str) – Specifies the reduction to apply to the output: none|sum|mean|token_mean. none: no reduction will be applied. sum: the output will be summed over batches. mean: the output will be averaged over batches. token_mean: the output will be averaged over tokens.
- Returns: The log likelihood. This will have size (batch_size,) if reduction is none, () otherwise.
- Return type: Tensor
reset_parameters(): Initialize the transition parameters. The parameters will be initialized randomly from a uniform distribution between -0.1 and 0.1.
- Return type: None
Examples
Getting started
The CRF class in pytorch-crf inherits from PyTorch's nn.Module and provides an implementation of a CRF layer.
>>> import torch
>>> from torchcrf import CRF
>>> num_tags = 5 # number of tags is 5
>>> model = CRF(num_tags)
Computing log likelihood
Once the CRF instance is created, we can compute the log likelihood of a tag sequence given the emission scores.
>>> seq_length = 3 # maximum sequence length in a batch
>>> batch_size = 2 # number of samples in the batch
>>> emissions = torch.randn(seq_length, batch_size, num_tags)
>>> tags = torch.tensor([
... [0, 1], [2, 4], [3, 1]
... ], dtype=torch.long) # (seq_length, batch_size)
>>> model(emissions, tags)
tensor(-12.7431, grad_fn=<SumBackward0>)
If your input tensors contain padding, you can pass in a mask tensor.
>>> # mask size is (seq_length, batch_size)
>>> # the last sample has length of 1
>>> mask = torch.tensor([
... [1, 1], [1, 1], [1, 0]
... ], dtype=torch.uint8)
>>> model(emissions, tags, mask=mask)
tensor(-10.8390, grad_fn=<SumBackward0>)
Note that the returned value is the log likelihood, so you need to negate it when using it as a loss. By default, the log likelihood is summed over the batch. For other options, see the API documentation of CRF.forward.
Decoding
To obtain the most likely tag sequence for a sentence, use the CRF.decode method.
>>> model.decode(emissions)
[[3, 1, 3], [0, 1, 0]]
This method also accepts a mask tensor; see the API documentation of CRF.decode for details.