Translation (56): Gradient Clipping in PyTorch


If you spot any problems with the translation, please point them out in the comments. Thanks.

How do you do gradient clipping in PyTorch?

  • Gulzar asked:

    • What is the correct way to perform gradient clipping in PyTorch?
    • I have an exploding gradients problem.
  • Answers:

    • Rahul – vote: 143

    • For a more complete example, see here

    • optimizer.zero_grad()        
      loss, hidden = model(data, hidden, targets)
      loss.backward()
      
      # Rescale the gradients in place so that their total norm is at most args.clip
      torch.nn.utils.clip_grad_norm_(model.parameters(), args.clip)
      optimizer.step()
    • Charles Xu – vote: 0

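    • I ran into the same error. I tried clipping the gradient norm, but it didn't work: the result was still NaN.
      Translator's note: in the comments, the answerer explained that "doesn't work" means "still gives a 'nan'".

    • I didn't want to change the network or add regularizers, so I switched the optimizer to Adam, and that solved the problem.

    • Then I used the model pretrained with Adam to initialize training and fine-tuned it with SGD and momentum.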

    • hkchengrex – vote: 3

    • If you are using Automatic Mixed Precision (AMP), a few extra steps are needed before clipping:

    • optimizer.zero_grad()
      loss, hidden = model(data, hidden, targets)
      # scaler is a GradScaler created once, before the training loop
      scaler.scale(loss).backward()
      
      # Unscales the gradients of optimizer's assigned params in-place
      scaler.unscale_(optimizer)
      
      # Since the gradients of optimizer's assigned params are unscaled, clips as usual:
      torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
      
      # optimizer's gradients are already unscaled, so scaler.step does not unscale them,
      # although it still skips optimizer.step() if the gradients contain infs or NaNs.
      scaler.step(optimizer)
      
      # Updates the scale for next iteration.
      scaler.update()
    • Reference: https://pytorch.org/docs/stable/notes/amp_examples.html#gradient-clipping
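The first answer's snippet depends on a model, data and `args.clip` that are not shown. Below is a minimal runnable sketch of the same step, assuming a toy `nn.Linear` model, random data and `max_norm = 1.0` (all hypothetical stand-ins, not from the answers). It also illustrates one possible reason clipping can "still give NaN", which the second answer does not confirm: clipping only rescales gradients, so a gradient that is already NaN stays NaN.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical stand-ins for the model, data and args.clip used in the answers.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
max_norm = 1.0

# Very large targets produce very large gradients, imitating an exploding step.
inputs = torch.randn(32, 10)
targets = 1000.0 * torch.randn(32, 1)

optimizer.zero_grad()
loss = nn.functional.mse_loss(model(inputs), targets)
loss.backward()

# clip_grad_norm_ rescales all gradients in place and returns the total norm
# measured before clipping.
norm_before = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)

# Recompute the global L2 norm over all parameters to see the effect.
norm_after = torch.norm(
    torch.stack([p.grad.norm(2) for p in model.parameters()]), 2
)
optimizer.step()
print(f"norm before: {norm_before.item():.2f}, after: {norm_after.item():.4f}")

# Clipping cannot repair non-finite gradients: a NaN entry stays NaN.
bad = nn.Parameter(torch.zeros(3))
bad.grad = torch.tensor([float("nan"), 1.0, 2.0])
bad_norm = torch.nn.utils.clip_grad_norm_([bad], max_norm)
print("norm is NaN:", torch.isnan(bad_norm).item())
print("grad has NaN:", torch.isnan(bad.grad).any().item())
```

If the returned norm is not finite, the problem lies upstream of clipping (for example in the loss or the inputs). Recent PyTorch versions also accept `error_if_nonfinite=True` in `clip_grad_norm_`, which raises an error in that case instead of passing the NaN along silently.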


How to do gradient clipping in pytorch?

  • Gulzar asked:

    • What is the correct way to perform gradient clipping in PyTorch?
    • I have an exploding gradients problem.
  • Answers:

    • Rahul – vote: 143

    • A more complete example from here:

    • optimizer.zero_grad()        
      loss, hidden = model(data, hidden, targets)
      loss.backward()
      
      # Rescale the gradients in place so that their total norm is at most args.clip
      torch.nn.utils.clip_grad_norm_(model.parameters(), args.clip)
      optimizer.step()
    • Charles Xu – vote: 0

    • Well, I ran into the same error. I tried to use clip norm, but it doesn't work.
      Translator's note: in the comments, the answerer explained that "doesn't work" means "still gives a 'nan'".

    • I didn't want to change the network or add regularizers, so I changed the optimizer to Adam, and it works.

    • Then I used the model pretrained with Adam to initialize the training and used SGD + momentum for fine-tuning. It is now working.

    • hkchengrex – vote: 3

    • And if you are using Automatic Mixed Precision (AMP), you need to do a bit more before clipping:

    • optimizer.zero_grad()
      loss, hidden = model(data, hidden, targets)
      # scaler is a GradScaler created once, before the training loop
      scaler.scale(loss).backward()
      
      # Unscales the gradients of optimizer's assigned params in-place
      scaler.unscale_(optimizer)
      
      # Since the gradients of optimizer's assigned params are unscaled, clips as usual:
      torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)
      
      # optimizer's gradients are already unscaled, so scaler.step does not unscale them,
      # although it still skips optimizer.step() if the gradients contain infs or NaNs.
      scaler.step(optimizer)
      
      # Updates the scale for next iteration.
      scaler.update()
    • Reference: https://pytorch.org/docs/stable/notes/amp_examples.html#gradient-clipping
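The AMP answer leaves out how `scaler`, the model and the autocast context are set up. The sketch below fills those in under stated assumptions: a toy `nn.Linear` model, random data, and `max_norm = 1.0` are all hypothetical, and AMP is enabled only when CUDA is available so that the same loop also runs on a CPU-only machine (where the scaler calls become pass-throughs). It is an illustration of the unscale-then-clip order, not the answerer's full training code.

```python
import torch
from torch import nn

# Hypothetical setup; AMP is switched on only if a CUDA device is present.
use_amp = torch.cuda.is_available()
device = "cuda" if use_amp else "cpu"

model = nn.Linear(10, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
# Created once, outside the training loop.
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
max_norm = 1.0

inputs = torch.randn(32, 10, device=device)
targets = torch.randn(32, 1, device=device)

for step in range(3):
    optimizer.zero_grad()

    # Run the forward pass under autocast so eligible ops use reduced precision.
    with torch.autocast(device_type=device, enabled=use_amp):
        loss = nn.functional.mse_loss(model(inputs), targets)

    # Backward on the scaled loss, so the gradients are scaled as well.
    scaler.scale(loss).backward()

    # Unscale first; otherwise max_norm would be compared against scaled gradients.
    scaler.unscale_(optimizer)
    total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm)

    # step() skips the update if the gradients contain infs or NaNs.
    scaler.step(optimizer)
    scaler.update()
```

The order matters: clipping before `unscale_` would apply the threshold to gradients multiplied by the loss scale, so the effective limit would change whenever the scale changes. Newer PyTorch releases may print a deprecation warning for `torch.cuda.amp.GradScaler` and suggest `torch.amp.GradScaler("cuda", ...)` instead; the calls inside the loop stay the same.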
