import functools
import types
import typing
import warnings
from typing import cast, Optional, Union

from typing_extensions import deprecated

import torch
from torch import Tensor
from torch.utils._foreach_utils import (
    _device_has_foreach_support,
    _group_tensors_by_device_and_dtype,
    _has_foreach_support,
)


__all__: list[str] = []


_tensor_or_tensors = Union[torch.Tensor, typing.Iterable[torch.Tensor]]


def _no_grad(func):
    """
    This wrapper is needed to avoid a circular import when using @torch.no_grad on the exposed functions
    clip_grad_norm_ and clip_grad_value_ themselves.
    """

    def _no_grad_wrapper(*args, **kwargs):
        with torch.no_grad():
            return func(*args, **kwargs)

    functools.update_wrapper(_no_grad_wrapper, func)
    return _no_grad_wrapper


@_no_grad
def _get_total_norm(
    tensors: _tensor_or_tensors,
    norm_type: float = 2.0,
    error_if_nonfinite: bool = False,
    foreach: Optional[bool] = None,
) -> torch.Tensor:
    r"""Compute the norm of an iterable of tensors.

    The norm is computed over the norms of the individual tensors, as if the norms of
    the individual tensors were concatenated into a single vector.

    Args:
        tensors (Iterable[Tensor] or Tensor): an iterable of Tensors or a
            single Tensor that will be normalized
        norm_type (float): type of the used p-norm. Can be ``'inf'`` for
            infinity norm.
        error_if_nonfinite (bool): if True, an error is thrown if the total
            norm of :attr:`tensors` is ``nan``, ``inf``, or ``-inf``.
            Default: ``False``
        foreach (bool): use the faster foreach-based implementation.
            If ``None``, use the foreach implementation for CUDA and CPU native tensors and silently
            fall back to the slow implementation for other device types.
            Default: ``None``

    Returns:
        Total norm of the tensors (viewed as a single vector).
    """
    if isinstance(tensors, torch.Tensor):
        tensors = [tensors]
    else:
        tensors = list(tensors)
    norm_type = float(norm_type)
    if len(tensors) == 0:
        return torch.tensor(0.0)
    first_device = tensors[0].device
    grouped_tensors = _group_tensors_by_device_and_dtype([tensors])

    norms: list[Tensor] = []
    for (device, _), ([device_tensors], _) in grouped_tensors.items():
        if (foreach is None and _has_foreach_support(device_tensors, device)) or (
            foreach and _device_has_foreach_support(device)
        ):
            norms.extend(torch._foreach_norm(device_tensors, norm_type))
        elif foreach:
            raise RuntimeError(
                f"foreach=True was passed, but can't use the foreach API on {device.type} tensors"
            )
        else:
            norms.extend(
                [torch.linalg.vector_norm(g, norm_type) for g in device_tensors]
            )

    # Reduce the per-device norms to a single scalar norm on the first device.
    total_norm = torch.linalg.vector_norm(
        torch.stack([norm.to(first_device) for norm in norms]), norm_type
    )

    if error_if_nonfinite and torch.logical_or(total_norm.isnan(), total_norm.isinf()):
        raise RuntimeError(
            f"The total norm of order {norm_type} for gradients from "
            "`parameters` is non-finite, so it cannot be clipped. To disable "
            "this error and scale the gradients by the non-finite norm anyway, "
            "set `error_if_nonfinite=False`"
        )
    return total_norm

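
# An illustrative sketch, not part of the original module: it shows how the
# total gradient norm of a model can be inspected with ``_get_total_norm``
# before any clipping is applied. The toy ``torch.nn.Linear`` model, the random
# batch, and the helper name ``_example_total_grad_norm`` are assumptions made
# for this example only.
def _example_total_grad_norm() -> Tensor:
    model = torch.nn.Linear(4, 2)
    model(torch.randn(8, 4)).pow(2).sum().backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    # Equivalent to the 2-norm of all gradients concatenated into one vector.
    return _get_total_norm(grads, norm_type=2.0)
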

@_no_grad
def _clip_grads_with_norm_(
    parameters: _tensor_or_tensors,
    max_norm: float,
    total_norm: torch.Tensor,
    foreach: Optional[bool] = None,
) -> None:
    r"""Scale the gradients of an iterable of parameters given a pre-calculated total norm and desired max norm.

    The gradients will be scaled by the following calculation

    .. math::
        grad = grad * \frac{max\_norm}{total\_norm + 1e-6}

    Gradients are modified in-place.

    This function is equivalent to :func:`torch.nn.utils.clip_grad_norm_` with a pre-calculated
    total norm.

    Args:
        parameters (Iterable[Tensor] or Tensor): an iterable of Tensors or a
            single Tensor that will have gradients normalized
        max_norm (float): max norm of the gradients
        total_norm (Tensor): total norm of the gradients to use for clipping
        foreach (bool): use the faster foreach-based implementation.
            If ``None``, use the foreach implementation for CUDA and CPU native tensors and silently
            fall back to the slow implementation for other device types.
            Default: ``None``

    Returns:
        None
    """
    if isinstance(parameters, torch.Tensor):
        parameters = [parameters]
    grads = [p.grad for p in parameters if p.grad is not None]
    max_norm = float(max_norm)
    if len(grads) == 0:
        return
    grouped_grads = _group_tensors_by_device_and_dtype([grads])

    clip_coef = max_norm / (total_norm + 1e-6)
    # Multiplying by the clamped coefficient is a no-op when it equals 1, but it
    # avoids a data-dependent `if clip_coef < 1:` branch, which could force a
    # CPU <-> device synchronization when the gradients live on an accelerator.
    clip_coef_clamped = torch.clamp(clip_coef, max=1.0)
    for (device, _), ([device_grads], _) in grouped_grads.items():
        if (foreach is None and _has_foreach_support(device_grads, device)) or (
            foreach and _device_has_foreach_support(device)
        ):
            torch._foreach_mul_(device_grads, clip_coef_clamped.to(device))
        elif foreach:
            raise RuntimeError(
                f"foreach=True was passed, but can't use the foreach API on {device.type} tensors"
            )
        else:
            clip_coef_clamped_device = clip_coef_clamped.to(device)
            for g in device_grads:
                g.mul_(clip_coef_clamped_device)

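
# An illustrative sketch, not part of the original module: the two helpers
# above compose into the same behaviour as ``clip_grad_norm_`` below, which is
# handy when the total norm should also be logged or reused. The toy model, the
# 1.0 max-norm, and the helper name ``_example_manual_clip`` are assumptions
# made for this example only.
def _example_manual_clip() -> Tensor:
    model = torch.nn.Linear(4, 2)
    model(torch.randn(8, 4)).pow(2).sum().backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    total_norm = _get_total_norm(grads, norm_type=2.0)
    # Scales every gradient in-place by min(1, max_norm / (total_norm + 1e-6)).
    _clip_grads_with_norm_(
        list(model.parameters()), max_norm=1.0, total_norm=total_norm
    )
    return total_norm
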

@_no_grad
def clip_grad_norm_(
    parameters: _tensor_or_tensors,
    max_norm: float,
    norm_type: float = 2.0,
    error_if_nonfinite: bool = False,
    foreach: Optional[bool] = None,
) -> torch.Tensor:
    r"""Clip the gradient norm of an iterable of parameters.

    The norm is computed over the norms of the individual gradients of all parameters,
    as if the norms of the individual gradients were concatenated into a single vector.
    Gradients are modified in-place.

    This function is equivalent to :func:`torch.nn.utils.get_total_norm` followed by
    :func:`torch.nn.utils.clip_grads_with_norm_` with the ``total_norm`` returned by ``get_total_norm``.

    Args:
        parameters (Iterable[Tensor] or Tensor): an iterable of Tensors or a
            single Tensor that will have gradients normalized
        max_norm (float): max norm of the gradients
        norm_type (float, optional): type of the used p-norm. Can be ``'inf'`` for
            infinity norm. Default: 2.0
        error_if_nonfinite (bool, optional): if True, an error is thrown if the total
            norm of the gradients from :attr:`parameters` is ``nan``,
            ``inf``, or ``-inf``. Default: False
        foreach (bool, optional): use the faster foreach-based implementation.
            If ``None``, use the foreach implementation for CUDA and CPU native tensors and silently
            fall back to the slow implementation for other device types.
            Default: ``None``

    Returns:
        Total norm of the parameter gradients (viewed as a single vector).
    """
    if isinstance(parameters, torch.Tensor):
        parameters = [parameters]
    else:
        # Materialize generators so the parameters can be traversed twice.
        is_generator = isinstance(parameters, types.GeneratorType)
        parameters = list(parameters)
        if is_generator and len(parameters) == 0:
            warnings.warn(
                "`parameters` is an empty generator, no gradient clipping will occur.",
                stacklevel=3,
            )
    grads = [p.grad for p in parameters if p.grad is not None]
    total_norm = _get_total_norm(grads, norm_type, error_if_nonfinite, foreach)
    _clip_grads_with_norm_(parameters, max_norm, total_norm, foreach)
    return total_norm

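
# An illustrative sketch, not part of the original module: typical use of
# ``clip_grad_norm_`` between the backward pass and the optimizer step. The toy
# model, optimizer, data, and the helper name ``_example_training_step`` are
# assumptions made for this example only.
def _example_training_step() -> None:
    model = torch.nn.Linear(4, 2)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    optimizer.zero_grad()
    model(torch.randn(8, 4)).pow(2).sum().backward()
    # Rescales all gradients in-place so their total 2-norm is at most 1.0.
    clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
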

@deprecated(
    "`torch.nn.utils.clip_grad_norm` is now deprecated "
    "in favor of `torch.nn.utils.clip_grad_norm_`.",
    category=FutureWarning,
)
def clip_grad_norm(
    parameters: _tensor_or_tensors,
    max_norm: float,
    norm_type: float = 2.0,
    error_if_nonfinite: bool = False,
    foreach: Optional[bool] = None,
) -> torch.Tensor:
    r"""Clip the gradient norm of an iterable of parameters.

    .. warning::
        This method is now deprecated in favor of
        :func:`torch.nn.utils.clip_grad_norm_`.
    """
    return clip_grad_norm_(parameters, max_norm, norm_type, error_if_nonfinite, foreach)


@_no_grad
def clip_grad_value_(
    parameters: _tensor_or_tensors,
    clip_value: float,
    foreach: Optional[bool] = None,
) -> None:
    r"""Clip the gradients of an iterable of parameters at specified value.

    Gradients are modified in-place.

    Args:
        parameters (Iterable[Tensor] or Tensor): an iterable of Tensors or a
            single Tensor that will have gradients normalized
        clip_value (float): maximum allowed value of the gradients.
            The gradients are clipped in the range
            :math:`\left[\text{-clip\_value}, \text{clip\_value}\right]`
        foreach (bool, optional): use the faster foreach-based implementation.
            If ``None``, use the foreach implementation for CUDA and CPU native tensors and
            silently fall back to the slow implementation for other device types.
            Default: ``None``
    """
    if isinstance(parameters, torch.Tensor):
        parameters = [parameters]
    clip_value = float(clip_value)

    grads = [p.grad for p in parameters if p.grad is not None]
    grouped_grads = _group_tensors_by_device_and_dtype([grads])

    for (device, _), ([device_grads], _) in grouped_grads.items():
        if (
            foreach is None
            and _has_foreach_support(cast(list[Tensor], device_grads), device=device)
        ) or (foreach and _device_has_foreach_support(device)):
            torch._foreach_clamp_min_(cast(list[Tensor], device_grads), -clip_value)
            torch._foreach_clamp_max_(cast(list[Tensor], device_grads), clip_value)
        elif foreach:
            raise RuntimeError(
                f"foreach=True was passed, but can't use the foreach API on {device.type} tensors"
            )
        else:
            for grad in device_grads:
                cast(Tensor, grad).clamp_(min=-clip_value, max=clip_value)

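
# An illustrative sketch, not part of the original module: element-wise
# clipping with ``clip_grad_value_`` bounds every gradient entry to
# [-clip_value, clip_value] instead of rescaling by the total norm. The toy
# model and the helper name ``_example_value_clip`` are assumptions made for
# this example only.
def _example_value_clip() -> None:
    model = torch.nn.Linear(4, 2)
    model(torch.randn(8, 4)).pow(2).sum().backward()
    clip_grad_value_(model.parameters(), clip_value=0.5)
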

# Report the documented public module path for these helpers.
_get_total_norm.__module__ = "torch.nn.utils"
_clip_grads_with_norm_.__module__ = "torch.nn.utils"
clip_grad_norm_.__module__ = "torch.nn.utils"