"""Defines utilities for interacting with scaled_dot_product_attention"""
import math
from typing import Optional

import torch

__all__: list[str] = []


def _input_requires_grad(*tensors: torch.Tensor) -> bool:
    """Returns True if any of the tensors requires grad"""
    return any(t.requires_grad for t in tensors)


def _postprocess_flash_output(inpt_tensor: torch.Tensor, og_size: int) -> torch.Tensor:
    """Handles the unpad of the last dimension"""
    if inpt_tensor.size(-1) != og_size:
        return inpt_tensor[..., :og_size]
    return inpt_tensor


def _calculate_scale(head_dim_size: int, scale: Optional[float]) -> float:
    """
    For FlashAttention we pad the head dimension to be a multiple of 8, so we need to scale the output
    by the original head size and not the padded one.
    """
    if scale is not None:
        return scale
    return 1.0 / math.sqrt(head_dim_size)


def _validate_sdpa_input(
    query: torch.Tensor,
    key: torch.Tensor,
    value: torch.Tensor,
    attn_mask: Optional[torch.Tensor] = None,
    dropout_p=0.0,
    is_causal=False,
) -> None:
    # attn_mask, dropout_p, and is_causal are accepted for signature parity
    # with scaled_dot_product_attention but are not validated here.
    if query.dtype != key.dtype or query.dtype != value.dtype:
        raise ValueError(
            f"Expected query, key, and value to have the same dtype, "
            f"but got query.dtype: {query.dtype}, key.dtype: {key.dtype}, "
            f"and value.dtype: {value.dtype} instead."
        )
    if query.device != key.device or query.device != value.device:
        raise ValueError(
            f"Expected query, key, and value to have the same device type, "
            f"but got query.device: {query.device}, key.device: {key.device}, "
            f"and value.device: {value.device} instead."
        )
    if query.dim() < 2 or key.dim() < 2 or value.dim() < 2:
        raise ValueError(
            f"Expected query, key, and value to all be at least 2 dimensional, "
            f"but got query.dim: {query.dim()}, key.dim: {key.dim()} "
            f"and value.dim: {value.dim()} instead."
        )
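

# --- Usage sketch (not part of the original module) ---------------------------
# A minimal, hypothetical demonstration of the private helpers above; the tensor
# shapes here are arbitrary assumptions, and these underscore-prefixed functions
# are internal, so calling them directly is for illustration only.
if __name__ == "__main__":
    q = torch.randn(2, 8, 16, 64)
    k = torch.randn(2, 8, 16, 64)
    v = torch.randn(2, 8, 16, 64)

    # Matching dtype, device, and dimensionality: validation passes silently.
    _validate_sdpa_input(q, k, v)

    # With no explicit scale, the default 1/sqrt(head_dim) is used: 0.125 here.
    print(_calculate_scale(64, None))
    # An explicit scale takes precedence over the computed default.
    print(_calculate_scale(64, 0.2))

    # A dtype mismatch between query and key raises a descriptive ValueError.
    try:
        _validate_sdpa_input(q, k.double(), v)
    except ValueError as e:
        print(e)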