Tensor cores indeed; the patent itself says so explicitly.
"In certain example embodiments, the techniques herein may advantageously take advantage of NVIDIA's tensor cores (or other similar hardware). A tensor core may be a hardware unit that multiplies two 16×16 FP16 matrices (or other sized matrices depending on the nature of the hardware), and then adds a third FP16 matrix to the result by using fused multiply-add operations, and obtains an FP16 result. In certain example embodiments, a tensor core (or other processing hardware) can be used to multiply two 16×16 INT8 matrices (or other sized matrices depending on the nature of the hardware), and then add a third INT32 matrix to the result by using fused multiply-add operations and obtain an INT32 result which can then be converted to INT8 by dividing by the appropriate normalization amount (e.g., which may be calculated during a training process, such as described in connection with FIG. 9). Such conversions may be accomplished using, for example, a low processing cost integer right shift. Such hardware acceleration for the processing discussed herein (e.g., in the context the separable block transforms) may be advantageous."
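For anyone wondering what the INT8 path in that quote amounts to, here is a rough sketch in plain Python of the arithmetic only: multiply two INT8 matrices, add an INT32 matrix, then bring the result back to INT8 with a right shift. The function name `int8_matmul_requant`, the `shift` parameter, and the clamping to [-128, 127] are my own assumptions for illustration; the patent only says the normalization amount may come from training and may be applied as an integer right shift. This is not how the hardware does it, just the math.

```python
def int8_matmul_requant(a, b, c, shift):
    """Compute clamp((A @ B + C) >> shift) on nested lists of ints.

    a, b hold INT8-range values; c holds the INT32 matrix that is added.
    `shift` is a hypothetical normalization amount (a power-of-two divisor).
    """
    n, k, m = len(a), len(b), len(b[0])
    out = []
    for i in range(n):
        row = []
        for j in range(m):
            acc = c[i][j]  # start from the INT32 addend
            for t in range(k):
                acc += a[i][t] * b[t][j]  # multiply-accumulate step
            q = acc >> shift  # arithmetic shift: floor division by 2**shift
            # Saturating to the INT8 range is an assumption, not from the patent.
            row.append(max(-128, min(127, q)))
        out.append(row)
    return out


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
print(int8_matmul_requant(a, b, c, shift=2))
```

The right shift is the cheap part the quote is talking about: dividing by a power of two costs a single integer instruction instead of a real division.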
Chozo Master wrote: (01-10-2021, 01:18 PM) Basically tensor cores.
Matches earlier rumors.
It was probably supposed to launch alongside the OLED, but they delayed it because of the chip shortage.
Yep! Most likely there was a change of plans in the meantime and they pushed the thing back to who knows when.