SSE

This is a quick reference for Intel's Streaming SIMD Extensions.

Vector types

The vector types here are named with the same convention as in Factor's SIMD library: the element type, followed by the number of elements that fit in one 128-bit register. A leading u marks an unsigned integer type.

  • char-16
  • uchar-16
  • short-8
  • ushort-8
  • int-4
  • uint-4
  • float-4
  • double-2

Instruction set

The number next to each instruction is the SSE version that introduced its 128-bit form:

  • 1: SSE
  • 2: SSE2
  • 3: SSE3
  • 3.3: SSSE3
  • 4.1: SSE4.1
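  • 4.2: SSE4.2

| operation                | char-16   | uchar-16  | short-8    | ushort-8   | int-4      | uint-4     | float-4    | double-2   |
|--------------------------|-----------|-----------|------------|------------|------------|------------|------------|------------|
| move                     | MOVDQA 2  | MOVDQA 2  | MOVDQA 2   | MOVDQA 2   | MOVDQA 2   | MOVDQA 2   | MOVAPS 1   | MOVAPD 2   |
| add                      | PADDB 2   | PADDB 2   | PADDW 2    | PADDW 2    | PADDD 2    | PADDD 2    | ADDPS 1    | ADDPD 2    |
| subtract                 | PSUBB 2   | PSUBB 2   | PSUBW 2    | PSUBW 2    | PSUBD 2    | PSUBD 2    | SUBPS 1    | SUBPD 2    |
| add with saturation      | PADDSB 2  | PADDUSB 2 | PADDSW 2   | PADDUSW 2  |            |            |            |            |
| subtract with saturation | PSUBSB 2  | PSUBUSB 2 | PSUBSW 2   | PSUBUSW 2  |            |            |            |            |
| add-subtract             |           |           |            |            |            |            | ADDSUBPS 3 | ADDSUBPD 3 |
| horizontal add           |           |           | PHADDW 3.3 | PHADDW 3.3 | PHADDD 3.3 | PHADDD 3.3 | HADDPS 3   | HADDPD 3   |
| multiply                 |           |           | PMULLW 2   | PMULLW 2   | PMULLD 4.1 | PMULLD 4.1 | MULPS 1    | MULPD 2    |
| divide                   |           |           |            |            |            |            | DIVPS 1    | DIVPD 2    |
| absolute value           | PABSB 3.3 |           | PABSW 3.3  |            | PABSD 3.3  |            |            |            |
| minimum                  |           | PMINUB 2  | PMINSW 2   |            |            |            | MINPS 1    | MINPD 2    |
| maximum                  |           | PMAXUB 2  | PMAXSW 2   |            |            |            | MAXPS 1    | MAXPD 2    |
| approximate reciprocal   |           |           |            |            |            |            | RCPPS 1    |            |
| square root              |           |           |            |            |            |            | SQRTPS 1   | SQRTPD 2   |

The move instructions listed require 16-byte-aligned memory operands; MOVDQU, MOVUPS and MOVUPD are the unaligned forms.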
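
To illustrate the difference between the plain and saturating rows of the table, here is a minimal sketch in C using the SSE2 intrinsics from `<emmintrin.h>`. The function names `add_wrap_u8` and `add_sat_u8` are made up for this example; `_mm_add_epi8` and `_mm_adds_epu8` are the intrinsics that normally compile to PADDB and PADDUSB.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: add two uchar-16 vectors lane by lane.
   Overflow wraps around modulo 256 (PADDB). */
static void add_wrap_u8(const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi8(va, vb));
}

/* Hypothetical helper: same addition, but each lane is clamped
   to 255 instead of wrapping (PADDUSB). */
static void add_sat_u8(const uint8_t a[16], const uint8_t b[16], uint8_t out[16])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_adds_epu8(va, vb));
}
```

With every lane set to 200 and 100, the wrapping version yields 44 per lane and the saturating version yields 255.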

Idioms

int-4

Gather four integers into a vector
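
One possible way to do this, sketched in C with SSE2 intrinsics rather than as the page's original listing: move each integer into the low lane of its own register, then interleave. The name `gather_int4` is hypothetical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: build the int-4 vector {a, b, c, d},
   with a in the lowest lane. */
static __m128i gather_int4(int32_t a, int32_t b, int32_t c, int32_t d)
{
    __m128i va = _mm_cvtsi32_si128(a);        /* MOVD: {a, 0, 0, 0} */
    __m128i vb = _mm_cvtsi32_si128(b);        /* MOVD: {b, 0, 0, 0} */
    __m128i vc = _mm_cvtsi32_si128(c);        /* MOVD: {c, 0, 0, 0} */
    __m128i vd = _mm_cvtsi32_si128(d);        /* MOVD: {d, 0, 0, 0} */
    __m128i ab = _mm_unpacklo_epi32(va, vb);  /* PUNPCKLDQ: {a, b, 0, 0} */
    __m128i cd = _mm_unpacklo_epi32(vc, vd);  /* PUNPCKLDQ: {c, d, 0, 0} */
    return _mm_unpacklo_epi64(ab, cd);        /* PUNPCKLQDQ: {a, b, c, d} */
}
```

In C, `_mm_set_epi32(d, c, b, a)` builds the same vector and leaves the instruction choice to the compiler; note that it takes its arguments from the highest lane to the lowest.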

float-4

Gather four floats into a vector
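
A minimal sketch in C, assuming only SSE intrinsics from `<xmmintrin.h>`; it is one reasonable sequence, not necessarily the one the page originally showed. The name `gather_float4` is hypothetical.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper: build the float-4 vector {a, b, c, d},
   with a in the lowest lane. */
static __m128 gather_float4(float a, float b, float c, float d)
{
    /* _mm_set_ss places a scalar in lane 0 and zeroes the other lanes. */
    __m128 ab = _mm_unpacklo_ps(_mm_set_ss(a), _mm_set_ss(b));  /* UNPCKLPS: {a, b, 0, 0} */
    __m128 cd = _mm_unpacklo_ps(_mm_set_ss(c), _mm_set_ss(d));  /* UNPCKLPS: {c, d, 0, 0} */
    return _mm_movelh_ps(ab, cd);                               /* MOVLHPS: {a, b, c, d} */
}
```

`_mm_set_ps(d, c, b, a)` is the shorthand for the same result, again with arguments ordered from the highest lane down.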

Broadcast float into four components
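
A common way to broadcast is a SHUFPS whose selector picks lane 0 four times. The sketch below shows this in C with SSE intrinsics; the name `broadcast_float4` is hypothetical.

```c
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper: return the float-4 vector {x, x, x, x}. */
static __m128 broadcast_float4(float x)
{
    __m128 v = _mm_set_ss(x);                              /* MOVSS: {x, 0, 0, 0} */
    return _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 0, 0, 0));  /* SHUFPS with selector 0 */
}
```

`_mm_set1_ps(x)` expresses the same operation directly.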

double-2

Gather two doubles into a vector
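
With only two lanes, a single UNPCKLPD is enough. A minimal sketch in C with SSE2 intrinsics, using the hypothetical name `gather_double2`:

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: build the double-2 vector {a, b},
   with a in the low lane. */
static __m128d gather_double2(double a, double b)
{
    /* _mm_set_sd places a scalar in lane 0 and zeroes lane 1. */
    return _mm_unpacklo_pd(_mm_set_sd(a), _mm_set_sd(b));  /* UNPCKLPD: {a, b} */
}
```

`_mm_set_pd(b, a)` builds the same vector, with the high lane given first.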

Broadcast double into two components
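
On SSE2, unpacking a register with itself duplicates the low lane. The sketch below shows this in C; the name `broadcast_double2` is hypothetical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: return the double-2 vector {x, x}. */
static __m128d broadcast_double2(double x)
{
    __m128d v = _mm_set_sd(x);     /* MOVSD: {x, 0} */
    return _mm_unpacklo_pd(v, v);  /* UNPCKLPD: {x, x} */
}
```

`_mm_set1_pd(x)` is the shorthand. Where SSE3 is available, MOVDDUP performs the duplication in one instruction.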

Horizontal add with SSE2
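
HADDPD requires SSE3. Assuming this idiom refers to double-2 vectors (it appears under that heading), the same result can be computed on SSE2 with two unpacks and an add. This is a sketch of that approach, not a recovered listing; the name `hadd_double2_sse2` is hypothetical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: emulate HADDPD using SSE2 only.
   Returns {a0 + a1, b0 + b1}. */
static __m128d hadd_double2_sse2(__m128d a, __m128d b)
{
    __m128d lo = _mm_unpacklo_pd(a, b);  /* UNPCKLPD: {a0, b0} */
    __m128d hi = _mm_unpackhi_pd(a, b);  /* UNPCKHPD: {a1, b1} */
    return _mm_add_pd(lo, hi);           /* ADDPD: {a0 + a1, b0 + b1} */
}
```

The cost is three instructions and one extra register compared with the single SSE3 instruction.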

This revision created on Wed, 23 Sep 2009 00:19:29 by slava

All content is © 2008-2024 by its respective authors. By adding content to this wiki, you agree to release it under the BSD license.