Name: Modern Parallel Programming with C++ and Assembly Language: X86 SIMD Development Using AVX, AVX2, and AVX-512
Brand: Apress
SKU: 317116710
Price: 45.77 GBP
Availability: InStock

Modern Parallel Programming with C++ and Assembly Language

X86 SIMD Development Using AVX, AVX2, and AVX-512

By Kusswurm, Daniel

Format

Paperback, 633 pages

Published

United States, 1 March 2022

Learn the fundamentals of x86 single instruction, multiple data (SIMD) programming using C++ intrinsic functions and x86-64 assembly language. This book emphasizes x86 SIMD programming topics and technologies that are relevant to modern software development in applications that can exploit data-level parallelism, which is important for processing big data and large batches of data in data science and related fields.

Modern Parallel Programming with C++ and Assembly Language is an instructional text that explains x86 SIMD programming using both C++ and assembly language. The book's content and organization are designed to help you quickly understand and exploit the SIMD capabilities of x86 processors. It also contains an abundance of source code that is structured to accelerate learning and comprehension of essential SIMD programming concepts and algorithms.

After reading this book, you will be able to code performance-optimized AVX, AVX2, and AVX-512 algorithms using either C++ intrinsic functions or x86-64 assembly language.

What You Will Learn

Understand the essential details about x86 SIMD architectures and instruction sets including AVX, AVX2, and AVX-512.

Master x86 SIMD data types, arithmetic instructions, and data management operations using both integer and floating-point operands.

Code performance-enhancing functions and algorithms that fully exploit the SIMD capabilities of a modern x86 processor.

Employ C++ intrinsic functions and x86-64 assembly language code to carry out arithmetic calculations using common programming constructs including arrays, matrices, and user-defined data structures.

Harness the x86 SIMD instruction sets to significantly accelerate the performance of computationally intense algorithms in applications such as machine learning, image processing, computer graphics, statistics, and matrix arithmetic.

Apply leading-edge coding strategies and techniques to optimally exploit the x86 SIMD instruction sets for maximum possible performance.

Who This Book Is For

Intermediate to advanced programmers and developers. Readers of this book should have previous programming experience with modern C++ (i.e., ANSI C++11 or later) and assembly language. Some familiarity with Microsoft's Visual Studio or the GNU toolchain will be helpful. The target audience for Modern X86 SIMD Programming is experienced software developers, programmers, and perhaps some hobbyists.

Daniel Kusswurm has over 35 years of professional experience as a software developer, computer scientist, and author. During his career, he has developed innovative software for medical devices, scientific instruments, and image processing applications. On many of these projects, he successfully employed C++ intrinsic functions, x86 assembly language, and SIMD programming techniques to significantly improve the performance of computationally intense algorithms or solve unique programming challenges. His educational background includes a BS in electrical engineering technology from Northern Illinois University along with an MS and PhD in computer science from DePaul University. Daniel Kusswurm is also the author of Modern X86 Assembly Language Programming (ISBN: 978-1484200650), Modern X86 Assembly Language Programming, Second Edition (ISBN: 978-1484240625), and Modern Arm Assembly Language Programming (ISBN: 978-1484262665), all published by Apress.

Modern X86 SIMD Programming - Outline Page 1 of 7

D. Kusswurm - F:ModX86SIMDOutlineModernX86SIMD_Outline (v1).docx

Introduction

The Introduction presents an overview of the book and includes concise descriptions of each chapter. It also summarizes the hardware and software tools required to use the book's source code.

Overview

Target Audience

Chapter Descriptions

Source Code

Additional Resources

Chapter 1 - SIMD Fundamentals

Chapter 1 discusses SIMD fundamentals including data types, basic arithmetic, and common data manipulation operations.

Understanding of this material is necessary for the reader to successfully comprehend the book's subsequent chapters.

What is SIMD?

Simple C++ example (Ch01_01)

Brief History of x86 SIMD Instruction Set Extensions

MMX

SSE - SSE4.2

AVX, AVX2, and AVX-512

SIMD Data Types

Fundamental types

128b, 256b, 512b

Integer types

Packed i8, i16, i32, i64 (signed and unsigned)

Floating-point types

Packed f16/b16, f32 and f64

Little-endian storage

SIMD Arithmetic

Integer

Addition and subtraction

Wraparound vs. saturated

Multiplication

Bitwise logical

Floating-point

Addition, subtraction, multiplication, division, sqrt

Horizontal addition and subtraction

Fused multiply-accumulate (FMA)

SIMD Operations

Integer

Min & max

Compares

Shuffles, permutations, and blends

Size promotions and reductions

Floating-point

Min & max

Compares

Shuffles, permutations, and blends

Size promotions and reductions

Masked moves

Conditional execution and merging (AVX-512)

SIMD Programming Overview

C++ compiler options

C++ SIMD intrinsic functions

Assembly language functions

Testing for AVX, AVX2, and AVX-512
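
The "Wraparound vs. saturated" item under SIMD Arithmetic can be illustrated with a small scalar model. The sketch below is not taken from the book; the type alias `Lanes` and the function names are hypothetical, and the loops only imitate what a packed 8-bit add does lane by lane (for example, the wraparound `_mm_add_epi8` and the unsigned saturating `_mm_adds_epu8` intrinsics).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model of sixteen packed unsigned 8-bit lanes (one 128-bit operand).
using Lanes = std::array<std::uint8_t, 16>;

// Wraparound addition: each lane keeps only the low 8 bits of its sum.
Lanes add_wraparound(const Lanes& a, const Lanes& b) {
    Lanes r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = static_cast<std::uint8_t>(a[i] + b[i]);
    return r;
}

// Saturated addition: each lane clamps to the largest representable value (255).
Lanes add_saturated(const Lanes& a, const Lanes& b) {
    Lanes r{};
    for (std::size_t i = 0; i < r.size(); ++i) {
        const unsigned sum = static_cast<unsigned>(a[i]) + b[i];
        r[i] = static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
    }
    return r;
}
```

With every lane of `a` set to 200 and every lane of `b` set to 100, the wraparound result is 44 in each lane (300 modulo 256), while the saturated result is 255. Saturation is usually the better fit for pixel data, where an overflow should stay bright instead of wrapping to a dark value.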

Chapter 2 - AVX C++ Programming - Part 1

Chapter 2 teaches AVX integer arithmetic and other operations using C++ intrinsic functions. It also discusses how to code a few simple image processing algorithms using C++ intrinsic functions and AVX instructions.

Basic Integer Arithmetic

Addition (Ch02_01)

Subtraction (Ch02_02)

Multiplication (Ch02_03)

Common Integer Operations

Bitwise logical operations (Ch02_04)

Arithmetic and logical shifts (Ch02_05)

Image Processing Algorithms

Pixel minimum and maximum (Ch02_06)

Pixel mean (Ch02_07)
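
As a rough illustration of the pixel minimum and maximum topic, the following sketch scans an 8-bit pixel buffer 16 pixels at a time. It is an assumption-laden stand-in, not the book's Ch02_06 code: the function name `pixel_min_max` is hypothetical, and it uses the 128-bit intrinsics `_mm_min_epu8` and `_mm_max_epu8`, guarded so that the code falls back to a plain scalar loop on targets without SSE2. When AVX code generation is enabled, compilers typically emit the VEX-encoded forms of these same operations.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // _mm_min_epu8, _mm_max_epu8
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

// Hypothetical helper: returns {min, max} of n 8-bit pixels.
// For n == 0 it returns {255, 0}, i.e. the untouched initial values.
std::pair<std::uint8_t, std::uint8_t> pixel_min_max(const std::uint8_t* px, std::size_t n) {
    std::uint8_t lo = 255;
    std::uint8_t hi = 0;
    std::size_t i = 0;
#if SKETCH_HAVE_SSE2
    if (n >= 16) {
        __m128i vmin = _mm_set1_epi8(-1);      // all lanes 0xFF
        __m128i vmax = _mm_setzero_si128();    // all lanes 0x00
        for (; i + 16 <= n; i += 16) {
            const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(px + i));
            vmin = _mm_min_epu8(vmin, v);      // per-lane unsigned minimum
            vmax = _mm_max_epu8(vmax, v);      // per-lane unsigned maximum
        }
        // Reduce the 16 per-lane results to single values.
        std::uint8_t tmin[16];
        std::uint8_t tmax[16];
        _mm_storeu_si128(reinterpret_cast<__m128i*>(tmin), vmin);
        _mm_storeu_si128(reinterpret_cast<__m128i*>(tmax), vmax);
        lo = *std::min_element(tmin, tmin + 16);
        hi = *std::max_element(tmax, tmax + 16);
    }
#endif
    // Scalar loop for the remaining pixels (or for all pixels without SSE2).
    for (; i < n; ++i) {
        lo = std::min(lo, px[i]);
        hi = std::max(hi, px[i]);
    }
    return {lo, hi};
}
```

The unaligned load is a deliberate choice here so that the sketch works with any buffer; aligned buffers would allow aligned loads instead.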

Chapter 3 - AVX C++ Programming - Part 2

Chapter 3 is similar to the previous chapter but emphasizes floating-point instead of integer values. This chapter also explains how to employ C++ intrinsic functions to perform SIMD arithmetic operations using floating-point arrays and matrices.

Basic Floating-Point Arithmetic

Addition, subtraction, etc. (Ch03_01)

Compares (Ch03_02)

Conversions (Ch03_03)

Floating-Point Arrays

Array mean and standard deviation (Ch03_04, Ch03_05)

Array square roots and compares (Ch03_06, Ch03_07)

Floating-Point Matrices

Matrix column means (Ch03_08, Ch03_09)

Chapter 4 - AVX2 C++ Programming - Part 1

Chapter 4 describes AVX2 integer programming using C++ intrinsic functions. This chapter also highlights the coding of more sophisticated image processing functions using the AVX2 instruction set.

Basic Integer Arithmetic

Addition and subtraction (Ch04_01)

Pack and unpack operations (Ch04_02)

Size promotions (Ch04_03)

Image Processing Algorithms

Pixel clipping (Ch04_04)

RGB to grayscale (Ch04_05)

Thresholding (Ch04_06)

Pixel conversions (Ch04_07)
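
The RGB to grayscale item can be made concrete with a scalar reference version, which is the kind of baseline a SIMD implementation is normally checked against. This is only a sketch: the `Rgb8` layout and the function names are hypothetical, and the weights are the ITU-R BT.709 luma coefficients, chosen here as an assumption because the outline does not state which coefficients the book uses.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical pixel layout: three 8-bit channels per pixel.
struct Rgb8 {
    std::uint8_t r;
    std::uint8_t g;
    std::uint8_t b;
};

// Converts one pixel using assumed BT.709 weights, rounding to nearest
// and clamping the result to the 8-bit range.
std::uint8_t rgb_to_gray_pixel(Rgb8 p) {
    const float y = 0.2126f * p.r + 0.7152f * p.g + 0.0722f * p.b + 0.5f;
    return static_cast<std::uint8_t>(y > 255.0f ? 255.0f : y);
}

// Converts n pixels from src into dst (dst must hold at least n bytes).
void rgb_to_gray_image(const Rgb8* src, std::uint8_t* dst, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = rgb_to_gray_pixel(src[i]);
}
```

Because every pixel is processed independently with the same three multiplies and adds, this loop is a natural candidate for data-level parallelism.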

Chapter 5 - AVX2 C++ Programming - Part 2

Chapter 5 explains how to accelerate the performance of commonly used floating-point algorithms using C++ intrinsic functions and the AVX2 instruction set. The source code examples in this chapter also demonstrate use of FMA (fused multiply-add) arithmetic.

Floating-Point Arrays

Least squares with FMA (Ch05_01)

Floating-Point Matrices

Matrix multiplication (Ch05_02, Ch05_03)

Matrix (4x4) multiplication (Ch05_04, Ch05_05)

Matrix (4x4) vector multiplication (Ch05_06)

Matrix inversion (Ch05_07, Ch05_08)
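
To give a feel for the "Least squares with FMA" topic, here is a minimal scalar sketch of a straight-line fit whose running sums are accumulated with `std::fma`, which computes `x * y + z` with a single rounding. It is not the book's Ch05_01 code, which per the chapter description relies on C++ intrinsic functions; the `LineFit` struct, the function name, and the `1e-12` singularity threshold are assumptions made for illustration.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical result type: y = slope * x + intercept; ok is false if no fit exists.
struct LineFit {
    double slope;
    double intercept;
    bool ok;
};

LineFit least_squares(const double* x, const double* y, std::size_t n) {
    double sum_x = 0.0, sum_y = 0.0, sum_xx = 0.0, sum_xy = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum_x += x[i];
        sum_y += y[i];
        sum_xx = std::fma(x[i], x[i], sum_xx);  // sum_xx += x*x, one rounding
        sum_xy = std::fma(x[i], y[i], sum_xy);  // sum_xy += x*y, one rounding
    }
    const double count = static_cast<double>(n);
    const double denom = count * sum_xx - sum_x * sum_x;
    // Assumed threshold: treat a near-zero denominator as "no unique line".
    if (n < 2 || std::fabs(denom) < 1e-12)
        return {0.0, 0.0, false};
    const double slope = (count * sum_xy - sum_x * sum_y) / denom;
    const double intercept = (sum_xx * sum_y - sum_x * sum_xy) / denom;
    return {slope, intercept, true};
}
```

For the points (0, 1), (1, 3), (2, 5), (3, 7) the sums are 6, 16, 14, and 34, the denominator is 20, and the fit recovers a slope of 2 and an intercept of 1.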

Chapter 6 - AVX2 C++ Programming - Part 3

Chapter 6 is a continuation of the previous chapter. It focuses on more advanced algorithms and SIMD programming techniques.

Signal Processing

Brief overview of convolution arithmetic

1D Convolutions

Variable and fixed width kernels (Ch06_01, Ch06_02)

2D Convolutions

Non-separable kernel (Ch06_03)

Separable kernel (Ch06_04)
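
The convolution arithmetic behind this chapter can be sketched with a scalar 1D reference routine. The function name `convolve_1d` and the choice to compute only the fully overlapped ("valid") output samples are assumptions for illustration; the book's Ch06_01 and Ch06_02 examples may treat signal edges differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference routine: discrete 1D convolution of x with kernel,
// restricted to output samples where the kernel fully overlaps the signal.
// Output length is x.size() - kernel.size() + 1; an empty vector is returned
// if the kernel is empty or longer than the signal.
std::vector<float> convolve_1d(const std::vector<float>& x,
                               const std::vector<float>& kernel) {
    const std::size_t n = x.size();
    const std::size_t k_len = kernel.size();
    if (k_len == 0 || n < k_len)
        return {};
    std::vector<float> y(n - k_len + 1, 0.0f);
    for (std::size_t j = 0; j < y.size(); ++j) {
        float acc = 0.0f;
        // The kernel is applied in reversed order, which is what
        // distinguishes convolution from correlation.
        for (std::size_t k = 0; k < k_len; ++k)
            acc += x[j + k_len - 1 - k] * kernel[k];
        y[j] = acc;
    }
    return y;
}
```

With the signal {1, 2, 3, 4, 5} and the kernel {1, 2, 3}, the three output samples are 10, 16, and 22. The inner loop is a sequence of multiply-accumulate steps, which is why fixed-width kernels map well onto SIMD and FMA instructions.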

Chapter 7 - AVX-512 C++ Programming - Part 1

Chapter 7 explains AVX-512 integer arithmetic and other operations using C++ intrinsic functions. It also discusses how to code a few basic image processing algorithms using the AVX-512 instruction set.

Integer Arithmetic

Addition and subtraction (Ch07_01)

Masked arithmetic (Ch07_02)

Image Processing

RGB to grayscale (Ch07_03)

Image thresholding (Ch07_04)

Image statistics (Ch07_05)

Chapter 8 - AVX-512 C++ Programming - Part 2

Chapter 8 describes how to code common and advanced floating-point algorithms using C++ intrinsic functions and the AVX-512 instruction set.

Floating-Point Arithmetic

Addition, subtraction, etc. (Ch08_01)

Masked operations (Ch08_02)

Floating-Point Arrays

Array mean and standard deviation (Ch08_03)

Floating-Point Matrices

Covariance matrix (Ch08_04)

Matrix multiplication (Ch08_05, Ch08_06)

Matrix (4x4) vector multiplication (Ch08_07)

Signal Processing

1D convolution using variable and fixed width kernels (Ch08_08)

2D convolutions using separable kernel (Ch08_09)

Chapter 9 - Supplemental C++ SIMD Programming

Chapter 9 examines supplemental x86 SIMD programming topics including instruction set detection, how to use SIMD math library functions, and SIMD operations using text strings.

Instruction set detection (Ch09_01)

SIMD Math Library Functions

Rectangular to polar coordinate conversions (Ch09_02)

Body surface area calculations (Ch09_03)

SIMD String Operations

String length (Ch09_04)
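
Instruction set detection can be sketched in a few lines when the compiler provides a helper for it. The example below assumes GCC or Clang on an x86 target and uses their `__builtin_cpu_supports` built-in; the `SimdSupport` struct and `detect_simd` name are hypothetical, and the book's Ch09_01 example may take a different route (for instance, querying `cpuid` directly). On other compilers or architectures this sketch simply reports that nothing is supported.

```cpp
// Hypothetical result type: one flag per instruction set extension.
struct SimdSupport {
    bool avx;
    bool avx2;
    bool avx512f;
};

SimdSupport detect_simd() {
    SimdSupport s{false, false, false};
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // initializes the CPU feature data used by the queries below
    s.avx = __builtin_cpu_supports("avx") != 0;
    s.avx2 = __builtin_cpu_supports("avx2") != 0;
    s.avx512f = __builtin_cpu_supports("avx512f") != 0;
#endif
    return s;
}
```

A typical use is to call `detect_simd()` once at startup and then dispatch to an AVX-512, AVX2, or baseline implementation of a hot function, so that one binary runs correctly on processors with different capabilities.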

Chapter 10 - X86 Processor Architecture

Chapter 10 explains x86 processor architecture including data types, register sets, memory addressing modes, and condition codes. Knowledge of this material is necessary for the reader to successfully understand the subsequent x86 assembly language programming chapters.

Data types

Fundamental data types

Numerical data types

SIMD data types

Strings

Internal architecture

General-purpose registers

RFLAGS register

MXCSR register

Scalar FP and SIMD registers

Memory addressing

Condition codes

Chapter 11 - Core Assembly Language Programming - Part 1

Chapter 11 teaches fundamental x86-64 assembly language programming and basic instruction use. Understanding of this material is required to comprehend the source code examples in subsequent chapters.

Integer Arithmetic

Addition and subtraction (Ch11_01)

Multiplication (Ch11_02)

Division (Ch11_03)

Mixed integer types and stack arguments (Ch11_04)

Integer Operations

Memory addressing modes (Ch11_05)

Simple for-loops (Ch11_06)

Compares (Ch11_07)

Text Strings

String instructions (Ch11_08)

Chapter 12 - Core Assembly Language Programming - Part 2

Chapter 12 is a continuation of the previous chapter. Topics discussed include scalar floating-point arithmetic, floating-point arrays, and function calling conventions.

Scalar Floating-Point Arithmetic

Single-precision arithmetic (Ch12_01)

Double-precision arithmetic (Ch12_02)

Compares (Ch12_03)

Conversions (Ch12_04)

Scalar Floating-Point Arrays

Mean, SD (Ch12_05)

Function Calling Convention

Stack frames (Ch12_06)

Using non-volatile general-purpose registers (Ch12_07)

Using non-volatile SIMD registers (Ch12_08)

Macros for function prologues and epilogues (Ch12_09)

Chapter 13 - AVX Assembly Language Programming - Part 1

Chapter 13 explains AVX integer arithmetic and other operations using x86-64 assembly language. It also describes how to code a few simple image processing algorithms using assembly language.

Integer Arithmetic

Addition and subtraction (Ch13_01)

Multiplication (Ch13_02)

Common Integer Operations

Bitwise logical operations (Ch13_03)

Arithmetic and logical shifts (Ch13_04)

Image Processing Algorithms

Pixel minimum and maximum (Ch13_05)

Pixel mean (Ch13_06)

Chapter 14 - AVX Assembly Language Programming - Part 2

Chapter 14 is similar to the previous chapter but uses floating-point instead of integer values. This chapter also illustrates how to employ x86-64 assembly language to perform SIMD arithmetic operations using arrays and matrices.

Basic Floating-Point Arithmetic

Addition and subtraction, etc. (Ch14_01)

Compares and size conversions (Ch14_02)

Floating-Point Arrays

Array mean and standard deviation (Ch14_03)

Array square roots and compares (Ch14_04)

Floating-Point Matrices

Matrix column means (Ch14_05)

Chapter 15 - AVX2 Assembly Language Programming - Part 1

Chapter 15 describes AVX2 integer programming using x86-64 assembly language. This chapter also highlights the coding of more sophisticated image processing functions using the AVX2 instruction set.

Integer Arithmetic

Addition and subtraction (Ch15_01)

Image Processing

Pixel clipping (Ch15_02)

RGB to grayscale (Ch15_03)

Thresholding (Ch15_04)

Pixel conversions (Ch15_05)

Chapter 16 - AVX2 Assembly Language Programming - Part 2

Chapter 16 explains how to enhance the performance of frequently used floating-point algorithms using x86-64 assembly language and the AVX2 instruction set.

Floating-Point Arrays

Least squares with FMA (Ch16_01)

Floating-Point Matrices

Matrix multiplication (Ch16_02)

Matrix (4x4) multiplication (Ch16_03)

Matrix (4x4) vector multiplication (Ch16_04)

Signal Processing

1D convolutions using fixed and variable width kernels (Ch16_05)

Chapter 17 - AVX-512 Assembly Language Programming - Part 1

Chapter 17 highlights AVX-512 integer arithmetic and other operations using x86-64 assembly language. It also discusses how to code a few simple image processing algorithms using the AVX-512 instruction set.

Integer Arithmetic

Addition and subtraction (Ch17_01)

Compares, merge masking, and zero-masking (Ch17_02)

Image Processing

Pixel clipping (Ch17_03)

Image statistics (Ch17_04)
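
The difference between merge masking and zero-masking is easy to state with a scalar model. The sketch below is an illustration only, written in C++ instead of the assembly language this chapter uses; the `Vec16i` alias and function names are hypothetical, and the argument order loosely mirrors the `_mm512_mask_add_epi32` and `_mm512_maskz_add_epi32` intrinsics. Each bit of the 16-bit mask controls one 32-bit lane.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical model of sixteen 32-bit integer lanes (one 512-bit register).
using Vec16i = std::array<std::int32_t, 16>;

// Lane-wise add with wraparound, done in unsigned arithmetic to avoid
// signed-overflow undefined behavior.
static std::int32_t add_wrap(std::int32_t a, std::int32_t b) {
    return static_cast<std::int32_t>(static_cast<std::uint32_t>(a) +
                                     static_cast<std::uint32_t>(b));
}

// Merge masking: lanes whose mask bit is 0 keep the value from src.
Vec16i masked_add_merge(const Vec16i& src, std::uint16_t mask,
                        const Vec16i& a, const Vec16i& b) {
    Vec16i r = src;
    for (std::size_t i = 0; i < r.size(); ++i)
        if ((mask >> i) & 1u)
            r[i] = add_wrap(a[i], b[i]);
    return r;
}

// Zero-masking: lanes whose mask bit is 0 are set to zero.
Vec16i masked_add_zero(std::uint16_t mask, const Vec16i& a, const Vec16i& b) {
    Vec16i r{};
    for (std::size_t i = 0; i < r.size(); ++i)
        if ((mask >> i) & 1u)
            r[i] = add_wrap(a[i], b[i]);
    return r;
}
```

With all lanes of `a` equal to 1, all lanes of `b` equal to 2, all lanes of `src` equal to 9, and a mask of `0x0005`, merge masking yields 3, 9, 3, 9, ... while zero-masking yields 3, 0, 3, 0, ... in the first four lanes.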

Chapter 18 - AVX-512 Assembly Language Programming - Part 2

Chapter 18 explains how to code common and advanced floating-point algorithms using x86-64 assembly language and the AVX-512 instruction set.

Floating-Point Arrays

Correlation coefficient (Ch18_01)

Merge and zero masking (Ch18_02)

Embedded rounding and broadcasts (Ch18_03)

Floating-Point Matrices

Matrix (4x4) vector multiplication (Ch18_04)

Signal Processing

1D convolutions using fixed and variable width kernels (Ch18_05)

Appendix A - Source Code and Development Tools

Appendix A describes how to download, install, and execute the source code. It also includes some brief usage notes regarding Visual Studio and the GNU C++ compiler.

Source Code Download Information

Software Development Tools

Microsoft Visual Studio

GNU C++ compiler

Appendix B - References and Additional Resources

Appendix B contains a list of references that were consulted during the writing of this book. It also lists supplemental resources that the reader can consult for additional x86 SIMD programming information.

X86 SIMD Programming References

Algorithm References

C++ References

Additional Resources

Our Price

£45.77

Elsewhere

£54.99

Save £9.22 (17%)

Ships from USA. Estimated delivery date: 9th Jun - 17th Jun.

Free Shipping Worldwide

Buy together with Modern X86 Assembly Language Programming at a great price!

Buy Together

£91.86

Elsewhere Price

£100.76

You Save £8.90 (9%)

