An assembly (or assembler) language, often abbreviated asm, is a type of low-level programming language for a computer, or other programmable device, in which there is a very strong (but often not one-to-one) correspondence between the language and the architecture’s machine code instructions.
Each assembly language is specific to a particular computer architecture. In contrast, most high-level programming languages are generally portable across multiple architectures but require interpreting or compiling.
This guidebook covers assembly language programming for the x86 architecture, specifically the 8086 microprocessor.
The 8086 is a 16-bit microprocessor. It has 20 address pins, 16 of which are multiplexed with the data pins, so it can address a maximum of 2^20 = 1,048,576 (about 1 million) locations. Memory is byte addressable: every byte has its own address.
Registers are small, fast storage locations inside the CPU that hold the data being worked on.
The 8086 microprocessor has a total of 14 registers that are accessible to the programmer.
Eight of these are known as general purpose registers, i.e. they can be used by the programmer for data manipulation.
Each of these registers is 16 bits wide, i.e. it can hold a 16-bit binary number.
The first four are referred to as data registers: AX, BX, CX and DX. The other four are referred to as index/pointer registers: SP, BP, SI and DI.
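Each of the four data registers can also be accessed as two separate 8-bit halves (for example AH and AL for AX), which is a standard 8086 feature. The following is a minimal sketch of how the halves relate to the full register, assuming emu8086-style syntax and a .COM program layout as used elsewhere in this guide:

```asm
ORG 100h

MOV AX, 1234h   ; AX = 1234h, so AH = 12h and AL = 34h
MOV AL, 0FFh    ; only the low byte changes:  AX = 12FFh
MOV AH, 0       ; only the high byte changes: AX = 00FFh
MOV BX, AX      ; BX = 00FFh (BH = 00h, BL = FFh)

RET             ; return to the operating system
```

Writing to AL or AH never disturbs the other half, which is why byte-sized values are usually kept in the 8-bit halves.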
General purpose registers are available to store any transient data required by the program.
For example, when a program is interrupted, its state (i.e. the values of registers such as the program counter, instruction register or memory address register) may be saved into the general purpose registers, ready for recall when the program is ready to start again.
In general, the more registers a CPU has available, the faster it can work, because fewer values have to be fetched from slower main memory.
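Each general purpose register has its own name:
AX - the accumulator register (divided into AH / AL)
BX - the base address register (divided into BH / BL)
CX - the count register (divided into CH / CL)
DX - the data register (divided into DH / DL)
SI - source index register
DI - destination index register
BP - base pointer
SP - stack pointer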
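The 8086 also has four segment registers. Each segment register has its own special use:
CS - points at the segment containing the current program code
DS - generally points at the segment where variables are defined
ES - extra segment register, whose use is up to the programmer
SS - points at the segment containing the stack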
Nine individual bits of the 16-bit status (flags) register are used: 3 as control flags (TF, IF, DF) and 6 as status flags (CF, PF, AF, ZF, SF, OF). The remaining 7 bits are not used. A flag can only take the values 0 and 1. We say that a flag is set if it has the value 1, and cleared if it has the value 0.
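To see how status flags react to arithmetic, here is a minimal sketch assuming emu8086-style syntax. It uses the ADD instruction and the flag instructions STC, CLC, STD and CLD, none of which have been introduced in this guide yet; they are used here only to make the flag changes visible in a debugger or emulator:

```asm
ORG 100h

MOV AL, 0FFh    ; AL = 255, the largest unsigned byte
ADD AL, 1       ; result wraps to 00h: CF = 1 (carry out), ZF = 1 (result is zero)

MOV AL, 7Fh     ; AL = +127, the largest signed byte
ADD AL, 1       ; result is 80h: OF = 1 (signed overflow), SF = 1 (top bit set), CF = 0

STC             ; set the carry flag directly:   CF = 1
CLC             ; clear the carry flag directly: CF = 0

STD             ; set the direction control flag:   DF = 1
CLD             ; clear the direction control flag: DF = 0

RET             ; return to the operating system
```

Status flags are normally changed as a side effect of instructions such as ADD, while control flags such as DF are changed explicitly by the programmer.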
The MOV instruction copies the second operand (source) to the first operand (destination).
MOV AX,5
Here, 5 is copied to AX.
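An operand can be a register (AX, BL, SI, etc.), a memory location ([BX], [SI], [BX+SI+5], etc.) or an immediate value (5, 10, 1001b, 3Fh, etc.). Note that AX cannot be used inside square brackets. Segment registers can be copied to or from memory and general purpose registers, but cannot be loaded with an immediate value: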
MOV [BX], DS
MOV AX, DS
MOV DS, AX
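The operand combinations above can be tried in one short program. This is a minimal sketch assuming emu8086-style syntax; the variable `buf` is a hypothetical name introduced only for this example:

```asm
ORG 100h

MOV AX, 5       ; immediate -> register
MOV BX, AX      ; register  -> register
MOV buf, AX     ; register  -> memory (a word variable)
MOV DX, buf     ; memory    -> register, DX = 5
MOV CX, DS      ; segment register -> register
MOV ES, CX      ; register -> segment register (ES now equals DS)
; MOV ES, 5     ; not allowed: there is no immediate -> segment register form

RET             ; return to the operating system

buf DW 0        ; hypothetical word variable used as the memory operand
```

Both operands of a MOV must have the same size, so a 16-bit register is paired here with a word variable.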
A variable is a memory location. For a programmer it is much easier to keep a value in a variable named “Some_Variable” than at an address such as 2A12:2122B, especially when the program has many variables.
The compiler supports two types of variables: BYTE and WORD.
name DB value
name DW value
DB - Define Byte
DW - Define Word
name - Can be any combination of letters and digits, but it must start with a letter.
value - Can be any numeric value in any supported numbering system (hexadecimal, binary or decimal).
See a sample program
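A minimal sketch of such a sample program, assuming emu8086-style syntax and a .COM layout; the variable names `var1` and `var2` and their values are illustrative, not taken from the original sample:

```asm
ORG 100h

MOV AL, var1    ; copy the byte variable into AL: AL = 7
MOV BX, var2    ; copy the word variable into BX: BX = 1234h

RET             ; return to the operating system

var1 DB 7       ; one byte, initial value 7
var2 DW 1234h   ; one word (2 bytes), initial value 1234h
```

Placing the variables after RET keeps the CPU from executing the data bytes as if they were instructions.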
ORG 100h is a compiler directive (it tells the compiler how to handle the source code).
This directive is very important when you work with variables. It tells the compiler that the program will be loaded at offset 100h (256 bytes), so that the compiler can calculate the correct address for every variable when it replaces the variable names with their offsets.
Directives are never converted into any real machine code.
Arrays can be seen as chains of variables. A text string is an example of a byte array, in which each character is represented by its ASCII code value.
Examples of array definitions:
arr1 DB 48h, 65h, 6Ch, 6Ch, 6Fh, 00h
arr2 DB 'Hello',0
arr2 is an exact copy of arr1: when the compiler sees a string inside quotes, it automatically converts it into a set of bytes.
Any element of an array can be accessed using square brackets.
For example:
MOV AL, arr1[3]
; Alternatively
MOV SI, 3
MOV AL, arr1[SI]
If you need to declare a large array, you can use the DUP operator.
The syntax for DUP: number DUP (value(s))
number - the number of duplicates to make (any constant value)
value - the expression that DUP will duplicate
arr3 DB 5 DUP(9)
; it is equivalent to
arr3 DB 9, 9, 9, 9, 9
arr4 DB 5 DUP(1,2)
; it is equivalent to
arr4 DB 1, 2, 1, 2, 1, 2, 1, 2, 1, 2
You can use DW instead of DB if you need to keep values larger than 255 or smaller than -128. DW cannot be used to declare strings.
Sample Program For Arrays
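As an illustration, the following sketch adds up the elements of a small byte array. It assumes emu8086-style syntax and uses the ADD, INC and LOOP instructions, which this guide has not covered yet; the names `arr`, `total` and `sum_loop` are hypothetical:

```asm
ORG 100h

MOV CX, 5           ; loop counter = number of elements
MOV SI, 0           ; SI = index of the current element
MOV AL, 0           ; AL = running sum

sum_loop:
ADD AL, arr[SI]     ; add the current element to the sum
INC SI              ; advance to the next byte
LOOP sum_loop       ; decrement CX and repeat while CX is not zero

MOV total, AL       ; total = 1 + 2 + 3 + 4 + 5 = 15 (0Fh)

RET                 ; return to the operating system

arr   DB 1, 2, 3, 4, 5
total DB 0
```

Because the elements are bytes, the index grows by 1 per element; for a DW array it would have to grow by 2.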
In order to tell the compiler about the data type, these prefixes should be used:
BYTE PTR - for byte
WORD PTR - for word (2 bytes)
Program #1: Using LEA to get the address (LEA: Load Effective Address)
ORG 100h
MOV AL, VAR1 ; check value of VAR1 by moving it to AL.
LEA BX, VAR1 ; get address of VAR1 in BX.
MOV BYTE PTR [BX], 44h ; modify the contents of VAR1.
MOV AL, VAR1 ; check the value of VAR1 by moving it to AL.
RET
VAR1 DB 22h
END
Program #2: Using the OFFSET to get the address
ORG 100h
MOV AL, VAR1 ; check value of VAR1 by moving it to AL.
MOV BX, OFFSET VAR1 ; get address of VAR1 in BX.
MOV BYTE PTR [BX], 44h ; modify the contents of VAR1.
MOV AL, VAR1 ; check the value of VAR1 by moving it to AL.
RET
VAR1 DB 22h
END
Both programs have the same functionality.
NOTE: Only BX, SI, DI, BP can be used inside square brackets (as memory pointers)
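The pointer rule and the BYTE PTR / WORD PTR prefixes can be combined in one short example. This is a minimal sketch assuming emu8086-style syntax; `buf` is a hypothetical 4-byte buffer:

```asm
ORG 100h

MOV BX, OFFSET buf             ; BX points at the first byte of buf
MOV SI, 2                      ; SI is used as an index

MOV BYTE PTR [BX], 0AAh        ; writes one byte:  buf[0] = AAh
MOV WORD PTR [BX+SI], 1234h    ; writes two bytes: buf[2] = 34h, buf[3] = 12h

MOV AL, [BX]                   ; AL = AAh   (size is implied by AL)
MOV DX, [BX+SI]                ; DX = 1234h (size is implied by DX)
; MOV AL, [CX]                 ; would not assemble: CX cannot be used as a pointer

RET                            ; return to the operating system

buf DB 4 DUP(0)                ; hypothetical 4-byte buffer, all zeros
```

The prefix is only needed when neither operand is a register, because then nothing else tells the compiler how many bytes to write. The word is stored low byte first.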
In assembly language there are no strict data types, so any variable can be treated as an array.
Constants are just like variables, but they exist only until your program is compiled (assembled). After a constant is defined, its value cannot be changed. To define constants, the EQU directive is used:
Syntax: name EQU <any expression>
k EQU 5
MOV AX, k
; It is equivalent to
MOV AX, 5
NOTE: A variable can be viewed in any numbering system (hexadecimal, binary or decimal).
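The same holds when writing values in source code: the numbering system changes only how the value is written, not what is stored. A minimal sketch, assuming emu8086-style syntax and hypothetical variable names:

```asm
ORG 100h

MOV AL, v_dec        ; AL = 41h
MOV AH, v_hex        ; AH = 41h
MOV BL, v_bin        ; BL = 41h
MOV BH, v_chr        ; BH = 41h

RET                  ; return to the operating system

v_dec DB 65          ; decimal
v_hex DB 41h         ; hexadecimal (h suffix)
v_bin DB 01000001b   ; binary (b suffix)
v_chr DB 'A'         ; ASCII character, code 41h
```

All four variables hold the identical byte, so all four registers end up with the same value.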
Before going further into assembly programming, there are some instructions that you must be aware of.
Every instruction of a program has to operate on some data. The method of specifying the data to be operated on by the instruction is called addressing.
The 8086 has 12 addressing modes, which can be classified into the following five groups.
Group | Group 1 | Group 2 | Group 3 | Group 4 | Group 5 |
---|---|---|---|---|---|
Addressing modes | Registers and immediate data | Memory Data | I/O Ports | Relative Addressing | Implied Addressing |
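One instruction from each group is shown in the sketch below, assuming emu8086-style syntax. The IN, JMP, NOP and CLC instructions have not been introduced in this guide yet, and the port number 60h, the label `done` and the variable `var1` are arbitrary choices made for illustration:

```asm
ORG 100h

MOV AX, 5            ; Group 1: immediate data (the value is part of the instruction)
MOV BX, AX           ; Group 1: register (both operands are registers)

MOV CX, var1         ; Group 2: memory, direct (the address is part of the instruction)
MOV SI, OFFSET var1
MOV DX, [SI]         ; Group 2: memory, register indirect (the address is in SI)

IN  AL, 60h          ; Group 3: I/O port, fixed port number

JMP done             ; Group 4: relative (target is encoded as a distance from IP)
NOP                  ; skipped by the jump

done:
CLC                  ; Group 5: implied (the operand, CF, is fixed by the instruction)

RET                  ; return to the operating system

var1 DW 1234h
```

Only Group 2 touches main memory for its data, which is why it contains most of the 12 modes: they differ in how the memory address is built.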