





						  Program Structure



		 3.1  Introduction

		 The Program Structure directives let a	 programmer
		 define	 the organization that a program's code	and
		 data will have	when loaded into memory.

		 The Program Structure directives are as follows:

		      SEGMENT   Segment Definition
		      ENDS      Segment End
		      END       Source File End
		      GROUP     Segment Groups
		      ASSUME    Segment Registers
		      ORG       Segment Origin
		      EVEN      Segment Alignment
		      PROC      Procedure Definition
		      ENDP      Procedure End

		 The following sections	describe  these	 directives
		 in detail.  They also describe	the Instruction	Set
		 directives that define	which instruction set is to
		 be used during	assembly.


		 3.2  Source Files

		 Every assembly	language program consists of one or
		 more source files.  A source file is simply a text
		 file that  contains  statements  that	define	the
		 program's   data  and	instructions.	MASM  reads
		 source	 files	and  assembles	the  statements	 to
		 create	``object modules'' that	can be prepared	for
		 execution by the system linker.

		 All source files have the same form -- zero or
		 more program ``segments'' followed by an END
		 statement. The END statement, required in every
		 source file, signals the end of the source file.
		 It also provides a way to define the program entry
		 point. All other statements in a source file are
		 optional.



	      XENIX Macro Assembler Reference Manual


	      The following example illustrates	the source  file
	      format.  It is a complete	assembly language module
	      that uses	XENIX system calls to print the	 message
	      ``Hello.''  on  the  user	 terminal.  Linking this
	      module with the standard C  runtime  library  will
	      produce a	complete executable program.

		 _DATA   segment                 ; Program Data Segment
		 HELLO   db      "Hello.", 10
		 TTY     db      "/dev/tty", 0
		 FD      dw      0
		 _DATA   ends

		 DGROUP  group   _DATA

		 EXTRN   _open:NEAR              ; External entry points
		 EXTRN   _close:NEAR
		 EXTRN   _write:NEAR
		 EXTRN   _exit:NEAR

		 _TEXT   segment                 ; Program Code Segment
			 assume  cs:_TEXT, ds:DGROUP, ss:DGROUP, es:DGROUP

		 PUBLIC  _main

		 _main:                          ; Program Entry Point

			 push    2               ; fd = open("/dev/tty", 2)
			 push    OFFSET DGROUP:TTY
			 call    _open
			 add     sp, 4
			 mov     FD, ax

			 push    7               ; write(fd, &hello, 7)
			 push    OFFSET DGROUP:HELLO
			 push    FD
			 call    _write
			 add     sp, 6

			 push    FD              ; close(fd)
			 call    _close
			 add     sp, 2

			 push    0               ; exit(0)
			 call    _exit

		 _TEXT   ends

			 end

		 The main features of this source file are:

		 1.   The SEGMENT  and	ENDS  statements,  defining
		      segments named _DATA and _TEXT.

	      2.   The GROUP statement defining	a  group  DGROUP
		   which contains the data segment _DATA.

	      3.   The variables HELLO and TTY in the _DATA
		   segment, defining the string to be displayed
		   and the name of the file that is opened to
		   display it.

	      4.   The instruction label _main in the _TEXT
		   segment and its PUBLIC declaration, which
		   provides the entry point that the runtime
		   library calls.

	      5.   The ASSUME statement in the _TEXT segment,
		   defining which segment registers will be
		   associated with the labels, variables, and
		   symbols defined within the segments.


	      3.3  Instruction Set Directives

	      Syntax

		   .8086
		   .8087
		   .186
		   .286c
		   .286p
		   .287

	      The instruction set directives enable/disable  the
	      instruction  sets	 for  the given	microprocessors.
	      When a directive is given, MASM will recognize and
	      assemble	any subsequent instructions belonging to
	      that   microprocessor.	The   instruction    set
	      directives,  if  used,  should  be  placed  at the
	      beginning	 of  the  program  source  file.    This
	      ensures  that  all  instructions	in  the	file are
	      assembled	using the same set of directives.

	      Under XENIX, MASM	assembles non-protected	286  and
	      287 instructions by default, so the .286c	and .287
	      directives are not required.

		 The   .8086   directive   enables   assembly	 of
		 instructions for the 8086 microprocessor.  It also
		 disables assembly of  186  and	 286  instructions.
		 Similarly, the	.8087 directive	enables	assembly of
		 instructions	for   the   8087   floating   point
		 coprocessor   and   disables	assembly   of	287
		 instructions.

		 The .186 directive enables assembly of
		 instructions for the 186 microprocessor. This
		 directive should be used for programs that will be
		 executed by a 186 microprocessor.

		 The .286c directive enables assembly of the
		 non-protected instructions of the 286
		 microprocessor (these are identical to the 186
		 instructions). The .286p directive enables
		 assembly of the protected instructions of the
		 286. The .286c directive should be used with
		 programs that will be executed by a 286
		 microprocessor but do not use the 286's protected
		 instructions. The .286p directive can be used with
		 programs that will be executed by a 286 and that
		 use its protected instructions.

		 The   .287   directive	  enables    assembly	 of
		 instructions	 for   the   287   floating   point
		 coprocessor.  This directive should be	 used  with
		 programs that have floating point instructions	and
		 will be executed by a 286 microprocessor.
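
		 As a minimal sketch of how such a directive is
		 placed, the following fragment puts .186 at the
		 top of a source file so that a 186 instruction can
		 be used later in the file. The segment layout is
		 an assumption made for illustration and is not
		 taken from a particular program.

```asm
        .186                    ; recognize 186 instructions from here on

_TEXT   segment word public 'CODE'
        assume  cs:_TEXT
        push    2               ; push immediate: a 186 instruction, not in the 8086 set
        pop     ax              ; AX = 2
_TEXT   ends
        end
```

		 With .8086 in effect instead, the push 2 statement
		 would be expected to be rejected, because the 8086
		 instruction set has no push-immediate form.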

		 Even though a source file may contain the .8087 or
		 .287 directive, MASM also requires the -r or -e
		 option in the MASM command line to define how to
		 assemble floating point instructions. The -r
		 option directs the assembler to generate the
		 actual instruction code for each floating point
		 instruction. The -e option, which is the default
		 option on XENIX, directs MASM to generate
		 instruction codes that are changed into software
		 interrupts at program link time. If a 287
		 coprocessor is present when the program is
		 executed, these software interrupts are changed
		 into actual 287 instructions; otherwise, the
		 software interrupts are processed by the floating
		 point emulator.


	      3.4  SEGMENT and ENDS Directives

	      Syntax

		   name	SEGMENT	align  combine	'class'
		   name	ENDS

	      The SEGMENT and ENDS directives mark the beginning
	      and  end	of a program segment.  A program segment
	      is a collection of instructions and/or data  whose
	      addresses	 are  all  relative  to	the same segment
	      register.

	      The name defines the name	of  the	 segment.   This
	      name  can	 be  unique or be the same name	given to
	      other segments  in  the  program.	  Segments  with
	      identical	names are treated as the same segment.

	      The  align,  combine,  and  class	 options  define
	      program  loading	instructions that are to be used
	      by the linker when forming the executable	program.
	      These options are	described later.

	      Segments can be nested.  When  MASM  encounters  a
	      nested  segment,	it temporarily suspends	assembly
	      of the enclosing segment,	and begins  assembly  of
	      the  nested  segment.  When the nested segment has
	      been assembled, MASM  continues  assembly	 of  the
	      enclosing	 segment.   Overlapping	segments are not
	      permitted.

		 Example

		      SAMPLE_TEXT     segment word public 'CODE'
		      _main   proc far
			      .
			      .
			      .
		      CONST   segment word public 'CONST'  ; nested segment
		      seg1    dw      ARRAY_DATA
		      CONST   ends                         ; end nesting

			      mov     es, seg1
			      push    es
			      mov     ax, es:pointer
			      push    ax
			      call    _printf
			      add     sp, 4
			      .
			      .
			      .
			      ret
		      _main   endp
		      SAMPLE_TEXT     ends

		 This	 example     contains	  two	  segments:
		 ``SAMPLE_TEXT''   and	``CONST''.   The  ``CONST''
		 segment  is  nested  within  the   ``SAMPLE_TEXT''
		 segment.

		 __________________________________________________
		 Note

		   Although a given segment name can be	 used  more
		   than	  once	in  a  source  file,  each  segment
		   definition using  that  name	 must  have  either
		   exactly  the	same attributes, or attributes that
		   do not conflict.
		 __________________________________________________
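
		 The rule in the note above can be sketched as
		 follows. The segment _DATA is opened twice with
		 identical attributes, so both parts are treated as
		 one segment. The variable names count and total
		 are hypothetical.

```asm
_DATA   segment word public 'DATA'
count   dw      0               ; first part of _DATA
_DATA   ends

_TEXT   segment word public 'CODE'
        assume  cs:_TEXT
        inc     ax              ; placeholder instruction
_TEXT   ends

_DATA   segment word public 'DATA'      ; same name, same attributes
total   dw      0               ; continues the same _DATA segment
_DATA   ends
        end
```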

	      Program Loading Options

	      The align	option	defines	 the  alignment	 of  the
	      given segment.  The alignment defines the	range of
	      memory addresses from which a starting address for
	      the segment can be selected.  It can be any one of
	      the following:

		   BYTE	   use any byte	address
		   WORD	   use any word	address	(2 bytes/word)
		   PARA	   use paragraph addresses (16 bytes/paragraph)
		   PAGE	   use page addresses (1024 bytes/page)

	      If no align is given, PARA  is  used  by	default.
	      The  actual  start  address  is  computed	when the
	      program is loaded, and the linker	guarantees  that
	      the address will be on the given boundary.
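
	      The following sketch shows each kind of align
	      option that takes a smaller boundary than PAGE,
	      plus the default. The segment and variable names
	      are hypothetical and were chosen only for
	      illustration.

```asm
BDATA   segment byte public 'DATA'      ; may start at any byte address
flag    db      1
BDATA   ends

WDATA   segment word public 'DATA'      ; starts on a word boundary
count   dw      0
WDATA   ends

PDATA   segment para public 'DATA'      ; starts on a paragraph boundary
tbl     dw      8 dup (0)
PDATA   ends

DDATA   segment                         ; no align given: PARA is used
x       db      0
DDATA   ends
        end
```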

	      The combine option defines how to	combine	segments
	      having  the  same	 name.	It can be any one of the
	      following:

		   PUBLIC
			       Concatenates all	segments  having
			       the same	name and forms a single,
			       contiguous     segment.	     All
			       instruction and data addresses in
			       the new segment are relative to a
			       single  segment register, and all
			       offsets are adjusted to represent
			       the  distance  from the beginning
			       of the new segment.

		   STACK
			       Concatenates all segments having
			       the same name and forms a single,
			       contiguous segment. All
			       addresses in the new segment are
			       relative to the SS segment
			       register. The Stack Pointer (SP)
			       register is set to an address in
			       the segment.

		      COMMON
				  Creates overlapping  segments	 by
				  placing the start of all segments
				  having the same name at the  same
				  address.    The   length  of	the
				  resulting area is the	 length	 of
				  the	 longest    segment.	All
				  addresses  in	 the  segments	are
				  relative   to	  the	same   base
				  address.

		      MEMORY
				  Places all  segments	having	the
				  same name in the highest physical
				  segment in memory.  If more  than
				  one  MEMORY segment is given,	the
				  segments are overlapped  as  with
				  COMMON segments.

		      AT address
				  Causes all label and variable
				  addresses defined in the segment
				  to be relative to the given
				  address. The address can be any
				  valid expression, but must not
				  contain a forward reference, that
				  is, a reference to a symbol
				  defined later in the source file.
				  AT segments typically contain no
				  code or initialized data.
				  Instead, they represent address
				  templates that can be placed over
				  code or data already in memory,
				  such as code and data found in
				  ROM devices. The labels and
				  variables in the AT segments can
				  then be used to access the fixed
				  instructions and data.

	      If  no  combine  is  given,  the	segment	 is  not
	      combined.	  Instead,  it receives	its own	physical
	      segment when loaded into memory.
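
	      As an illustration of the AT combine type, the
	      sketch below lays a one-byte template over a
	      fixed location. The address 0F000h, the offset
	      0FFF0h, and the names ROMDATA and rom_id are all
	      hypothetical; whether such an address is
	      meaningful depends on the target system.

```asm
ROMDATA segment at 0F000h       ; hypothetical address; no bytes are generated
        org     0FFF0h
rom_id  db      ?               ; names a byte already present in memory
ROMDATA ends

_TEXT   segment word public 'CODE'
        assume  cs:_TEXT, es:ROMDATA
        mov     ax, ROMDATA     ; segment value of the AT segment
        mov     es, ax
        mov     al, rom_id      ; reached through ES, as selected by ASSUME
_TEXT   ends
        end
```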

	      __________________________________________________
	      Note

		The linker requires at least one  stack	 segment
		in a program.
	      __________________________________________________
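
	      A stack segment of the kind the note refers to
	      might be declared as in the following sketch. The
	      segment name SSEG and the size of 512 bytes are
	      arbitrary choices made for illustration.

```asm
SSEG    segment para stack 'STACK'
        dw      256 dup (?)     ; reserves 256 words (512 bytes), uninitialized
SSEG    ends
```

	      A module linked with a library that already
	      supplies a stack segment may not need a
	      declaration of its own.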


	      The class	option defines which segments are to  be
	      loaded  in  contiguous memory. Segments having the
	      same class name are loaded into memory  one  after
	      another.	All segments of	a given	class are loaded
	      before segments of any  other  class.   The  class
	      name must	be enclosed in single quotation	marks.

	      Example

			   assume  cs:_TEXT
		   _TEXT   segment word public 'CODE'
			   .
			   .
			   .
		   _TEXT   ends

	      This example illustrates the general form of a
	      text segment for a small model program. The
	      segment name is ``_TEXT''. The segment alignment
	      and combine type are ``word'' and ``public,''
	      respectively. The class is ``CODE.''

		 3.5  END Directive

		 Syntax

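		      END  expression

		 The END directive marks the end of the module.
		 The assembler ignores any statements following
		 this directive. The optional expression defines
		 the program entry point, that is, the address at
		 which execution begins.

		 Examples

		      end
		      end     _start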



		 3.6  GROUP Directive

		 Syntax

		      name  GROUP  seg-name,,,

		 The GROUP directive associates	a group	 name  with
		 one  or  more	segments, and causes all labels	and
		 variables defined in the given	 segments  to  have
		 addresses  that  are  relative	to the beginning of
		 the group instead  of	to  the	 beginning  of	the
		 segments  in which they are defined.  The seg-name
		 must be the name of a segment	defined	 using	the
		 SEGMENT  directive, or	a SEG expression.  The name
		 must be unique.

		 The GROUP directive does not affect the  order	 in
		 which	segments  of  a	 group are loaded.  Loading
		 order depends on each segment's class,	or  on	the
		 order the object modules are given to the linker.

		 Segments in a group do not have to be contiguous.
		 This means that segments that do not belong to the
		 group can be loaded between segments that do. The
		 only restriction is that the distance (in bytes)
		 between the first byte in the first segment of the
		 group and the last byte in the last segment must
		 not exceed 65,535. If the segments of a group are
		 contiguous, the group can occupy up to 64K bytes
		 of memory.

	      Group names can be used with the ASSUME  directive
	      and as an	operand	prefix with the	segment	override
	      operator (:).
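
	      Both uses of a group name can be sketched in a
	      few lines. The variable msg is hypothetical, and
	      the fragment assumes that DS has been loaded with
	      the address of DGROUP before the code runs.

```asm
DGROUP  group   _DATA

_DATA   segment word public 'DATA'
msg     db      "Hi", 0
_DATA   ends

_TEXT   segment word public 'CODE'
        assume  cs:_TEXT, ds:DGROUP     ; group name used with ASSUME
        mov     bx, offset DGROUP:msg   ; group name used as an operand prefix
        mov     al, [bx]                ; first byte of msg, reached through DS
_TEXT   ends
        end
```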

	      __________________________________________________
	      Note

		A group	name must not be used in more  than  one
		GROUP  directive in any	source file.  If several
		segments within	the source file	 belong	 to  the
		same  group,  all segment names	must be	given in
		the same GROUP directive.
	      __________________________________________________

	      Example

		   DGROUP  group   _DATA, _BSS
			   assume  ds:DGROUP

		   _DATA   segment word public 'DATA'
			   .
			   .
			   .
		   _DATA   ends
		   _BSS    segment word public 'BSS'
			   .
			   .
			   .
		   _BSS    ends
			   end



	      3.7  ASSUME Directive

	      Syntax

		      ASSUME  seg-reg :	seg-name  ,,,
		      ASSUME NOTHING

		 The ASSUME directive  selects	the  given  segment
		 register   seg-reg   to  be  the  default  segment
		 register for all labels and variables	defined	 in
		 the   segment	 or   group   given   by  seg-name.
		 Subsequent references to  the	label  or  variable
		 will  automatically  assume  the selected register
		 when the effective address is computed.

		 The  ASSUME  directive	 can   define	up   to	  4
		 selections:  one  selection  for  each	of the four
		 segment registers. The	seg-reg	can be any  one	 of
		 the  segment  register	 names:	 CS, DS, ES, or	SS.
		 The seg-name must be one of the following:

		  -   The name of a segment previously defined with
		      the SEGMENT directive.

		  -   The name of a group previously  defined  with
		      the GROUP	directive.

		  -   The keyword NOTHING.

		 The keyword NOTHING cancels  the  current  segment
		 selection.    The   directive	``ASSUME  NOTHING''
		 cancels all register selections made by a previous
		 ASSUME	statement.

		 __________________________________________________
		 Note

		   The segment override	operator (:) can be used to
		   override  the  current segment register selected
		   by the ASSUME directive.
		 __________________________________________________
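
		 The interplay between ASSUME and the segment
		 override operator can be sketched as follows. The
		 variable name value is hypothetical, and the
		 fragment assumes that the segment registers it
		 names hold the address of DGROUP at run time.

```asm
DGROUP  group   _DATA

_DATA   segment word public 'DATA'
value   dw      0
_DATA   ends

_TEXT   segment word public 'CODE'
        assume  cs:_TEXT, ds:DGROUP, es:nothing
        mov     ax, value       ; DS is used, as selected by ASSUME
        mov     ax, es:value    ; override: ES is used for this reference only
        assume  ds:nothing      ; cancel the DS selection
        mov     ax, ds:value    ; the register must now be named explicitly
_TEXT   ends
        end
```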

	      Examples

		   assume cs:code
		   assume cs:cgroup,ds:dgroup,ss:nothing,es:nothing
		   assume nothing



	      3.8  ORG Directive

	      Syntax

		   ORG expression

	      The ORG directive	sets  the  location  counter  to
	      expression.    Subsequent	  instruction  and  data
	      addresses	begin at the new value.

	      The expression must resolve to an	absolute number,
	      i.e.,  all  symbols used in the expression must be
	      known on the first pass  of  the	assembler.   The
	      location counter symbol ($) can also be used.
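
	      The effect of ORG on the offsets that follow can
	      be traced in a short sketch. The segment contents
	      and the names first and second are hypothetical.

```asm
_DATA   segment word public 'DATA'
        org     10h             ; location counter is now 10h
first   db      1               ; offset of first is 10h
        org     $+2             ; skip two bytes: 11h becomes 13h
second  db      2               ; offset of second is 13h
_DATA   ends
        end
```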

	      Examples

		   org     120H
		   org     $+2



	      3.9  EVEN	Directive

	      Syntax

		   EVEN

	      The EVEN directive aligns the next data or
	      instruction byte on a word boundary. If the
	      current value of the location counter is odd, the
	      directive increments the location counter to an
	      even value and generates one NOP instruction
	      (90h). If the location counter is already even,
	      the directive is ignored.

		 The EVEN directive  must  not	be  used  in  byte-
		 aligned segments.

		 Example

			      org     0
		      test1   db      1
			      even
		      test2   dw      513

		 In this  example,  EVEN  increments  the  location
		 counter  and  generates  a  NOP instruction (90h).
		 This means the	offset of ``test2'' is 2, not 1.


		 3.10  PROC and	ENDP Directives

		 Syntax

		      name   PROC   type
		      statements
		      name   ENDP

		 The PROC and ENDP directives  mark  the  beginning
		 and end of a procedure.  A procedure is a block of
		 instructions  that  form  a  program	subroutine.
		 Every	procedure  has	a name with which it can be
		 called.

		 The name must be a  unique  name,  not	 previously
		 defined  in the program.  The optional	type can be
		 either	NEAR or	FAR.  NEAR is assumed if no type is
		 given.	  The  name  has  the  same attributes as a
		 label and can be used as an  operand  in  a  jump,
		 call, or loop instruction.

		 Any number of statements can appear between the
		 PROC and ENDP statements. The procedure should
		 contain at least one ret statement to return
		 control to the point of call. Nested procedures
		 are allowed.

	      Example

		   _main   proc    near
			   push    bp
			   mov     bp, sp
			   push    si
			   push    di
			   mov     ax, offset DGROUP:string
			   push    ax
			   call    _printf
			   add     sp, 2
			   pop     di
			   pop     si
			   mov     sp, bp
			   pop     bp
			   ret
		   _main   endp
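
	      The NEAR and FAR types can be contrasted in a
	      further sketch. It relies on the assembler
	      choosing the near or far form of ret from the
	      type of the enclosing procedure, which the text
	      above does not state. The procedure names helper
	      and entry are hypothetical.

```asm
_TEXT   segment word public 'CODE'
        assume  cs:_TEXT

helper  proc    near            ; called from within the same segment
        inc     ax
        ret                     ; assembled as a near return
helper  endp

entry   proc    far             ; callable from other segments
        call    helper          ; near call to a near procedure
        ret                     ; assembled as a far return
entry   endp

_TEXT   ends
        end
```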

		 Chapter 3


		 Program Structure
		 __________________________________________________



		 3.1  Introduction

		 3.2  Source Files

		 3.3  Instruction Set Directives

		 3.4  SEGMENT and ENDS Directives

		 3.5  END Directive

		 3.6  GROUP Directive

		 3.7  ASSUME Directive

		 3.8  ORG Directive

		 3.9  EVEN Directive

		 3.10 PROC and ENDP Directives





























