Creating a COM object in ASM
Copyright © Dec 27, 2000 by Ernest Murphy ernie@surfree.com
For educational use only. All commercial use only by written license.
Revised December 27 for inclusion with MASM32
Sample code for this article is available at: .../COM/examples/MyCom
The Build DLL setting used for Quick Editor is at .../COM/BIN/BLDDLL.BAT
Abstract:
---------------------------------------------------------------------------------------------------------------------
The COM (Component Object Model) is based on a non-specific implementation standard (neither platform nor language is specified), but real world constraints of real computers add practical considerations to this standard.
Here, a simple yet fully functional COM object in-process server for the WinTel platform will be created. It will be tested in a Visual Basic environment to assure its compliance with the standard.
---------------------------------------------------------------------------------------------------------------------
The programs discussed here are designed to be assembled with the MASM32 package. The Visual Basic client was written in VB6, which still sets its form version as 5. If one edits the VB5 version to VB4 it might work in earlier versions, but this has not been tested. It should also work in any VBA application, but again this has not been tested.
I will make no attempt to explain the basics of the Component Object Model nor the COM contract here. I'm going to assume you are familiar with both, and with such details as the vtable and the vtable pointer. If you are not so familiar, I suggest you check the previous articles here; the best source of explanation is Dale Rogerson's "Inside COM" (see bibliography).
Since COM objects must run not in some textbook but on real computers, they must follow certain implementation standards specific to that computer and operating system. For a WinTel in-process server, there already exists a well-defined standard to load a blob of code and unload it when no longer needed: the humble dynamic link library.
From one view, all an in-process COM server consists of is a .DLL with a set of 5 well defined exports. These are:
DllMain: This is the first routine in any dll. It is called when the library is loaded. It should check that the client (the calling app) wants an in-process server, and fail if not. (COM supports other instancing choices, but this app does not.)
DllRegisterServer: The registry holds data on every COM object installed on the system. This routine self-registers the component in the registry. It is how regsvr32.exe can register a component: regsvr32 just calls this export and displays the return value.
DllUnregisterServer: When no longer needed, a component should be able to unregister itself. Again, regsvr32.exe will call this export to clean out the registry.
DllCanUnloadNow: Global variables in the server keep track of any objects created, and also if a lock was placed on the server (explained in IClassFactory.LockServer). The client app will periodically call this export to check if the server is no longer needed and then unload it.
DllGetClassObject: Finally the magic COM export. This export takes 3 parameters, the GUID of the component to be created, the GUID of that component's interface to be created, and a pointer to the object thus created. If either the component or the interface requested are not supported, this routine fails.
The first 4 exports are straightforward. DllGetClassObject is the new strange thing, and needs further comment.
By now, one thing you should have noticed is that COM is about nothing if it isn't about indirection. That indirection gives the power to the methods. In practice, the object returned from DllGetClassObject is not the object we seek: it is a "class factory" object. A class factory object is one that knows how to instance (create) another class. This first level of indirection allows the details of the object's creation to be specified. If DllGetClassObject simply and directly returned a pointer to the object, the object would already exist, and we could not set or control any parameters in its constructor.
DllGetClassObject returns an IClassFactory interface. IClassFactory inherits from IUnknown (of course, every interface does), and has these two member functions:
HRESULT CreateInstance(
    IUnknown * pUnkOuter, //Pointer to outer object when part of an
                          // aggregate
    REFIID riid,          //Reference to the interface identifier
    void** ppvObject);    //Address of output variable that receives
                          // the interface pointer requested in riid
HRESULT LockServer(BOOL fLock);
                          //Increments or decrements the lock count
LockServer keeps the class factory instanced (helpful if one has to make numerous object copies). CreateInstance is the workhorse here: it is used to create the object's "working" interface. Each class creates a class factory that solely knows how to create that particular class, since CreateInstance does not include a class reference ID. The class ID was specified when we created this class factory interface. Thus, a class factory only knows how to create interfaces of the class of which it is a member. Its sole purpose is to create the desired object class.
This indirection is useful if the class needs some special initialization. IClassFactory is necessary to handle aggregation (a topic on which I have nothing more to add, and which the component made here does not support). Already, there is a definition for IClassFactory2, which checks licensing for the component.
To actually get your COM object, a client reaches DllGetClassObject by invoking the CoCreateInstance API function. This API takes care of handling the class factory for you, and returns the desired interface pointer. If you just need a plain vanilla object, this is the function to use. The CoGetClassObject API will return a pointer to IClassFactory if your class needs further creation parameters. Internally, CoCreateInstance itself calls CoGetClassObject and instances the class through IClassFactory, so you always need to define a class factory object creation interface.
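The two-step creation described above can be modeled in portable C. This is only a sketch of the idea, not the ole32 API: `Widget`, `Factory`, `get_class_object` and `create_instance` are hypothetical names, and a small integer stands in for a CLSID.

```c
#include <stdlib.h>

/* Hypothetical stand-in: a small integer plays the role of a CLSID. */
enum { CLSID_WIDGET = 1 };

typedef struct Widget { long value; } Widget;

/* A class factory knows how to build exactly one class. */
typedef struct Factory {
    Widget *(*create)(void);    /* plays the role of CreateInstance */
} Factory;

static Widget *widget_create(void)
{
    Widget *w = malloc(sizeof *w);
    if (w != NULL)
        w->value = 0;           /* the "constructor" runs inside the factory */
    return w;
}

static Factory widget_factory = { widget_create };

/* Plays the role of DllGetClassObject / CoGetClassObject:
   hands back the factory for a class, or NULL if unsupported. */
static Factory *get_class_object(int clsid)
{
    return clsid == CLSID_WIDGET ? &widget_factory : NULL;
}

/* Plays the role of CoCreateInstance: find the factory, then ask it
   for an object. The caller never sees the factory. */
static Widget *create_instance(int clsid)
{
    Factory *f = get_class_object(clsid);
    return f != NULL ? f->create() : NULL;
}
```

A client that needs nothing special calls `create_instance`; one that wants to control creation would call `get_class_object` and work with the factory directly.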
Classes, Objects, you and THIS
---------------------------------------------------------------------------------------------------------------------
I'm going to digress into some internal implementation details on how C++ handles objects. One can be a proficient C++ programmer and never fully understand this, as it is a level of complexity the compiler handles fully for you. However, the designers of COM took full advantage of the concept, and we need to understand this.
The new fundamental concept the ++ in C++ adds to the language is the concept of classes and objects. An object is an instance of a class. Let's explore what this means on a machine level. When we write a conventional program in asm, we depend on the assembler to create code and data segments for us. One area in memory is the code we execute, another holds the data we need. What the ++ in C++ does for you is dynamically allocate data memory at run time, giving each instance of a class, each small segment of code, its own data segment. In a very real way, the instance of a class IS this data segment. Data specific to each instance (or copy) of the class is stored in this dynamic data area.
Perhaps you have heard that C++ passes an extra hidden parameter in function calls on an object, the THIS value. Whoever named this concept not only understood what was required, but had a good sense of humor. When one is writing low level code for an object (something the compiler usually does for you in C++), the first question one will have is "which object am I the code for?"
THIS is which object, one is always working with THIS object.
THIS is simply a pointer to the data memory area for THIS instance of the class. When an object's class function is called, THIS is silently passed on. When the private data of the object is accessed, the class code area uses THIS to reference where that data is for THIS instance.
Why is THIS important? Simply this: a pointer to a COM interface is the same thing as the pointer to THIS. My grammar checker is going nuts at me now; it may not be correct English, but this is how THIS works.
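Written out in plain C, the hidden parameter becomes an ordinary first argument. The sketch below is illustrative only; `Counter` and `counter_add` are made-up names, and the parameter is spelled `this_` to match the MASM listings in this article.

```c
/* Data for one instance; in C++ these would be the object's members. */
typedef struct Counter {
    long total;
} Counter;

/* What C++ would write as counter.add(n): the compiler silently
   passes the object's address as the first argument. Here that
   argument is spelled out. */
static long counter_add(Counter *this_, long n)
{
    this_->total += n;      /* "which object?" -> THIS object */
    return this_->total;
}
```

Two `Counter` variables passed to the same `counter_add` code keep separate totals, because each call names its own data area.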
In use, COM is simply a specification of interfaces. It says nothing about how they are actually implemented in code. In fact, it is written so that such details are left undefined, such that any language that can implement these interfaces may be used to write a COM server.
I defined my "class" data areas (the "objects") as thus:
; declare the ClassFactory object structure
ClassFactoryObject STRUCT
lpVtbl DWORD 0 ; function table pointer
nRefCount DWORD 0 ; object and reference count
ClassFactoryObject ENDS
; declare the MyCom object structure
MyComObject STRUCT
lpVtbl DWORD 0 ; function table pointer
nRefCount DWORD 0 ; reference count
nValue DWORD 0 ; interface private data
MyComObject ENDS
The first point is I have great latitude in defining this structure. The only element that the COM contract imposes on it is that it contain a DWORD pointer to the vtable of functions. I also use it to hold the private data for each interface, that being the reference counts and values. nValue is the Value the MyCom interface works with, as we will see next. The dynamic memory for these structures is allocated with the API function CoTaskMemAlloc and released with CoTaskMemFree. These are exported by the ole32.dll, which has many other useful exports, such as checking for GUID equality, and converting GUIDs to and from character strings.
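For readers more at home in C, the MyComObject structure and its allocation might be mirrored as below. This is a sketch under stated assumptions: `malloc` stands in for CoTaskMemAlloc, `mycom_alloc` is a hypothetical helper, and a pointer type replaces the 32-bit DWORD so the code also builds on 64-bit hosts.

```c
#include <stddef.h>
#include <stdlib.h>

/* C mirror of the MyComObject STRUCT. */
typedef struct MyComObject {
    const void   *lpVtbl;     /* function table pointer, first member */
    unsigned long nRefCount;  /* reference count */
    long          nValue;     /* interface private data */
} MyComObject;

/* Allocate and initialise one object. malloc stands in for
   CoTaskMemAlloc; vtbl is whatever static table the server defines. */
static MyComObject *mycom_alloc(const void *vtbl)
{
    MyComObject *obj = malloc(sizeof *obj);
    if (obj != NULL) {
        obj->lpVtbl = vtbl;
        obj->nRefCount = 0;
        obj->nValue = 0;
    }
    return obj;
}
```

The vtable pointer sits at offset 0 because the interface pointer handed to the client and the object's base address are one and the same; everything after it is the server's private business.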
MyCom, a Simple Interface
---------------------------------------------------------------------------------------------------------------------
To illustrate the workings of a COM interface, we will create a simple interface called IMyCom (all interfaces in COM should have the "I" prefix, for Interface). This interface, like all COM interfaces, derives from IUnknown. This simply means its first three functions are QueryInterface, AddRef, and Release.
To add our custom interface members, we add the next three function members (in C styled prototypes):

HRESULT GetValue(long *pVal);
HRESULT SetValue(long newVal);
HRESULT RaiseValue(long newVal);
These functions allow us to check the functionality of our interface. SetValue and GetValue allow us to set and read a data member of our interface. RaiseValue is a member function that adds a value to this data. Thus, we can assure ourselves we really are accessing a fully functional object from VB.
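These structures in memory look like this:

[Figure: the client's ppv pointer, the object data, the vtable, and the server functions it points to]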
The client just holds a pointer (ppv) to this distributed structure. (The generic name for the ppv pointer comes from its C++ definition of "pointer to pointer to (void).") The "object" data blob is dynamically allocated and initialized when we instance the class. The vtable and the server functions are static; they are defined at compilation time.
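The chain of pointers can be walked explicitly in C. The following is a simplified sketch, not the server's actual code: the three IUnknown slots are left out of the table for brevity, `long` stands in for HRESULT, and `client_read`, `get_value`, `set_value` and `mycom_vtbl` are hypothetical names.

```c
typedef struct IMyCom IMyCom;

/* Static function table. A real IMyCom table would begin with the
   three IUnknown slots; they are omitted to keep the sketch short. */
typedef struct IMyComVtbl {
    long (*GetValue)(IMyCom *this_, long *pVal);
    long (*SetValue)(IMyCom *this_, long newVal);
} IMyComVtbl;

/* The dynamically allocated "object" blob: vtable pointer first. */
struct IMyCom {
    const IMyComVtbl *lpVtbl;
    long nValue;
};

static long get_value(IMyCom *this_, long *pVal)
{
    *pVal = this_->nValue;
    return 0;                   /* S_OK */
}

static long set_value(IMyCom *this_, long newVal)
{
    this_->nValue = newVal;
    return 0;                   /* S_OK */
}

static const IMyComVtbl mycom_vtbl = { get_value, set_value };

/* What a client does on every call: follow the interface pointer to
   the object, the object to the vtable, the vtable to the function,
   and pass the interface pointer back in as THIS. */
static long client_read(IMyCom *pv)
{
    long v = 0;
    pv->lpVtbl->GetValue(pv, &v);
    return v;
}
```

Only the object blob differs between instances; every instance's `lpVtbl` points at the same static table.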
One point to notice here is the vtable holds pointers to the functions, not the functions themselves. Thus, we can "override" an inherited function simply by changing what routine the vtable points to. I have to say "override" in quotes because the class definitions are just a mental concept in this ASM implementation, I did not write classes and inherit them in other classes. But the concept is the same.
In the example to follow, there are function "overrides" performed this way. The IClassFactory and IMyCom both inherit QueryInterface from IUnknown. But as they support different interfaces, they need different routines to return different results. Thus, there are two QueryInterface routines (QueryInterfaceCF and QueryInterfaceMC) pointed to from the different vtables.
For simplicity, the AddRef and Release are also customized as to which interface they support. This is not an issue with AddRef, but the Release functions differ in that the MyCom Release has to know how to destroy the MyCom object. A further refinement for the future is to declare all objects in the same way so the same function can delete them, and we could use a single implementation of Release for everything.
A minor point on calling COM interfaces from within the server: There is entirely no reason to go through the object table to get the pointer to a member function: INSIDE the server it's just another function, you can invoke it directly, as the code knows exactly where it is at compile time. Just think of it as an advanced optimization technique. This works because the calls all contain THIS (the object pointer) as a parameter, and hence answer the question "which" object with "THIS" object.
Another minor point here, each time a client makes a call on our object, it walks through memory from ppv to pv to the vtable to get the address of the function to invoke. Should you want to, you may change this final pointer during the object's lifetime to make it exhibit different characteristics. There is nothing in the COM contract to prohibit it, though it is akin to self-modifying code. I only mention it because previous writings claimed this not to be true.
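A toy illustration of changing that final pointer during an object's lifetime, in portable C. All names here (`Obj`, `ObjVtbl`, `obj_set_busy`, the two tables) are invented for the example, and both tables must share one layout for the swap to be safe.

```c
typedef struct Obj Obj;

typedef struct ObjVtbl {
    const char *(*Describe)(Obj *this_);
} ObjVtbl;

struct Obj {
    const ObjVtbl *lpVtbl;  /* the pointer a client follows on each call */
};

static const char *describe_idle(Obj *this_) { (void)this_; return "idle"; }
static const char *describe_busy(Obj *this_) { (void)this_; return "busy"; }

static const ObjVtbl idle_vtbl = { describe_idle };
static const ObjVtbl busy_vtbl = { describe_busy };

/* Repoint the object at a different static table. Clients keep the
   same interface pointer and simply see different behaviour on
   their next call. */
static void obj_set_busy(Obj *this_, int busy)
{
    this_->lpVtbl = busy ? &busy_vtbl : &idle_vtbl;
}
```

Unlike true self-modifying code, no instruction bytes change; only one data pointer inside the object does.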
Type Libraries
---------------------------------------------------------------------------------------------------------------------
On WinTel platforms, every COM interface gets information on it stored in the system registry. These interfaces are described in something called "Interface Definition Language" (IDL), which may be compiled by the MIDL (Microsoft Interface Definition Language) compiler, a command line app. I'm not going to pretend to you I understand IDL well enough to sit down in Notepad and define interfaces. But, since the MIDL tool is only shipped with MSVC, if you have it then you have VC, so I used the Visual Studio tool to create my original interface definition file.
I started an ATL project named MyComApp, inserted a new ATL object, and chose Simple Object named MyCom (the same terms my app will use). The class wizard then created a blank IMyCom interface. The ATL Attributes need to be set to Single Thread, Custom Interface, and No Aggregation. Then I created the interface by right clicking the IMyCom interface in the Class browser, and using Add Property to insert the SetValue and GetValue properties and the RaiseValue method. Then I saved and closed the app and copied the MyComApp.idl file to my assembly program folder.
Here is the output of the VC ATL interface definition file (.idl):
// MyCom.idl : IDL source for MyCom.dll
//
// This file will be processed by the MIDL tool to
// produce the type library (MyCom.tlb) and marshalling code
import "oaidl.idl";
import "ocidl.idl";
[
object,
uuid(F8CE5E41-1135-11d4-A324-0040F6D487D9),
helpstring("IMyCom Interface"),
pointer_default(unique)
]
interface IMyCom : IUnknown
{
[propget, helpstring("property Value")]
HRESULT Value([out, retval] long *pVal);
[propput, helpstring("property Value")]
HRESULT Value([in] long newVal);
[helpstring("method Raise")]
HRESULT Raise(long Value);
};
[
uuid(F8CE5E42-1135-11d4-A324-0040F6D487D9),
version(1.0),
helpstring("MyComApp 1.0 Type Library")
]
library MyComLib
{
importlib("stdole32.tlb");
importlib("stdole2.tlb");
[
uuid(F8CE5E43-1135-11d4-A324-0040F6D487D9),
helpstring("MyCom Class")
]
coclass MyCom
{
[default] interface IMyCom;
};
};
This file can be used as a prototype for further interface definitions. Notice it contains 3 GUIDs, one each for the Interface, the coclass, and the type library. These MUST be changed and distinct for new applications.
This definition file should look pretty well self-explanatory, except for the interface itself:
[propget, helpstring("property Value")]
HRESULT Value([out, retval] long *pVal);
[propput, helpstring("property Value")]
HRESULT Value([in] long newVal);
[helpstring("method Raise")]
HRESULT Raise(long Value);
Here are the same interfaces defined in MASM:
GetValue PROTO :DWORD, :DWORD
SetValue PROTO :DWORD, :DWORD
RaiseValue PROTO :DWORD, :DWORD

BIG difference... but for a simple reason. Interfaces written for type libraries are as general as can be, and are directed at clients such as Visual Basic, and VB is designed to hold the programmer's hand as much as possible. To keep interfaces simple to VB users, the concept of a "property" is used. One may "set" or "get" a property value, so these two functions seem to be the same to a VB programmer (the object reference just moves to the other side of the equate operator). A "method" makes some change or performs some action involving the object.
To create the type lib, use MIDL on a command line like so:
MIDL MyCom.idl
This produces several output files which you can mostly ignore, and most importantly MyCom.tlb, our type library. This library should be added to the dll resource file with
1 typelib MyCom.tlb
Making it the first resource element is important, as later on we will be using the LoadTypeLib API function to extract this library, and this function expects to find the library at position 1 (unless told to do otherwise). So for simplicity, we keep it at position 1.
Registering the Component
---------------------------------------------------------------------------------------------------------------------
The DllRegisterServer and DllUnregisterServer exports take care of registering the component for us. (The component is a dll, or an ocx, which is really just a dll with a fancier extension.) These registry entries are made:
HKEY_CLASSES_ROOT\CMyCom
(Default) "CMyCom simple client"
HKEY_CLASSES_ROOT\CMyCom\CLSID
(Default) "{A21A8C43-1266-11D4-A324-0040F6D487D9}"
HKEY_CLASSES_ROOT\CLSID\{A21A8C43-1266-11D4-A324-0040F6D487D9}
(Default) "CMyCom simple client"
HKEY_CLASSES_ROOT\CLSID\{A21A8C43-1266-11D4-A324-0040F6D487D9}\CMyCom
(Default) "CMyCom"
HKEY_CLASSES_ROOT\CLSID\{A21A8C43-1266-11D4-A324-0040F6D487D9}\InprocServer32
(Default) "C:\MASM32\COM\MYCOM\MYCOM.DLL"
ThreadingModel "Single"
HKEY_CLASSES_ROOT\TypeLib\{A21A8C42-1266-11D4-A324-0040F6D487D9}
(Default) (value not set)
HKEY_CLASSES_ROOT\TypeLib\{A21A8C42-1266-11D4-A324-0040F6D487D9}\1.0
(Default) "MyCom 1.0 Type Library"
HKEY_CLASSES_ROOT\TypeLib\{A21A8C42-1266-11D4-A324-0040F6D487D9}\1.0\0
(Default) (value not set)
HKEY_CLASSES_ROOT\TypeLib\{A21A8C42-1266-11D4-A324-0040F6D487D9}\1.0\0\win32
(Default) "C:\masm32\COM\MyCom\MYCOM.DLL"
HKEY_CLASSES_ROOT\TypeLib\{A21A8C42-1266-11D4-A324-0040F6D487D9}\1.0\FLAGS
(Default) "0"
HKEY_CLASSES_ROOT\TypeLib\{A21A8C42-1266-11D4-A324-0040F6D487D9}\1.0\HELPDIR
(Default) "C:\masm32\COM\MyCom"
One key value here is variable: the path and name of the server dll itself. On my system I placed it at "C:\MASM32\COM\MYCOM\MYCOM.DLL". This was detected when I registered the component, as one other function of DllRegisterServer is to discover where the dll itself is stored by invoking GetModuleFileName.
This is a lot of information for one little server. But all we need to know to instance our server is to pass the ID of {A21A8C43-1266-11D4-A324-0040F6D487D9} and a valid interface ID to CoCreateInstance. We need not know where the component is, nor place it in a special directory. The CoCreate APIs will trace through the registry settings, starting with the CLSID, to discover all they need to know to create the component. Once they have the component, they can load the type library from it to learn more if need be.
Fortunately for us, the TypeLib registry entries (the last six keys listed above) are made for us by the RegisterTypeLib API. In DllRegisterServer we call a series of registry functions to set the first 5 keys and values, then invoke RegisterTypeLib. DllUnregisterServer just winds through this structure and deletes all the entries it made, then invokes UnRegisterTypeLib. When deleting keys, do take care NOT to delete the entire HKEY_CLASSES_ROOT\CLSID\ tree, as you will completely mess up your system and partially uninstall every other ActiveX component on your system.
A type library itself is defined as a "dense black blob" of binary data. The sole property of its internal structure revealed by Microsoft is the first 4 bytes shall be the ASCII code for "MSFT." To learn what is inside, API methods must be employed. Again, this keeps the COM contract language neutral.
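The one documented property can at least be checked directly. A small C sketch follows; `looks_like_typelib` is a hypothetical helper, and it tests only the four-byte signature described above, nothing else about the blob.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the buffer begins with the four ASCII bytes "MSFT",
   0 otherwise. Says nothing about whether the rest is well formed. */
static int looks_like_typelib(const unsigned char *data, size_t len)
{
    return len >= 4 && memcmp(data, "MSFT", 4) == 0;
}
```

Anything beyond this quick sanity check should go through the type library API functions rather than poking at the bytes.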
Implementing the Unknown
---------------------------------------------------------------------------------------------------------------------
MyCom is a very simple object; it only implements two interfaces, IUnknown and IMyCom. Since these two interfaces overlap, the returned ppv pointer need not be cast to either interface, and our very simple object structure will suffice. If you continue on to the CoLib (Component Library), you will see a much more involved object structure is required if multiple, non-overlapping interfaces are supported.
Object lifetime is handled by the IUnknown interface. These three seemingly simple methods of AddRef, Release, and QueryInterface are quite powerful, and are used such that the functionality of each is never duplicated in another section.
This non-duplication of function is perhaps the reason IUnknown was so named. When DllGetClassObject is invoked, the object CLSID and a specific interface IID are passed in to define what needs to be created. Think for a second: we are in effect asking DllGetClassObject to perform a QueryInterface on the object before we create it. That is not what happens, since we do not want to duplicate functionality (i.e., two identical QueryInterface implementations, one in QueryInterface itself, one in DllGetClassObject). If nothing else, we would not want to maintain two sections of code that should have similar output.
Instead, when DllGetClassObject is invoked, we simply create the object defined by the CLSID. In effect, we create an unknown object. What is unknown is: Can this object support the interface we require?
That question is easily answered. DllGetClassObject will invoke QueryInterface on the unknown object. If it truly supports the interface, this reference is returned. If it does not support it, the object is deleted, and the DllGetClassObject returns a fail code.
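The "create first, ask second" flow can be sketched in portable C. This models the pattern only and is not the server's DllGetClassObject: small integers stand in for IIDs, 0 and -1 stand in for S_OK and a failure HRESULT, `malloc`/`free` stand in for the COM allocator, and `Object`, `query_interface` and `get_object` are hypothetical names.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for interface IDs. */
enum { IID_UNKNOWN = 1, IID_MYCOM = 2, IID_OTHER = 3 };

typedef struct Object { int nRefCount; } Object;

/* The single place that knows which interfaces the object supports. */
static int query_interface(Object *this_, int iid, void **ppv)
{
    if (iid == IID_UNKNOWN || iid == IID_MYCOM) {
        this_->nRefCount++;
        *ppv = this_;
        return 0;               /* success */
    }
    *ppv = NULL;
    return -1;                  /* interface not supported */
}

/* Create first, ask second: build the "unknown" object, then let
   QueryInterface decide whether the requested interface exists. */
static int get_object(int iid, void **ppv)
{
    Object *obj = malloc(sizeof *obj);
    int hr;

    if (obj == NULL) {
        *ppv = NULL;
        return -1;
    }
    obj->nRefCount = 0;
    hr = query_interface(obj, iid, ppv);
    if (hr != 0)
        free(obj);              /* unsupported: delete and report failure */
    return hr;
}
```

The list of supported interfaces lives in `query_interface` alone, so the creation path never needs updating when an interface is added.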
AddRef is quite simple to implement. Since we have a simple object structure, and "this" is the base address of this structure, we can directly access all members of the object.
AddRef_MC proc this_:DWORD
mov eax, this_
inc (MyComObject ptr [eax]).nRefCount
mov eax, (MyComObject ptr [eax]).nRefCount
ret ; note we return the object count
AddRef_MC endp
AddRef is a bit unusual in that it does not return an HRESULT (failure code); instead it returns the object's reference count. The return value is undefined in the COM contract, but it is traditional to return the count.
Release not only has to decrement the reference count, but when this count reaches zero it must both delete the object and decrement the object count of the dll (such that when the object count goes to zero the dll may be unloaded). Again, this is a trivial implementation:
Release_MC proc this_:DWORD
mov eax, this_
dec (MyComObject ptr [eax]).nRefCount
mov eax, (MyComObject ptr [eax]).nRefCount
.IF (eax == 0)
; the reference count has dropped to zero
; no one holds reference to the object
; so let's delete it
invoke CoTaskMemFree, this_
dec MyCFObject.nRefCount
xor eax, eax ; clear eax (count = 0)
.ENDIF
ret ; note we return the object count
Release_MC endp
MyCom is also a trivial interface to implement. The MyCom object has an extra member where the 'value' property is held.
GetValue proc this_:DWORD, pval:DWORD
mov eax, this_
mov eax, (MyComObject ptr [eax]).nValue
mov edx, pval
mov [edx], eax
xor eax, eax ; return S_OK
ret
GetValue endp
SetValue proc this_:DWORD, val:DWORD
mov eax, this_
mov edx, val
mov (MyComObject ptr [eax]).nValue, edx
xor eax, eax ; return S_OK
ret
SetValue endp
RaiseValue PROC this_:DWORD, val:DWORD
mov eax, this_
mov edx, val
add (MyComObject ptr [eax]).nValue, edx
xor eax, eax ; return S_OK
ret
RaiseValue ENDP
MyCom.dll, the server code
---------------------------------------------------------------------------------------------------------------------
To build the COM server, use the BLDDLL.BAT file provided in "\masm32\COM\BIN" from within Quick Editor. I suggest you change your editor menu settings to include a "Build DLL" option.
This project requires 5 files to build it:
MyCom.asm The main assembly code for the project
MyCom.idl Interface definition file, must be compiled to MyCom.tlb
MyCom.tlb Type Library, needed as a resource
rsrc.rc The resource file, just used to get the type library into the resource
MyCom.DEF Standard DLL export file
Once compiled, this code will do NOTHING, that is, until you register it. The easiest way is to open a DOS box in the folder where the dll is, and run: regsvr32 MyCom.dll. Alternatively, I have provided the bat files r.bat and u.bat to register and unregister, respectively, the MyCom component.
Running MyCom.dll through regsvr32 will invoke the DllRegisterServer export and write our information into the registry so we can...
Access the Server from Visual Basic
---------------------------------------------------------------------------------------------------------------------
Make sure the blinds are drawn so the neighbors do not see you actually own Visual Basic. Open VB and start a Standard .Exe project. Look in the menu for Project | References and click it. Scroll through the list and check the box MyCom, and click OK. This adds the class ID to the VB application, and VB will look through the type library for further information on the server.
In the form designer, add textboxes Text1 and Text2 to Form1, then add a command button Command1. Change the command caption to Raise. Now go to the Form1 code area and add the following:
Option Explicit
Private MC As New MyCom
Private Sub Command1_Click()
MC.Raise (Text2)
Text1 = MC.Value
End Sub
Private Sub Form_Load()
Set MC = New MyCom
MC.Value = 100
Text1 = MC.Value
End Sub
Now you can run the application and test the server by clicking the Raise button. Do be careful, there is no error checking to see if you put a valid number in Text2. What you are seeing is Visual Basic running an assembly language server.
Note the sample program available for download is somewhat more complex, as it creates two copies of the server object to test. This demonstrates each object is capable of holding its own private data. It is quite true this server doesn't do all that much, but it is a baby step to full COM functionality from assembly.
Bibliography:
---------------------------------------------------------------------------------------------------------------------
"Inside COM, Microsoft's Component Object Model" Dale Rogerson
Copyright 1997,
Paperback - 376 pages CD-ROM edition
Microsoft Press;
ISBN: 1572313498
(THE fundamental book on understanding how COM works on a fundamental level. Uses C++ code to illustrate basic concepts as it builds a simple, fully functional COM object)
"Automation Programmer's Reference : Using ActiveX Technology to Create Programmable Applications" (no author listed)
Copyright 1997,
Paperback - 450 pages
Microsoft Press;
ISBN: 1572315849
(This book has been available online on MSDN in the past, but it is cheap enough for those of you who prefer real books you can hold in your hand. Defines the practical interfaces and functions that the automation libraries provide you, but is more of a reference book than a "user's guide")
Microsoft Developers Network <http://msdn.microsoft.com>
"Professional Visual C++ 5 ActiveX/Com Control Programming" Sing Li and Panos Economopoulos
Copyright April 1997,
Paperback - 500 pages (no CD ROM, files available online)
Wrox Press Inc;
ISBN: 1861000375
(Excellent description of ActiveX control and control site interfaces. A recent review of this book on Amazon.com stated "These guys are the type that want to rewrite the world's entire software base in assembler." Need I say more?)
"sean's inconsequential homepage" <http://www.eburg.com/~baxters/>
(Various hardcore articles on low-level COM and ATL techniques. Coded in C++)
"COM in Assembly Part II" http://asmjournal.freeservers.com Bill Tyler
(no copyright noted)
Assembly Programming Journal July-Sep 99
(Basic object creation and use through COM-like interfaces)