Sample OLE DB Provider

Created:8/06/2001
Last Modified:02/01/2002
Description:

The SampleProvider example demonstrates how to create a read-only OLE DB provider which meets all the requirements for ArcGIS, that is it can identify, query, and retrieve spatial data. This sample provider can read data from an ArcGIS personal geodatabase or Microsoft® Access .mdb file using standard ArcObjects. The Sample Provider is compliant with the OpenGIS (OGIS) OLE-COM Simple Features Specification. This means that the provider is spatially enabled, i.e., it can discover, query and retrieve spatial features. You can find complete details of the Microsoft OLE DB provider technology and of the OGIS specification at www.microsoft.com/data and www.opengis.org, respectively. Also see the article 'Implementing OLE DB Providers in ArcGIS' in the Technical Documents section of the ArcObjects Developer Help. As a consumer of OLE DB data sources, ArcGIS has a set of requirements that you must supply to make your data source available:

1. Create a basic OLE DB provider - this is done using the Active Template Library (ATL) tools provided in Microsoft Visual Studio 6.0.

2. Support the OGIS Feature Tables and Geometry Columns schema rowsets - this is needed for browsing and schema discovery.

3. Support the ICommandWithParameters interface on the Command object - this is needed to handle spatial queries.

4. Support the IColumnsRowset interface on the rowset object - this is needed to support the additional OGIS Metadata columns: GEOM_TYPE, SPATIAL_REF_SYSTEM_ID and SPATIAL_REF_SYSTEM_WKT.

5. Return your geometry objects as OGIS Well Known Binaries (WKB) - see the OGIS OLE-COM Simple Features specification for definitions of these.

Several good reference documents are listed below. These articles are included on the MSDN CD included with Visual C++ or can be found at www.msdn.microsoft.com:

1) Using the Visual C++ 6.0 OLE DB Provider Templates - Lon Fisher, Visual C++ Development Team, 1998 - required reading

2) Exposing Your Custom Data In a Standard Way Through ADO and OLE DB, Dino Esposito, June 1999 - good article

3) OLE DB Minimum Levels of Consumer and Provider Functionality - MDAC technical article - general

4) OLE DB/ADO: Making Universal Data Access a Reality - Microsoft Corp., 1998 - general.

5) OpenGIS OLE-COM Simple Features Specification - required reference (available at www.opengis.org).



The following description assumes that you have a working knowledge of Microsoft's Component Object Model (COM) technology and that you are familiar with Microsoft's OLE DB data access technology. To create an OLE DB provider, you will need C++ programming experience and should have a working knowledge of C++ templates. It also assumes that you have familiarity with Microsoft Visual C++. A complete discussion of ATL OLE DB provider templates is also beyond the scope of this document - you will need to read the articles mentioned above and do more programming to obtain this knowledge.



Steps to create a spatially enabled OLE DB provider

This sample creates an OLE DB provider that reads from ArcGIS personal geodatabases, or Microsoft Access files (.mdb), using standard ArcObjects. The steps are outlined below:

1) Create the ATL base OLE DB provider using MS Visual C++ 6.0

a) Run the ATL Com Wizard

i) Select File/New/Projects.

ii) Select Supports MFC if you want to use handy tools like CStrings, CArrays, etc.

iii) Press Finish

b) Run the ATL Object Wizard

i) Select Insert\New ATL Object

ii) Select Data Access in the Category column

iii) Select Provider in the Object column

iv) Enter the short name you want to give your provider

v) Press OK

c) This will create a MS Visual C++ project with 4 standard OLE DB COM objects for you (these are in 3 header or .h files). These objects are C++ classes implemented using the data access templates provided in ATL (atldb.h). The objects are described in detail in the OLE DB documentation but here are some simple definitions:

i) Data Source - connection to your physical database or file or database

ii) Session - the current operating environment to your data source

iii) Command - an object used to issue commands (SQL statements) and create rowsets

iv) Rowset - an object that contains rows of data organized in columns



2) Implement the Data Source object - located in the <YourProvider>DS.h file

a) This object supports the important IDBCreateSession, IDBInitialize and IDBProperties interfaces. One of its main responsibilities is to open (initialize) and close (uninitialize) your data source.

b) Look at the code for CDataSource or the C<YourProvider>Source class (CSampleProvSource in the sample provider).

c) Implement the IDBInitialize::Initialize() function:

i) In the sample this is where we create the Access WorkspaceFactory and open the Access database (workspace).

ii) We obtain the name of the database using the DBPROPSET_DBINIT property in the DBPROP_INIT_DATASOURCE property set. The data source name is set by the OLE DB consumer, in our case ArcGIS, using the IDBProperties interface. This happens when you make an OLE DB connection to this provider using the Data Link dialog.

d) Implement the IDBInitialize::Uninitialize() function:

i) In the sample all this function does is to free the IWorkspace object which connects to the Access database.

e) Optionally implement the IDBCreateSession::CreateSession() function:

i) You may want your session object to hold onto objects that the Data Source holds onto (for efficiency reasons). In the sample we wanted the Session object to also hold onto the Workspace object.

f) Note - to use ArcObjects from C++, you have to import the esriCore.olb file, which contains the actual object definitions. Look for the #import statement the stdafx.h file provided with the sample for an example of how to do this.



3) Implement the standard schema rowsets - located in the <YourProvider>Sess.h file. Schema rowsets are rows containing data about the schema of your data source. The ATL wizard creates implementations for the 3 standard OLE DB schema rowsets:

a) Tables - contains a list of all the tables in your data source. The C++ class C<YourProvider>SessionTRSchemaRowset provides the implementation for this. Remove the code that the wizard creates and add your own. Note - ArcGIS uses only the Tables schema rowset. See the CSampleProvSessionTRSchemaRowset class in the sample for an example.

b) Columns - contains a list of all the columns, and their respective tables, in your data source. The C++ class C<YourProvider>SessionColSchemaRowset provides implementation for this. Remove the code that the wizard creates and optionally add your own.

c) Provider Types - contains a list of all the data types that your data source supports. The C++ class C<YourProvider>SessionPTSchemaRowset provides implementation for this. Remove the code that the wizard creates and optionally add your own.



4) Implement the OGIS schema rowsets - add these to the <YourProvider>Sess.h file. ArcGIS uses these to discover what tables are spatially enabled and other spatial information.

a) Feature Tables schema rowset - contains a list of the Feature Classes in your data source. This is mandatory.

i) Create a C++ class to hold the schema information. See the OGISTables_Row class in the sample for details.

ii) Create a C++ class for the schema rowset itself. See the CSampleProvSessionSchemaOGISTables class in the sample for an example. Note - the Execute() function populates the actual rowset.

b) Geometry Columns schema rowset - contains a list of the geometry columns and their associated Feature Classes in your data source. This is mandatory.

i) Create a C++ class to hold the schema information. See the OGISGeometry_Row class in the sample for details.

ii) Create a C++ class for the schema rowset itself. See the CSampleProvSchemaOGISGeoColumns class in the sample for an example. Note - the Execute() function populates the actual rowset.

c) Spatial Reference schema rowset - contains a list of the spatial references in your data source. This is optional because ArcGIS doesn't use this schema rowset.

d) Add the schema definitions to the ATL Schema Map. In the Session object code, look for the BEGIN_SCHEMA_MAP section. Add your schema rowset definitions:

BEGIN_SCHEMA_MAP(CSampleProvSession)
    SCHEMA_ENTRY(DBSCHEMA_TABLES, CSampleProvSessionTRSchemaRowset)
    SCHEMA_ENTRY(DBSCHEMA_COLUMNS, CSampleProvSessionColSchemaRowset)
    SCHEMA_ENTRY(DBSCHEMA_PROVIDER_TYPES, CSampleProvSessionPTSchemaRowset)
    SCHEMA_ENTRY(DBSCHEMA_OGIS_FEATURE_TABLES, CSampleProvSessionSchemaOGISTables)
    SCHEMA_ENTRY(DBSCHEMA_OGIS_GEOMETRY_COLUMNS,CSampleProvSchemaOGISGeoColumns)
    SCHEMA_ENTRY(DBSCHEMA_OGIS_SPATIAL_REF_SYSTEMS,CSampleProvSessionSchemaSpatRef);
END_SCHEMA_MAP()
e) The GUID definitions for these schema rowsets are contained in the OleDBGis.h file, which is included in the sample. Note - the guids.cpp file is needed because these GUID definitions have to be compiled into your program or you will get link errors. Also, note the Project Settings for guids.cpp, its Precompiled Headers is set to Not using precompiled headers (under the C/C++ tab).



5) Implement the Command object - located in the <YourProvider>RS.h and .cpp files.

a) In the OLE DB specification, the purpose of the Command object is mostly for database optimization, i.e., so a SQL statement can be created once, optionally prepared (database terminology), and re-executed with optional parameters to create new rowsets.

b) The ATL wizard provides a standard Command object for which you may or may not need to modify. In the sample, the Command object holds both an ITable and IQueryFilter objects, so they can be kept on and re-used.

c) The ATL wizard creates a default Execute() function which is located in the <YourProvider>RS.cpp file. This is where special work needs to be done to process parameters, create rowsets, do conversions, etc. This is discussed below.

d) Note the following functions in the SampleProvRS.cpp file:

i) ParseCommand() - this function does some simple parsing of the SQL query set by the consumer using the ICommandText::SetCommandText() function.

ii) SetupSpatialFilter() - this function processes spatial parameters (if any) when doing spatial queries. Because we pass geometry objects as OGIS WKB to non-ESRI OLE DB providers, our sample needs to convert the WKB back to an ESRI geometry object since we are using an ESRI personal geodatabase as our data source. Your provider will probably not need to do this because your data source will mostly likely not be using ESRI geometry objects. Note - that the OGIS spatial filter operators (touches, within, etc.) are defined in OleDBGis.h file.

iii) Execute() - this function calls the above functions, sets up the query filter and creates the actual rowset object. Note - it only opens the table object if it already isn't open - this is where the re-use occurs.

iv) SetCommandText()- frees the table object when new command text is set.

e) Implement the ICommandWithParameters interface:

i) Use the ICommandWithPamametersImpl template class that is provided in the sample. Examine how the CSampleProvCommand class inherits from this template. Please note that this implementation only handles the OGIS spatial parameters and is not a complete implementation of ICommandWithParameters, however it is enough to support our requirements.



6) Implement the Rowset object - located in the <YourProvider>RS.h file

a) The ATL wizard implements a rowset that reads files from your disk:

i) A rowset is implemented primarily through the ATL CRowsetImpl template class

ii) Remove the CTestATLWindowsFile class and the code in the C<YourProvider>Rowset::Execute() function - this is where your code will go.

b) The standard way to transfer data from your data source to the consumer, via your provider, is to use a C++ class:

i) Examine the CSampleProvFeatureRowData class. For simplicity, we made the sample contain only the OID and SHAPE columns because they are present in every GeoDatabase feature class.

ii) If your data has a fixed schema, as is often the case, this mechanism will work well for you. If your schema varies from table to table, then you will need a more elaborate data transfer mechanism than the one presented here. For some ideas on doing this, read ATLMTO: Dynamically Acquires Column Definitions, MSDN.

c) Holding data in the rowset - the CVirtualArray template class

i) The standard mechanism CRowsetImpl uses is to load the entire data in a table into memory, holding it in a CSimpleArray template class. This has obvious negative consequences so we don't want to do it this way.

ii) Because ATL uses templates exclusively, all we need to do is to replace the ArrayType parameter (see CRowsetImpl) with something other than CSimpleArray. Examine the CVirtualArray template class in the sample which only holds only the current record in memory at one time. While somewhat simple minded, this implementation works well for the sample, but has obvious limitations if you want to support a scrolling rowset. Because ArcGIS does not use this scrolling rowsets and always reads all rows in a rowset, CVirtualArray works well for our usage.

iii) Note that we convert the ESRI geometry to an OGIS WKB using the utility function ESRIutil::GetOGISWkb(). We do this because this is the contract we must honor. Your provider will create the WKB from whatever format you store your geometry in.

iv) Note for simplicity sake, we took advantage of the knowledge that the standard name of ESRI geometry columns is 'SHAPE'. This will not always work in real world situations.

d) Examine the CSampleProvRowset::Execute() function - this is where the real work occurs:

i) We call the Search method on the Table object to get back a cursor.

ii) We get the number of rows in the table - the semantics that ATL uses requires that it knows the total number of rows. Care should be taken to make this as efficient as possible.

iii) Initialize the CVirtualArray object with the row count and the cursor object.

e) Implement IColumnsRowset on the Rowset object:

i) The standard OLE DB mechanism for obtaining column definitions is to use the IColumnsInfo::GetColumnInfo() function. Because this has fixed definitions for the column descriptions, the IColumnsRowset interface was added to allow for more flexible column metadata reporting.

ii) The OGIS specification added the addition metadata columns: GEOM_TYPE, SPATIAL_REF_SYSTEM_ID and SPATIAL_REF_SYSTEM_WKT, so that consumers could identify the spatial column containing the WKB geometry, what its geometry type is, and what spatial reference system it belongs to.

iii) Use the IColumnsRowsetImpl template class provided in the sample to implement this. Examine the GetColumnsRowset() and PopulateRowset() functions - you will need to replace this code in your provider. Note how CSampleProvRowset inherits this template class.

iv) The sample doesn't pass back the Spatial Reference System information (see Notes below for an explanation). To enable this, you must modify the GetDBStatus() function to indicate that these columns are not Null.



7) Notes:

a) Programming:

i) Because of naming conflicts between the OLE DB and ESRI object models, we renamed a number of ESRI objects (see the #import statement in the stdafx.h file)

ii) We try to use as many ATL conventions and objects as possible, e.g., CComPTR, CComBSTR, Impl template classes, etc.

iii) We use the <Interface>Ptr convention when declaring pointers to ESRI interfaces because this is a standard ESRI C++ programming convention, made possible by importing esriCore.olb.

iv) Because the Data Source, Session, Command and Rowset objects all work closely together, often in parent/child relationships, and because they are only exposed through COM, we made their member variables public so that each class could reference them for simplicity.

b) Implementation:

i) ESRI recommends that you create a minimum, level-0 OLE DB provider that is fully OGIS compliant but ArcGIS does not require this. To do this involves more work and can be done incrementally.

ii) The sample does not pass back the OGIS Spatial Reference System Id or the WKT because these are currently expensive to obtain in ArcObjects. Your provider should attempt to support these. Also, the sample does not implement the Spatial Reference schema rowset.

iii) The sample does not handle tables not registered with the geodatabase, that is, those that have no OID column.



How to use:
  1. Build the SampleProvider.DLL file from the project (you may have to modify the path to the esricore.olb file in the StdAfx.h file to point to your ArcGIS install directory).
  2. Make an OLE DB connection from ArcGIS using the SampleProvider. Double click on the Add OLE DB connection object in either ArcCatalog (TOC panel) or ArcMap (Add Data dialog). This will call the Data Link Properties dialog - on the Provider tab select the 'SampleProv OLE DB Provider'. Click on 'Next >>' to bring up the 'Connection' tab and enter the path to and name of a personal geodatabase or Microsoft® Access .mdb file in the Data Source field. Click on 'Next>>' to move to the 'Advanced' tab and click on 'OK'. A new OLE DB connection should appear in the TOC panel or Add Data Dialog.
  3. Modify the sample provider code to suit your data format/application.

Application:
Geodatabase

Requires: A working knowledge of C++ templates and Active Template Library (ATL) tools

Minimum ArcGIS Version Required: 8.1.2

Difficulty: Advanced


Visual C++
File Description
SampleProvider.dsp The VC++ project file for SampleProvider.
Stdafx.h Precompiled header file which imports esriCore.olb.
OleDBgis.h Precompiled header file which contains the GUIDs and constants that are required by the OpenGIS Simple Features for OLE-COM Specification.
SampleProvDS.h Precompiled header file which implements the data source object.
SampleProvRS.h Precompiled header file which implements the command and rowset objects.
SampleProvRS.cpp Implementation file for command object.
SampleProvSess.h Precompiled header file which implements the standard schema rowsets and three OGIS schema rowsets.
ESRIUtil.h Precompiled header file which returns the spatial reference and the geometry/ObjectID column names for the data set.
SampleProvider.cpp Implementation file for DLL Exports .


Key Interfaces: esricore.ISpatialReference, esricore.IGeoDataset, esricore.ITable, esricore.IQueryFilter, esricore.IField, esricore.IFields, esricore.IGeometryDef, IDBCreateSession, IDBInitialize, IDBProperties, ICommand, ICommandText, ICommandWithParameters, IColumnsRowset