Tutorial: How to automate importing data from Excel into a database using a macro (VBA)

Steve Sohcot
6 min read · Aug 7, 2021


Often I need to import data from MS Excel into a database (specifically, I’m using MS SQL Server).

I have two go-to methods for this:

  1. MS Access via ODBC if it’s not too much data (a simple copy/paste)
  2. Use the Import and Export Data Wizard if there’s a lot of data

But there are some cases where I could use a more automated approach, especially if I’m passing off this process to a non-technical coworker. As such, I’ll occasionally create an Excel macro that imports the data from a spreadsheet into a database.

Tip: This assumes your data is not in a pivot/cross tab format. If it is, check out my tutorial on how to convert the format such that the data is database import-friendly.

The Setup

Start with creating an Excel Macro. From the “Developer” tab in the ribbon (it may not be there by default; you can add it in from the Options menu), click the button for “View Code”.

If you’re on a Windows machine, the keyboard shortcut is Alt + F11

From the top menu: Insert >> Module

In the new pane (where you can type code), create a public function:

Public Function ImportData()

End Function

Tip: if you type “public function ImportData” and hit ‘Enter’ then the rest will prepopulate; i.e. it’ll add the parentheses and the line ‘End Function’

Be sure to save your file with an .xlsm extension, as this will be a macro-enabled workbook (not just a “regular” workbook).

Concept Overview

We’ll be running an INSERT statement, where the SQL will follow this format:

INSERT INTO myTableName 
(field1, field2)
VALUES
(row1Value1, row1Value2),
(row2Value1, row2Value2)

Pretty self-explanatory. We’ll break this down into two pieces:

  • SQL Header (INSERT INTO myTableName)
  • A dynamically generated “values” clause, built by concatenating the individual rows of data within the spreadsheet

Let’s write some code!

First we need to create a connection string. I’ll leave the exact value up to you. You’ll of course have to be sure that the user has the appropriate database security permissions. The example below is using Active Directory — so it’s detecting my Windows login and not using a local SQL user account.

I’m connecting via ODBC. If you don’t have this on your machine, download the driver.

Given my use-case, my connection string looks like this:

Const strDbConn As String = "server=myServerName;Database=myDatabaseName;Trusted_Connection=Yes;Driver={ODBC Driver 17 for SQL Server}"

I’ve created this as a global variable, placed above the declaration of the function we just made.

Because it’s a global variable, we can use it within other functions

I have a simple function to execute SQL:
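
A minimal version might look like this sketch (runSQL is a placeholder name; CreateObject gives a late-bound connection, so this particular sketch doesn’t need an extra reference):

Private Function runSQL(ByVal strSQL As String)
    ' Open a late-bound ADODB connection, run the statement, and clean up
    Dim conn As Object
    Set conn = CreateObject("ADODB.Connection")
    conn.Open strDbConn
    conn.Execute strSQL
    conn.Close
    Set conn = Nothing
End Function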

ADODB Error?

The code in this post works for me without the need to add an external reference. But if you’re getting an error…

In the past (perhaps with previous versions of Excel), I needed to add a reference in order to use the ADODB code.

You can do this by going to Tools >> References

According to this article, you need to check off the option for Microsoft ActiveX Data Objects [x.x] Library.

Inserting data with SQL: the header

We need to create the beginning part of the SQL statement, which will be static (i.e. not changing).

Create a function that returns a hard-coded result for the first part of the INSERT statement for the given table and field names:
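
As a rough sketch (the function name getSQLHeader and the table/field names are placeholders):

Private Function getSQLHeader() As String
    ' Static beginning of the INSERT statement; swap in your table and field names
    getSQLHeader = "INSERT INTO myTableName (field1, field2) VALUES "
End Function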

Of course you’ll have to update it to reflect your table name and fields

Inserting data with SQL: the actual data

We need a way to dynamically generate the values that will be inserted into the database. Before we create the SQL, we need a way to loop through the rows in Excel, from the top to the bottom. The first step in that is to determine how many rows we have.

I’m using the function below to get the number of rows in the worksheet.
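
It can be as simple as this sketch (getRowCount is a placeholder name, and it assumes column A is populated for every row of data):

Private Function getRowCount(ByVal strSheetName As String) As Long
    ' Find the last populated row in column A of the given worksheet (tab)
    With ThisWorkbook.Worksheets(strSheetName)
        getRowCount = .Cells(.Rows.Count, "A").End(xlUp).Row
    End With
End Function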

I’m declaring it as type Long because an Integer wasn’t big enough for my dataset. An Integer maxes out at 32,767, so it can’t handle a spreadsheet with more rows than that.

As we loop through every row, we’re going to create a string that takes the values of the individual columns and adds it to the SQL statement. The code for that is here:
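
The heart of it is a concatenation along these lines, run once per row (columns A and B are placeholders, and escapeSingleQuotes and nz are the two helper functions described next):

' Inside the loop, lngRow is the current worksheet row
strValues = strValues & "('" & escapeSingleQuotes(ws.Range("A" & lngRow).Value) & "', " & _
            nz(ws.Range("B" & lngRow).Value) & "),"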

You’ll of course have to align your column letters with the right field names and update the data types. For example, numeric values should not be surrounded in quotes.

I have two “helper” functions; one for text and one for numbers:

If the value is text-based, we need to replace (“escape”) a single quote, so I have this function:
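
A sketch of it (I’m calling it escapeSingleQuotes):

Private Function escapeSingleQuotes(ByVal strValue As String) As String
    ' Double up single quotes so the value is safe inside a SQL string literal
    escapeSingleQuotes = Replace(strValue, "'", "''")
End Function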

I also created an nz() function (as in “null-to-zero,” which is built-in for some programming languages) that substitutes 0 for empty or missing numeric values:
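
A sketch of that one:

Private Function nz(ByVal varValue As Variant) As Double
    ' Return the numeric value, or 0 when the cell is empty or not numeric
    If IsNumeric(varValue) Then
        nz = CDbl(varValue)
    Else
        nz = 0
    End If
End Function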

Here’s the code to loop through each row in the spreadsheet, and generate the SQL statement:
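
Pulling the pieces above together, a first version might look like this (the sheet name “Sheet1”, the assumption that row 1 is a header, and the two columns are all placeholders):

Public Function ImportData()
    Dim ws As Worksheet
    Dim lngRowCount As Long
    Dim lngRow As Long
    Dim strSQL As String

    Set ws = ThisWorkbook.Worksheets("Sheet1")    ' worksheet (tab) that holds the data
    lngRowCount = getRowCount("Sheet1")           ' how many rows we have

    strSQL = getSQLHeader()                       ' static beginning of the INSERT

    For lngRow = 2 To lngRowCount                 ' row 1 is assumed to be a header row
        strSQL = strSQL & "('" & escapeSingleQuotes(ws.Range("A" & lngRow).Value) & "', " & _
                 nz(ws.Range("B" & lngRow).Value) & "),"
    Next lngRow

    strSQL = removeLastComma(strSQL)              ' drop the trailing comma
    runSQL strSQL                                 ' run the INSERT
End Function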

The step-by-step breakdown is:

  • Call a function, passing in the name of the worksheet (tab) with data, to find how many rows are in the data set
  • Create a variable that holds the beginning of the SQL statement
  • Loop through each row in the dataset, creating the INSERT … VALUES SQL statement
  • After each row of data, add in a comma
  • Remove the last comma in the statement
  • Run the SQL

I took the approach of adding in a comma after every row, and then just removing the last comma in the string. I’m using this function:
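
A sketch of it:

Private Function removeLastComma(ByVal strValue As String) As String
    ' Trim the trailing comma left over from the last row
    removeLastComma = Left(strValue, Len(strValue) - 1)
End Function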

How to overcome a data import limitation

I ran into a problem: sometimes the concatenated “values” string got too long, and it crashed Excel. I figured I should limit the script to importing 500 rows at a time.

Here’s the updated code:
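
A sketch of the batched version, replacing the earlier loop and using the same placeholder sheet name and columns as before:

Public Function ImportData()
    Dim ws As Worksheet
    Dim lngRowCount As Long
    Dim lngRow As Long
    Dim lngBatch As Long
    Dim strValues As String

    Set ws = ThisWorkbook.Worksheets("Sheet1")
    lngRowCount = getRowCount("Sheet1")

    strValues = ""
    lngBatch = 0

    For lngRow = 2 To lngRowCount
        strValues = strValues & "('" & escapeSingleQuotes(ws.Range("A" & lngRow).Value) & "', " & _
                    nz(ws.Range("B" & lngRow).Value) & "),"
        lngBatch = lngBatch + 1

        If lngBatch = 500 Then
            ' Insert the rows accumulated so far, then start a new batch
            runSQL getSQLHeader() & removeLastComma(strValues)
            strValues = ""
            lngBatch = 0
            ' Application.StatusBar = "Imported through row " & lngRow   ' optional progress message
        End If
    Next lngRow

    ' Insert the final (partial) batch, but only if anything is left over
    If Len(strValues) > 0 Then
        runSQL getSQLHeader() & removeLastComma(strValues)
    End If
End Function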

You can see that, as the variable that stores the values is built up (line 24), once it reaches 500 rows it inserts the data accumulated so far into the database. Then, after the loop, it inserts the ‘final’ batch (line 39).

I left some code in there, commented out, that notifies the user of how far along the import process is.

Edit: in the final code, there’s another check so that the last batch only runs if needed. If you try to import a number of rows that’s exactly divisible by 500, you’d otherwise get an error. It’s on line 40 in the “entire code”.

Putting it together

To make this even more useful, we may need to first remove any existing data.

While you may be able to derive a “last updated” date from the dataset itself, this example also runs an additional UPDATE statement to record the date of the import.

I’ve created a subroutine called autoImport which does this:

  • Deletes the existing data
  • Runs the function that imports the data
  • Runs an update query to indicate when the data was updated
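
A sketch of that subroutine (the DELETE target and the UPDATE statement that stamps the import date are placeholders; adjust them to your own tables):

Public Sub autoImport()
    ' 1) Delete the existing data
    runSQL "DELETE FROM myTableName"

    ' 2) Import the new data from the spreadsheet
    ImportData

    ' 3) Record when the data was last updated (placeholder table/field)
    runSQL "UPDATE myImportLog SET lastUpdated = GETDATE()"
End Sub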

The Entire Code
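
The full listing is just the pieces above assembled in a single module; roughly, the module looks like this (with each sketch from above filled in where its name appears):

Option Explicit

' Global connection string, placed above all of the functions
Const strDbConn As String = "server=myServerName;Database=myDatabaseName;Trusted_Connection=Yes;Driver={ODBC Driver 17 for SQL Server}"

' runSQL              - executes a SQL statement over an ADODB connection
' getSQLHeader        - returns the static INSERT INTO ... VALUES header
' getRowCount         - finds the last populated row on the worksheet
' escapeSingleQuotes  - escapes single quotes in text values
' nz                  - substitutes 0 for empty or non-numeric values
' removeLastComma     - trims the trailing comma from the VALUES string
' ImportData          - loops through the rows, inserting 500 at a time
' autoImport          - deletes old data, imports, then stamps the refresh date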

Bonus: remove blank values with VBA Code

I wrote a separate article on how you can potentially speed up the import process by removing rows that do not add value (literally and figuratively). This can significantly reduce the number of rows, thereby reducing the processing time!

In my specific use-case, I have a table where my aggregate queries roll up a single field. If that field has a value of zero for a given row, then I don’t need to import that row.
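
One way to do that inside the import loop above (with column B standing in for the aggregated field):

' Skip rows where the aggregated field (column B here) is zero
If nz(ws.Range("B" & lngRow).Value) <> 0 Then
    strValues = strValues & "('" & escapeSingleQuotes(ws.Range("A" & lngRow).Value) & "', " & _
                nz(ws.Range("B" & lngRow).Value) & "),"
    lngBatch = lngBatch + 1
End If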
