Friday, June 7, 2013

SSIS: How to load multiple excel files into multiple SQL tables?

SQL Server Integration Services (SSIS) is a powerful ETL tool to extract, transform and load data from multiple sources. In this post we will see how we can SSIS to extract/read data from multiple excel files and load it to multiple SQL tables.

To start with let us consider we have three Excel files with different columns and rows of data into them. Our target is to read data from each of the excel file and load them in their respective SQL Table.

Let us create three SQL tables which will store the data from these Excel files. The important point to consider is we will keep the table name same as Excel file name, this is just because we want to keep this example simple and straight forward.

SQL table Products

CREATE TABLE [dbo].[Products](
    [productName] [varchar](100) NULL,
    [quantity] [int] NULL

SQL table Emp

CREATE TABLE [dbo].[emp](
    [empName] [varchar](50) NULL,
    [empAge] [int] NULL

SQL table Dept

CREATE TABLE [dbo].[Dept](
    [DeptCode] [varchar](50) NULL,
    [DeptName] [varchar](50) NULL

Next we are creating a stored procedure which is heart of this solution. In this stored procedure we have written logic to load the data from excel files to SQL tables.

The stored procedure has one parameter @ExcelFileName in which we will pass the full qualified name and location of excel files. The stored procedure uses T-Sql function OPENROWSET to to load the data from excel file to sql table. As you know OPENROWSET function can read data from multiple sources such as excel csv, text files, sql tables, oracle etc.

In the Stored procedure we have extracted the table name from the supplied excel file; this is because we have kept the table name same as Excel file name.

If you have never used OpenRowset before you need to consider two points.

  • This function is not available by default in SQL Server. To enable this function you have to enable the ad hoc distributed queries. You can run the following statement in Query Analyzer to enable this functionality.

EXEC sp_configure 'show advanced options', 1
EXEC sp_configure 'ad hoc distributed queries', 1

In case you are getting error after executing the above statement, I would recommend you to visit this post written by Michael Aspengren to resolve your error message.

  • After you successfully execute this statement, you need to ensure that the SQL Server services are running under Local System account or at the minimum the account that is running your SQL Server services is part of administrator group. If not you can change that and restart your SQL Server services. If your SQL Server is already running under Local System account you can skip this step 2.
    @ExcelFileName nvarchar(100)

DECLARE @stringSQL nvarchar(500)
DECLARE @SQLTableName nvarchar(500)
   Extract Table Name from Excel sheet
    SELECT @SQLTableName=reverse(@ExcelFileName)
    SELECT @SQLTableName= reverse(SUBSTRING(@SQLTableName, CHARINDEX('.',@SQLTableName),((CHARINDEX('\',@SQLTableName))-CHARINDEX('.',@SQLTableName))))
    SELECT @SQLTableName=SUBSTRING(@SQLTableName,1,LEN(@SQLTableName)-1)
    SET @stringSQL=@stringSQL + CHAR(39) + 'Microsoft.Jet.OLEDB.4.0' + CHAR(39) + ',' + CHAR(39) + 'Excel 8.0;Database=' + @ExcelFileName + ';' + 'HDR=YES' + CHAR(39) + ','
    SET @stringSQL=@stringSQL + CHAR(39) + 'SELECT * FROM [Sheet1$]' + CHAR(39) + ')'
    EXECUTE sp_executeSQL @stringSQL


Coming back to SSIS, in the BIDS designer or Visual Studio if you are using SQL Server 2012 we will star with creating a variable of string data type. So, I have created variable strSourceFile. This variable will hold the excel file name and location.

On the Control Flow tab we will start with dragging a ForEach Loop container and an Execute SQL Task inside the container.

We will double click on ForEach Loop task to open the ForEach Loop editor window. We will set the Enumerator as Foreach file enumerator. Next we will select the folder where our excel files exist and we will select Fully Qualified Name from Retrive file name section.

Next we will click on the VariableMapping and select our variable that we defined in SSIS in the above steps.

Next we will double click on Execute SQL Task and open the Execute SQL Task Editor window. We will select ResultSet as None, Connection as our SQL connection manger. We will write EXECUTE StoredprocedureName ? in the SQL Statement. We will select IsQueryStoredProcedure as False.

On the Parameter Mapping section will add one parameter. We will select our variable name that we defined above and keep parameter name and parameter size as 0.

That is all in the SSIS design. Let us move to run our package. I have run the package and it run succefully.

Next I move to see the SQL tables and as expected all records we loaded in the SQL tables.

SQL Table Dept, Emp and Products

So this is how we can load multiple excel files into multiple SQL table. I have shown one way to do this. There could be other ways solving this issue.

I hope the execution concepts are clear. Do let me know if you have any question or comment on this post.

No comments:

Post a Comment

Please feel free to write your Comment here

Popular Posts

Real Time Web Analytics