DataWright Information Services

Consulting and Resources for Excel and Access

Merging lists to create a single unique list

If you've ever had to combine two or more lists in Excel so that you get a unique list from the result, you'll know that it can be a tedious job. A common example is two address lists with the same layout, and overlapping data. This article shows three ways to achieve the final result. They are:

  • You can download a zip file with 3 sample files, here.

Using a key field to get a unique list

To get a unique list using a key field, we need to go through the following steps:

  • Copy both lists to another sheet. That way, you don't affect the original data
  • Create a formula that identifies duplicates. You may need to create a calculated column that combines two or more columns, so that you have a unique key; if so, you will have two calculated columns. The first will be to create the key, and the second will identify duplicates.
  • Filter the list for duplicates, and delete these rows.

Try it out

In the file download is a workbook called MergeLists.xls. Open it up, and you will see that Sheet1 and Sheet2 each have a list of names and addresses. Both lists have the same layout, which simplifies the process for us. There are only 18 names in the two lists, two of which are duplicated, but the process is similar for lists with thousands of names.

The list on Sheet1 looks like this: Starting list on Sheet1

Create a key

Name and address details can be difficult to identify as unique. There can be duplication in any of the name or address fields without duplicating the whole record, so it's best to create a key that uses all of the available fields. In G2 of Sheet1, enter this formula:

=A2&B2&C2&D2&E2&F2

Copy and paste the formula down column G to create a unique key for each address record. Do the same on Sheet2. It isn't pretty but it doesn't need to be. Once you have the combined list you can delete the additional columns.

Check for duplicates

In cell H2 of Sheet2, place this formula:

=COUNTIF(Sheet1!$G:$G,G2)

Fill down -- records in this sheet that do not have a match in Sheet1 will have a 0 in Column H. If there are duplicates, you will get a value higher than 0.

Filter the unique records

Still on Sheet2, do the following:

  • Type Key in G1
  • Type Count in H1
  • Autofilter the list for 0 in Column H.
  • Copy the filtered records across to Sheet1, and paste them below the existing list.
You should now have a list of 16 addresses on Sheet1, with no duplicates. This image shows the final result. The combined list has no duplicated rows

Using a query to extract unique data

The second option is to use the power of a database query to extract the data into a unique list. The SQL language can combine two or more lists into a single, unique list with a UNION query. This approach requires the following steps:

  • If the lists are in different workbooks, copy them to a new workbook so that they are on Sheet1 and Sheet2. This has the advantage that you won't need to change any of the code. If you did create a new workbook, save it.
  • Paste this code into a new module in the workbook.
''================================================
''This code combines two tables on separate sheets
''into a third worksheet in the same workbook.
''Requires: A reference to the Microsoft ActiveX
''Data Objects 2.x Library, where x is a number
''between 1 and 8.
''Assumptions: The two starting lists are on Sheet1
''and Sheet2. The combined list will go on Sheet3.
''You will need to change the sheet references in 
''this code to reflect the names or your worksheets.
''Created: 29 November 2007 by Denis Wright
''================================================

Sub ADO_Merge_Sheets()
  Dim cnn As ADODB.Connection
  Dim rst As ADODB.Recordset
  Dim sSQL As String
  Dim sPath As String
  Dim MyConn

  sPath = ActiveWorkbook.FullName

  'Define the SQL statement that merges the sheets.
  sSQL = "SELECT * FROM [Sheet1$]"
  sSQL = sSQL & " UNION SELECT * FROM [Sheet2$]"

  'Establish connection to the same file
  'When connecting to Excel instead of a database, you need to define
  'the extended properties as Excel 8.0 (The first Excel version to use ADO)

  MyConn = sPath

  Set cnn = New ADODB.Connection
  With cnn
    .Provider = "Microsoft.Jet.OLEDB.4.0"
    .Properties("Extended Properties").Value = "Excel 8.0"
    .Open MyConn
  End With

  'Define a recordset based on the SQL statement

  Set rst = New ADODB.Recordset
  rst.CursorLocation = adUseServer
  rst.Open Source:=sSQL, _
      ActiveConnection:=cnn, _
      CursorType:=adOpenForwardOnly, _
      LockType:=adLockOptimistic, _
      Options:=adCmdText

  Application.ScreenUpdating = False

  'Delete existing data on the destination sheet, then
  'transfer the results of the latest filter, starting at cell A2.
  'When done, clean up references to avoid memory leaks.

  With Sheets("Sheet3") 'Sheet3 is the DESTINATION sheet
    .Range("A1").CurrentRegion.Clear
    .Range("A2").CopyFromRecordset rst
  End With
  rst.Close
  cnn.Close

  'Copy the headings across from Sheet1 and Autofit the final list.
  Sheets("Sheet1").Range("A1").EntireRow.Copy _
  Destination:=Sheets("Sheet3").Range("A1")
  Application.CutCopyMode = False
  Sheets("Sheet3").Range("A1:F1").EntireColumn.AutoFit

  Application.ScreenUpdating = True

End Sub

Try it out

In the sample files provided with this example is a workbook called MergingLists_Code.xls. It already has the lists in Sheet1 and Sheet2, and a button on Sheet3 that runs the code.

The code-based method is much quicker to use than the formula-based one. If you are starting to experiment with ADO code in Excel, getting an understanding of SQL can save you a lot of effort.

How it works

SQL is a language common to all relational databases. If you build a query in Access, and then look behind the scenes, you will find that the graphical layout has created a SQL statement to do the work of selecting, grouping, and sorting your data. ADO is a generic technology that brings SQL capabilities (among other things) to Office applications like Excel. With ADO you can interact with a whole range of database systems, including (in this case) Excel files.

The SQL statement in the code strings together to produce this:

  SELECT * FROM [Sheet1$] UNION SELECT * FROM [Sheet2$]

A SQL SELECT statement requires you to select one or more tables or queries to pull your data from. When you are using a worksheet as the data source, enclosing the sheet name in square brackets (with the $ symbol) causes SQL to treat the worksheet as a table.

The UNION operator will combine 2 or more tables with the same structure into a single list. You need to ensure that:

  • The tables / sheets have the same number of fields (data columns), and
  • Those fields are in the same order in all of the lists.

If the fields are not in the same order you will get an error and the code will fail or (possibly worse) create a garbage result.

Creating a unique list with Advanced Filter

The third approach is also quite easy to use. The steps are:

  • Copy and paste to create a starting list with all the records, including duplicates.
  • Use Advanced Filter to extract the unique items

Try it out

In the sample files, open MergeLists_AdvFilter.xls. Sheet1 contains 16 names, and Sheet2 a further 9 names.

Combine the lists into one list with all records

Copy the lists from Sheet1 and Sheet2 onto Sheet3, so that you have a single list with one set of headings.

Extract the unique records

Place your cursor in the list and then follow the instructions below for your version of Excel.

Excel 2007

On the Ribbon, click the Data tab. The third group of icons is Sort & Filter. In this list, select Advanced Filter.

Earlier versions

Select Data > Filter > Advanced Filter.

All versions

The Advanced Filter dialog will display. The screen shot is for 2007, but the layout for earlier versions of Excel is virtually identical. The Advanced Filter dialog, ready to show unique records only

Make sure that you select Unique Records Only in the bottom left corner of the dialog, then click OK. The list will filter down to show all of the unique records. You can now select the table, and copy the new list to another sheet.

Clear the filter

Once you have copied the unique list somewhere else, return to Sheet3 and undo the filter.

Excel 2007

In the Data tab of the ribbon, just above the Advanced button is Clear. Push this to clear the filter and redisplay all rows.

Earlier versions

Select Data > Filter > Show All to remove the filter.

Wrapping up

This article shows three different approaches for combining two or more lists into one. There is always more than one way to get the job done. These tips should increase your options if you need to tackle a chore like this.