Merging lists to create a single unique list
If you’ve ever had to combine two or more lists in Excel so that
you get a unique list from the result, you’ll know that it can be a
tedious job. A common example is two address lists with the same
layout, and overlapping data. This article shows three ways to achieve
the final result. They are:
- You can download a zip file with 3 sample
files, here.
Using a key field to get a unique list
To get a unique list using a key field, we need to go through the
following steps:
- Copy both lists to another sheet. That way, you don’t affect
the original data - Create a formula that identifies duplicates. You may need to
create a calculated column that combines two or more columns, so
that you have a unique key; if so, you will have two calculated
columns. The first will be to create the key, and the second
will identify duplicates. - Filter the list for duplicates, and delete these rows.
Try it out
In the file download is a workbook called MergeLists.xls. Open it
up, and you will see that Sheet1 and Sheet2 each have a list of
names and addresses. Both lists have the same layout, which
simplifies the process for us. There are only 18 names in the two
lists, two of which are duplicated, but the process is similar for
lists with thousands of names.
The list on Sheet1 looks like this: |
Create a key
Name and address details can be difficult to identify as unique.
There can be duplication in any of the name or address fields
without duplicating the whole record, so it’s best to create a key
that uses all of the available fields. In G2 of Sheet1, enter this
formula:
=A2&B2&C2&D2&E2&F2
Copy and paste the formula down column G to create a unique key
for each address record. Do the same on Sheet2. It isn’t pretty but
it doesn’t need to be. Once you have the combined list you can
delete the additional columns.
Check for duplicates
In cell H2 of Sheet2, place this formula:
=COUNTIF(Sheet1!$G:$G,G2)
Fill down — records in this sheet that do not have a match in
Sheet1 will have a 0 in Column H. If there are duplicates, you will
get a value higher than 0.
Filter the unique records
Still on Sheet2, do the following:
- Type Key in G1
- Type Count in H1
- Autofilter the list for 0 in Column H.
- Copy the filtered records across to Sheet1, and paste them
below the existing list.
You should now have a list of 16 addresses on Sheet1, with no duplicates. This image shows the final result. |
Using a query to extract unique data
The second option is to use the power of a database query to
extract the data into a unique list. The SQL language can combine
two or more lists into a single, unique list with a UNION query.
This approach requires the following steps:
- If the lists are in different workbooks, copy them to a new
workbook so that they are on Sheet1 and Sheet2. This has the
advantage that you won’t need to change any of the code. If you
did create a new workbook, save it. - Paste this code into a new module in the workbook.
''================================================ ''This code combines two tables on separate sheets ''into a third worksheet in the same workbook. ''Requires: A reference to the Microsoft ActiveX ''Data Objects 2.x Library, where x is a number ''between 1 and 8. ''Assumptions: The two starting lists are on Sheet1 ''and Sheet2. The combined list will go on Sheet3. ''You will need to change the sheet references in ''this code to reflect the names or your worksheets. ''Created: 29 November 2007 by Denis Wright ''================================================ Sub ADO_Merge_Sheets() Dim cnn As ADODB.Connection Dim rst As ADODB.Recordset Dim sSQL As String Dim sPath As String Dim MyConn sPath = ActiveWorkbook.FullName 'Define the SQL statement that merges the sheets. sSQL = "SELECT * FROM [Sheet1$]" sSQL = sSQL & " UNION SELECT * FROM [Sheet2$]" 'Establish connection to the same file 'When connecting to Excel instead of a database, you need to define 'the extended properties as Excel 8.0 (The first Excel version to use ADO) MyConn = sPath Set cnn = New ADODB.Connection With cnn .Provider = "Microsoft.Jet.OLEDB.4.0" .Properties("Extended Properties").Value = "Excel 8.0" .Open MyConn End With 'Define a recordset based on the SQL statement Set rst = New ADODB.Recordset rst.CursorLocation = adUseServer rst.Open Source:=sSQL, _ ActiveConnection:=cnn, _ CursorType:=adOpenForwardOnly, _ LockType:=adLockOptimistic, _ Options:=adCmdText Application.ScreenUpdating = False 'Delete existing data on the destination sheet, then 'transfer the results of the latest filter, starting at cell A2. 'When done, clean up references to avoid memory leaks. With Sheets("Sheet3") 'Sheet3 is the DESTINATION sheet .Range("A1").CurrentRegion.Clear .Range("A2").CopyFromRecordset rst End With rst.Close cnn.Close 'Copy the headings across from Sheet1 and Autofit the final list. Sheets("Sheet1").Range("A1").EntireRow.Copy _ Destination:=Sheets("Sheet3").Range("A1") Application.CutCopyMode = False Sheets("Sheet3").Range("A1:F1").EntireColumn.AutoFit Application.ScreenUpdating = True End Sub
Try it out
In the sample files provided with this example is a workbook
called MergingLists_Code.xls. It already has the lists in Sheet1 and
Sheet2, and a button on Sheet3 that runs the code.
The code-based method is much quicker to use than the
formula-based one. If you are starting to experiment with ADO code
in Excel, getting an understanding of SQL can save you a lot of
effort.
How it works
SQL is a language common to all relational databases. If you
build a query in Access, and then look behind the scenes, you will
find that the graphical layout has created a SQL statement to do the
work of selecting, grouping, and sorting your data. ADO is a generic
technology that brings SQL capabilities (among other things) to
Office applications like Excel. With ADO you can interact with a
whole range of database systems, including (in this case) Excel
files.
The SQL statement in the code strings together to produce this:
SELECT * FROM [Sheet1$] UNION SELECT * FROM [Sheet2$]
A SQL SELECT statement requires you to select one or more tables
or queries to pull your data from. When you are using a worksheet as
the data source, enclosing the sheet name in square brackets (with
the $ symbol) causes SQL to treat the worksheet as a table.
The UNION operator will combine 2 or more tables with the
same structure into a single list. You need to ensure that:
- The tables / sheets have the same number of fields (data
columns), and - Those fields are in the same order in all of the lists.
If the fields are not in the same order you will get an error and
the code will fail or (possibly worse) create a garbage result.
Creating a unique list with
Advanced Filter
The third approach is also quite easy to use. The steps are:
- Copy and paste to create a starting list with all the
records, including duplicates. - Use Advanced Filter to extract the unique items
Try it out
In the sample files, open MergeLists_AdvFilter.xls. Sheet1
contains 16 names, and Sheet2 a further 9 names.
Combine the lists into one list with all records
Copy the lists from Sheet1 and Sheet2 onto Sheet3, so that you
have a single list with one set of headings.
Extract the unique records
Place your cursor in the list and then follow the instructions
below for your version of Excel.
Excel 2007
On the Ribbon, click the Data tab. The third group of icons is
Sort & Filter. In this list, select Advanced Filter.
Earlier versions
Select Data > Filter > Advanced Filter.
All versions
The Advanced Filter dialog will display. The screen shot is for 2007, but the layout for earlier versions of Excel is virtually identical. |
|
Make sure that you select Unique Records Only in
the bottom left corner of the dialog, then click OK. The list will
filter down to show all of the unique records. You can now select
the table, and copy the new list to another sheet.
Clear the filter
Once you have copied the unique list somewhere else, return to
Sheet3 and undo the filter.
Excel 2007
In the Data tab of the ribbon, just above the Advanced button is
Clear. Push this to clear the filter and redisplay
all rows.
Earlier versions
Select Data > Filter > Show All to remove the
filter.
Wrapping up
This article shows three different approaches for combining two
or more lists into one. There is always more than one way to get the
job done. These tips should increase your options if you need to
tackle a chore like this.