5 Effortless tricks to handle Duplicates in Excel
This is going to be the most interesting article for SAP consultants who handle duplicates in excel on daily basis. I know that you are thinking about the “Remove Duplicate” function in excel. But this article is far beyond that. There is an additional “Bonus tip” at the end of this article. Many times new SAP consultants lose their valuable time and dynamism in handling huge data and finally fail to provide the necessary information to their clients. This cost their reliability among their team mates and peers. People spend months to understand different functionalities of excel but do not find the right place to learn it. They learn it by looking at someone computer screen. Read this article and improve your knowledge and efficiency in excel. Be independent and innovative at your work place. These tricks will help you in your daily SAP consulting work. Whether you are a technical or functional consultant you must know the application of various excel functions and their limitations while handling duplicate data. There are chances that you already know few of the functions. This article will help you to revise them and consolidate them at a single place. You can add this web page to your favorites for future reference. Please share this article on social network.
Why it is important to know these tricks
Before I start writing about the proven tricks to handle duplicates, you must know why it is important to know these valuable tricks. When we get some transaction data (PO, Sales Order or Accounting data etc.) to upload in system during implementation, cut-over or data migration project, we have to ensure that all the records uploaded successfully. Before that we must know that whether all the relevant master data such as material master, info records, equipment already exist in system or not. If master data is missing then transaction data upload will fail for sure.
Suppose there are 200 Purchase orders and every PO has more than one PO item say total 850 PO items and each PO item has a material. Before uploading PO you must check whether all materials already exist in system or not. Now before we check the materials existence in MARA, MARC tables we should remove the duplicates to get the accurate count of materials in PO items. Once we remove duplicates we can check the materials in MARA and MARC table and check how many materials are not uploaded in system. Then we can take the necessary action accordingly.
[success]Information about duplicates empowers us to provide the “exact statistics” about the data we are using in our excel sheet.[/success]
So finally here are five essential tricks to handle duplicates:
1. Remove Duplicates: Remove duplicates is a new, innovative and neat function introduced by Microsoft in MS excel version 2007. This function removes duplicates instantly without any second thought. I use this function daily, almost in every MS excel sheet. In the early versions of MS office this function was not provided explicitly. I know that you are thinking that even a school going kid knows this function in MS excel. Then why I have included it here and now you want to close this web page…..
There are certain limitations associated with this function and it is essential to know them:
[plain] Limitations in “Remove Duplicate” function:
a. You cannot show the duplicates value explicitly in the data because this function removes duplicates, it does not separate out duplicates. If your client wants to know which data are duplicates, then this function cannot help.
b. If you want to prepare the list of unique values without disturbing the original data then this function cannot help because it removes the duplicates from original data.
c. If you want to count the number of duplicates for “each record” then this function cannot help.[/plain]
Before we go to other important excel functions please have a look on the use of Remove Duplicate function in excel in below screen shots.
In this way you can handle Duplicates in excel using “Remove Duplicates” function.
2. Conditional formatting: If you want to show the duplicate values in the excel sheet to you client before removing them then Conditional formatting is used in excel sheet. Many time you want to see the duplicate values in a excel sheet in such situations you can make use of this function. First select the desired column and sort the data. Then go to conditional formatting button in the option a shown in the figure then go to Highlight Cell rules then Duplicate values. Afterwards you can choose the color of your choice.
[success]Conditional Formatting → Highlight Cell Rules → Duplicate values → Select color to highlight Duplicate values[/success]
This function is very powerful tool to make a good looking excel sheet where duplicate values are highlighted explicitly.
But this function also have some limitations.
[plain] Limitations with Conditional Formatting Function:
a. If you want to prepare the list of unique values without disturbing the original data then this function cannot help because it highlight the duplicates in original data it does not remove them or does not take unique values to another place. (Read next point on Advance filter).
b. If you want to count the number of each duplicates in the data then this function cannot help. (Read point 5 of this article) [/plain]
In this way you can handle Duplicates in excel using “Conditional Formatting” function.
3. Advance Filter: Advance filter in excel is used to address the limitation of disturbing the original data. We can use the Advance filter in excel to remove the duplicate data without disturbing the original data set and copy the unique values somewhere else in the same sheet. Please refer below screen shots to understand the functionality. Here we cannot highlight the duplicate values but we can bring the unique values in the data set without disturbing the original data. Please see the step by step process with screen shot to understand advance filter in excel.
Here we have two limitations:
[plain]Limitation with Advance filter:
a. We cannot count the number of each duplicate value individually.
b. We cannot get the proper statistics for overall duplicate values, which we can get in “Remove duplicates” function.[/plain]
In this way you can handle Duplicates in excel using “Advance filter” function.
4. Pivot table– Pivot table in excel is an amazing function with unlimited functionalities. One of the functionality is that it can count the number of same values in a column. Please see below step by step process to understand the function of pivot table and how it counts the duplicate values.
In this way you can handle Duplicates in excel using “Pivot table” function.
5. Exact Function: EXACT function can also be used to identify duplicates, however, this function can only identify duplicates. It is an instant function one can use to check duplicity. To use this formula first we have to sort the required column and we compare the values please refer below step by step process to learn this function. EXACT is case-sensitive but ignores formatting differences. Use EXACT to test text being entered into a document.
In this way you can handle Duplicates in excel using “EXACT” function.
6. Using ‘If’ formula: If formula works like “EXACT” formula. In this formula also we sort the desired column and compare the values in ascending order vertically. If the values are same then the values are duplicate or else different. This is an old methodology and we used it when we did not had “Remove duplicate” function in MS excel.
In this way you can handle Duplicates in excel using “IF” function.