# CS614 Assignment No. 03 Solution & Discussion (Due Date: Feb 16, 2015)

Consider the applicant table below:

Applicant_Info

Question:

Apply all three steps of the Basic Sorted Neighborhood (BSN) method to find the duplicate records in the table. Records will be considered duplicates if the value of the “Applicant_id” column is the same in these records.

Use the following rules for the key:

Key:

Key will consist of first three characters from “Applicant_id”, then first three characters from “Applicant_Name” and then first two characters from “father_Name” column.
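The key rule above can be sketched in Python as follows. The function name `make_key` is hypothetical, and lowercasing the name parts and separating the three parts with spaces are assumptions chosen only to match the keys posted in the replies (e.g. `CS2 uma ik`); the assignment text does not specify either.

```python
def make_key(applicant_id: str, applicant_name: str, father_name: str) -> str:
    """Build a BSN key: first 3 chars of the ID, first 3 of the
    applicant name, and first 2 of the father's name."""
    id_part = applicant_id[:3]
    # Assumption: name parts are lowercased so "Atif" and "atif" give the same key.
    name_part = applicant_name[:3].lower()
    father_part = father_name[:2].lower()
    # Assumption: parts are joined with single spaces, as in the posted keys.
    return " ".join([id_part, name_part, father_part])


print(make_key("CS2025", "umair", "ikram"))
print(make_key("EN2028", "Atif", "sohail"))
```

Slicing is safe even when a value is shorter than the requested length, so short names simply contribute fewer characters to the key.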

The BSN method comprises the three steps given below:

a) Create key

In step-1, you will create the key for each record according to the rules mentioned above. For this, you can add an extra column at the end of the table to show the new key created for each record.

b) Sort the data

In step-2, you will sort the records on the basis of the key created in step-1.
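A minimal sketch of the sorting step, assuming each record is reduced to an `(applicant_id, key)` pair and that keys are compared as plain strings. The pairs are taken from the keys posted in the replies; the variable names are illustrative only.

```python
# (Applicant_id, key) pairs in their original, unsorted order.
records = [
    ("CS2025", "CS2 uma ik"),
    ("MG2026", "MG2 was ma"),
    ("MH2027", "MH2 fai ja"),
    ("MG2026", "MG2 was ma"),
    ("EN2028", "EN2 ati so"),
    ("MH2027", "MH2 fai ja"),
    ("CS2029", "CS2 tar al"),
]

# Sort on the key only; Python's sort is stable, so records with
# equal keys keep their original relative order.
sorted_records = sorted(records, key=lambda rec: rec[1])

for applicant_id, key in sorted_records:
    print(applicant_id, key)
```

Under plain string comparison, `CS2 tar al` sorts before `CS2 uma ik` because `t` precedes `u`, and records with identical keys end up next to each other.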

c) Merge

In step-3, consider the window size (w) equal to two (2). You are required to identify the similar records on the basis of sorted key.
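The merge step can be sketched as a sliding window over the sorted records. This is only an illustration under stated assumptions: records are `(applicant_id, key)` pairs already sorted by key, the function name `find_duplicates` is hypothetical, and two records in the same window are flagged as duplicates when their `Applicant_id` values match, as the question requires.

```python
def find_duplicates(sorted_records, w=2):
    """Slide a window of size w over the sorted records and return the
    Applicant_id of every record that matches an earlier record in its window."""
    duplicate_ids = []
    for i, (applicant_id, _key) in enumerate(sorted_records):
        # Compare record i with the (w - 1) records just before it.
        for j in range(max(0, i - (w - 1)), i):
            if sorted_records[j][0] == applicant_id:
                duplicate_ids.append(applicant_id)
                break
    return duplicate_ids


# Records sorted by key (keys as posted in the replies).
sorted_records = [
    ("CS2029", "CS2 tar al"),
    ("CS2025", "CS2 uma ik"),
    ("EN2028", "EN2 ati so"),
    ("MG2026", "MG2 was ma"),
    ("MG2026", "MG2 was ma"),
    ("MH2027", "MH2 fai ja"),
    ("MH2027", "MH2 fai ja"),
]

print(find_duplicates(sorted_records, w=2))
```

With w = 2 each record is compared only with its immediate predecessor, so a duplicate is found only when the sort has placed the two matching records next to each other.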


### Replies to This Discussion

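Step # 1 Create Key

| Applicant-ID | Applicant-Name | Father-Name | Qualification | Address | Key |
|---|---|---|---|---|---|
| CS2025 | umair | ikram | PHD | Islamabad | CS2 uma ik |
| MG2026 | waseem | maqsood | MBA | Faisalabad | MG2 was ma |
| MH2027 | faizan | jawad | MSc | Lahore | MH2 fai ja |
| MG2026 | waseem | maqsood | MBA | Faisalabad | MG2 was ma |
| EN2028 | Atif | sohail | BS | Multan | EN2 ati so |
| MH2027 | faizan | jawad | MSc | Lahore | MH2 fai ja |
| CS2029 | Tariq | Ali | MSc | Karachi | CS2 tar al |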

Step # 2 Sorted Key

| Applicant-ID | Applicant-Name | Father-Name | Qualification | Address | Sorted Key | Duplicate Record |
|---|---|---|---|---|---|---|
| CS2029 | tariq | ali | MSc | Karachi | CS2 tar al | |
| CS2025 | umair | ikram | PHD | Islamabad | CS2 uma ik | |
| EN2028 | atif | sohail | BS | Multan | EN2 ati so | |
| MG2026 | waseem | maqsood | MBA | Faisalabad | MG2 was ma | MG2 was ma |
| MG2026 | waseem | maqsood | MBA | Faisalabad | MG2 was ma | MG2 was ma |
| MH2027 | faizan | jawad | MSc | Lahore | MH2 fai ja | MH2 fai ja |
| MH2027 | faizan | jawad | MSc | Lahore | MH2 fai ja | MH2 fai ja |

Please discuss the 3rd step also.

