Archive for October, 2015

Sort a file on two fields and get unique record of the most recent date.

Example data:-
104,Teradata,10/21/2015
103,SQL Server 2008,10/21/2015
101,Netezza 2010,10/20/2015
102,Oracle,10/21/2015
101,Netezza 2015,10/25/2015
103,SQL Server 2000,10/18/2015

Desired output:-
104,Teradata,10/21/2015
103,SQL Server 2008,10/21/2015
102,Oracle,10/21/2015
101,Netezza 2015,10/25/2015

Script to achieve the above requirement:-

#-----As our exmaple file don''t have header add one as below----------#
 $Headers = 'A','B','C'  #One letter for each column!
 $FileLocation = ".\ASSET_POST_DT.txt" #Change to your location!
 #$FileFilter = "all*.csv"     #Change to your filename filter
$FileList = Get-ChildItem $FileLocation #-Filter $FileFilter
ForEach ($FSpec in $FileList) {
 #--- Load File as CSV and add Headers to allow Sorting ---
 $x = import-csv -Path $FSpec.FullName -Delimiter ',' -Header $Headers
#----Sort the data on the columns A and B and remove duplicates (This works when #to sort and to remove duplicate are on the same columns)
 $x | select A,B,C | sort-object -Property A,B -Unique | Export-csv -Path $FSpec.FullName -Delimiter ',' -Force -NoTypeInformation
#----Sort the data on the columns A and B and remove duplicates (this works when #to sort and to remove duplicate are on different columns)
 $x | Group-Object A | foreach-object { $_.group | sort-object C | select -last 1} | Export-csv -Path $FSpec.FullName -Delimiter ',' -Force -NoTypeInformation
# Reload File as Text to delete Headers and quotes around values
  # that were added by the CSV processing!
 $a = (get-content $FSpec.FullName)
 $a = $a[1..($a.count - 1)]           #--- Get Rid of Header ---
 $Final = $a.replace('"',$null)       #--- Get Rid of " ---
 Set-Content $FSpec.FullName $Final
 }