Sort a file on two fields and get unique record of the most recent date.
Example data:-
104,Teradata,10/21/2015
103,SQL Server 2008,10/21/2015
101,Netezza 2010,10/20/2015
102,Oracle,10/21/2015
101,Netezza 2015,10/25/2015
103,SQL Server 2000,10/18/2015
Desired output:-
104,Teradata,10/21/2015
103,SQL Server 2008,10/21/2015
102,Oracle,10/21/2015
101,Netezza 2015,10/25/2015
Script to achieve the above requirement:-
#-----As our exmaple file don''t have header add one as below----------# $Headers = 'A','B','C' #One letter for each column! $FileLocation = ".\ASSET_POST_DT.txt" #Change to your location! #$FileFilter = "all*.csv" #Change to your filename filter
$FileList = Get-ChildItem $FileLocation #-Filter $FileFilter
ForEach ($FSpec in $FileList) { #--- Load File as CSV and add Headers to allow Sorting --- $x = import-csv -Path $FSpec.FullName -Delimiter ',' -Header $Headers
#----Sort the data on the columns A and B and remove duplicates (This works when #to sort and to remove duplicate are on the same columns) $x | select A,B,C | sort-object -Property A,B -Unique | Export-csv -Path $FSpec.FullName -Delimiter ',' -Force -NoTypeInformation
#----Sort the data on the columns A and B and remove duplicates (this works when #to sort and to remove duplicate are on different columns) $x | Group-Object A | foreach-object { $_.group | sort-object C | select -last 1} | Export-csv -Path $FSpec.FullName -Delimiter ',' -Force -NoTypeInformation
# Reload File as Text to delete Headers and quotes around values # that were added by the CSV processing! $a = (get-content $FSpec.FullName) $a = $a[1..($a.count - 1)] #--- Get Rid of Header --- $Final = $a.replace('"',$null) #--- Get Rid of " --- Set-Content $FSpec.FullName $Final }
Advertisements