I am trying to understand clustering algorithms and how to use them to get information about outliers...
I am trying to get this information from my audit logs, which would have information like
Now, a few of the outliers which I want to be able to report are:
-- One user making multiple attempts within x amount of time (let's say 10 times in 30 minutes)
-- The same user ID coming from different IPs
-- The amount of time spent being more than x hours
-- Start times that fall outside peak hours...
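
To make the four rules above concrete, here is a minimal sketch of how they could be checked directly, before any clustering is involved. Everything in it is an assumption for illustration: the record layout (user_id, ip, session_id, start, end), the function names, the 8-hour duration limit and the 08:00-18:00 peak window are hypothetical and would need to match the real audit log.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumed record layout (hypothetical): (user_id, ip, session_id, start, end)

def burst_users(sessions, max_attempts=10, window=timedelta(minutes=30)):
    """Users with at least max_attempts session starts inside any one window."""
    starts = defaultdict(list)
    for user, _ip, _sid, start, _end in sessions:
        starts[user].append(start)
    flagged = set()
    for user, times in starts.items():
        times.sort()
        lo = 0
        for hi, t in enumerate(times):
            # Shrink the window from the left until it spans at most `window`.
            while t - times[lo] > window:
                lo += 1
            if hi - lo + 1 >= max_attempts:
                flagged.add(user)
                break
    return flagged

def multi_ip_users(sessions):
    """Users whose sessions come from more than one IP address."""
    ips = defaultdict(set)
    for user, ip, _sid, _start, _end in sessions:
        ips[user].add(ip)
    return {user for user, seen in ips.items() if len(seen) > 1}

def long_sessions(sessions, max_duration=timedelta(hours=8)):
    """Session IDs lasting longer than max_duration (8 hours is an assumed x)."""
    return {sid for _u, _ip, sid, start, end in sessions
            if end - start > max_duration}

def off_peak_sessions(sessions, peak_start=8, peak_end=18):
    """Session IDs that start outside the assumed peak hours."""
    return {sid for _u, _ip, sid, start, _end in sessions
            if not (peak_start <= start.hour < peak_end)}
```

Fixed rules like these only catch what the thresholds describe; they could serve as a baseline to compare against whatever a clustering approach reports.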
My questions are: in order to develop the clusters, how would I need to present the data in DSV?
Do I need to aggregate the data? A user can have more than one session, so will each one always show up as a different session?
Do I need to create some threshold value to predict outliers on ......?
I just want to detect anomalies in the pattern of the data ...?
How would I present the data for the same user who logged in 10 times in 30 minutes and has 10 different session IDs but is coming from the same IP address?
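
As one possible way to picture the aggregation question, the sketch below rolls individual sessions up into one row per user per 30-minute bucket and writes the result as pipe-delimited text. It is only an illustration under assumptions: the record layout (user_id, ip, session_id, start, end), the column names, the bucket size and the "|" delimiter are all hypothetical, and other feature choices may suit a clustering algorithm better.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime, timedelta

def aggregate(sessions, bucket=timedelta(minutes=30)):
    """Return one row per (user, time bucket) instead of one row per session."""
    groups = defaultdict(list)
    for user, ip, sid, start, end in sessions:
        # Floor the start time to the beginning of its bucket.
        offset = (start - datetime.min) % bucket
        groups[(user, start - offset)].append((ip, sid, end - start))
    rows = []
    for (user, bucket_start), items in sorted(groups.items()):
        rows.append({
            "user_id": user,
            "bucket_start": bucket_start.isoformat(),
            "session_count": len({sid for _ip, sid, _d in items}),
            "distinct_ips": len({ip for ip, _sid, _d in items}),
            "total_minutes": sum(d.total_seconds() for _ip, _sid, d in items) / 60,
        })
    return rows

def to_dsv(rows, delimiter="|"):
    """Serialize the aggregated rows as delimiter-separated text with a header."""
    if not rows:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0]),
                            delimiter=delimiter, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

With this layout, a user who logged in 10 times in 30 minutes from one IP would appear as a single row with session_count 10 and distinct_ips 1 (assuming all 10 starts fall in the same bucket), rather than as 10 unrelated session rows.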
Could someone point me in the right direction?