
Querying an IEnumerable for objects with like attributes and within a certain time threshold

开发者 https://www.devze.com 2023-03-13 18:11 Source: web
I have an IEnumerable full of objects that we use to represent user actions. The ultimate goal is to display a list of the most recent actions taken in the system. This list can get rather long, and the users have requested that it cover a 24-hour period. I want to perform some "squashing" on this list, somewhat like what Facebook does for likes and comments. For example, instead of listing all 37 updates a specific user performed, I can list that user x updated 37 y.

These objects have the username and the datetime of the action as attributes, so that information is easy enough to select. I need some help with the best way to programmatically determine what should be squashed. Ideally, if for example 1000+ people are updated in our system in less than 10 minutes by the same user, then it's an import and not a manual edit, and I will remove those entries from the list of actions and replace them with "so and so ran an import".

How would I query an IEnumerable for the objects with the same username and within a specific date range?

Edit: The only thing I am able to initially think of is iterating over the Enumerable for each possible user and for each possible 10 minute time period. That just sounds horribly inefficient though, and I'm clearly just ignorant of the options available.


If you can use LINQ, you can do a GroupBy on the user name, which groups all items by the user's name. Then you just have to pull out two lists of data based on your desired time threshold.

Let's say you have a list of objects like this:

void Main()
{
    // Only actions executed after this point in time are kept.
    DateTime threshold = DateTime.Now.AddMinutes(-10);

    // Sample data: the first action is older than the threshold.
    IEnumerable<UserAction> unfilteredActions = new List<UserAction>
    {
        new UserAction { Action = "INSERT", UserName = "Craig", ExecutedOn = DateTime.Now.AddMinutes(-15) },
        new UserAction { Action = "UPDATE", UserName = "Craig", ExecutedOn = DateTime.Now },
        new UserAction { Action = "DELETE", UserName = "James", ExecutedOn = DateTime.Now }
    };

    // Drop the old actions first, then group what is left by user name.
    var userActions = unfilteredActions
        .Where(action => action.ExecutedOn > threshold)
        .GroupBy(k => k.UserName);
}

public class UserAction
{
    public string Action {get;set;}
    public string UserName {get;set;}
    public DateTime ExecutedOn {get;set;}
}
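
To turn the grouped result into "user x performed N actions" lines, each group can be projected to its key and count. The following is only a minimal sketch under that assumption: the ActionSummary class and its Summarize method are hypothetical names introduced for illustration, and UserAction is repeated so the block stands on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class UserAction
{
    public string Action { get; set; }
    public string UserName { get; set; }
    public DateTime ExecutedOn { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class ActionSummary
{
    // Returns one summary line per user for actions newer than the threshold.
    public static List<string> Summarize(IEnumerable<UserAction> actions, DateTime threshold)
    {
        return actions
            .Where(a => a.ExecutedOn > threshold)   // keep recent actions only
            .GroupBy(a => a.UserName)               // one group per user
            .Select(g => string.Format("{0} performed {1} action(s)", g.Key, g.Count()))
            .ToList();
    }
}
```

Calling ActionSummary.Summarize(unfilteredActions, threshold) on the sample data above would yield one line for Craig and one for James, each counting a single recent action.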

You could also work with the grouping like this:

foreach (var grp in unfilteredActions.GroupBy(k => k.UserName))
{
    foreach(UserAction action in grp.Where(a => a.ExecutedOn > threshold))
    {
        Console.WriteLine("{0} {1} {2}", action.UserName, action.Action, action.ExecutedOn);
    }
}
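
The question also asks about actions by the same user that fall within the same 10 minute period. One way to approximate that, sketched here under assumptions, is to group on a composite key of user name and a fixed time bucket. The SquashedAction and ActionSquasher names, the window parameter and the importThreshold parameter are all hypothetical. Fixed buckets are a simplification: a burst that straddles a bucket boundary is split in two.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class UserAction
{
    public string Action { get; set; }
    public string UserName { get; set; }
    public DateTime ExecutedOn { get; set; }
}

// Hypothetical result type: one row per user per time window.
public class SquashedAction
{
    public string UserName { get; set; }
    public DateTime WindowStart { get; set; }
    public int Count { get; set; }
    public bool LooksLikeImport { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class ActionSquasher
{
    public static List<SquashedAction> Squash(
        IEnumerable<UserAction> actions, TimeSpan window, int importThreshold)
    {
        return actions
            // Integer division of ticks assigns each action to a fixed window.
            .GroupBy(a => new { a.UserName, Bucket = a.ExecutedOn.Ticks / window.Ticks })
            .Select(g => new SquashedAction
            {
                UserName = g.Key.UserName,
                WindowStart = new DateTime(g.Key.Bucket * window.Ticks),
                Count = g.Count(),
                // Many actions by one user in one window is treated as an import.
                LooksLikeImport = g.Count() >= importThreshold
            })
            .OrderByDescending(s => s.WindowStart) // most recent windows first
            .ToList();
    }
}
```

With the numbers from the question, this would be called as ActionSquasher.Squash(actions, TimeSpan.FromMinutes(10), 1000). It makes a single pass over the data instead of iterating once per user and per time period.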


As it turns out, I was approaching this problem incorrectly. After attempting to query the dataset in different ways using LINQ, I realized that this was an AI problem. I was trying to identify groups of data within a large dataset on a per-user and per-time basis.

This is a clustering problem. I have written and published a library to perform k-means clustering on objects in an IEnumerable. The process goes a little something like this:

var clusters = SharpLearning.Clustering.KCluster(k, iterations, listOfIClusterableObjects);

foreach (var cluster in clusters) {
    // Process some data.
    // clusters is a List<Cluster<T>> where your objects can be viewed in the .Members attribute
}

The Cluster class (which contains two distance algorithms), the IClusterable interface and the KCluster algorithm are all provided in the C# Machine Learning Library.
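
For readers unfamiliar with k-means, here is a minimal, self-contained sketch of the idea applied to the timestamps of a single user's actions. It is not the library's implementation or API: the TimeKMeans class and its Cluster method are hypothetical, it works in one dimension only, and it starts from evenly spaced centroids rather than random ones so that the result is repeatable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration; not the API of the library mentioned above.
public static class TimeKMeans
{
    // Plain one-dimensional k-means (Lloyd's algorithm) over timestamps.
    public static List<List<DateTime>> Cluster(IList<DateTime> times, int k, int iterations)
    {
        if (times.Count == 0 || k <= 0) return new List<List<DateTime>>();

        // Work with offsets from the earliest timestamp to keep the numbers small.
        long origin = times.Min(t => t.Ticks);
        double[] points = times.Select(t => (double)(t.Ticks - origin)).ToArray();
        double max = points.Max();

        // Deterministic start: spread the k centroids evenly over the observed range.
        double[] centroids = Enumerable.Range(0, k)
            .Select(i => max * (i + 0.5) / k)
            .ToArray();
        int[] assignment = new int[points.Length];

        for (int iter = 0; iter < iterations; iter++)
        {
            // Assignment step: attach each point to its nearest centroid.
            for (int p = 0; p < points.Length; p++)
            {
                int best = 0;
                for (int c = 1; c < k; c++)
                {
                    if (Math.Abs(points[p] - centroids[c]) < Math.Abs(points[p] - centroids[best]))
                        best = c;
                }
                assignment[p] = best;
            }

            // Update step: move each centroid to the mean of its members.
            for (int c = 0; c < k; c++)
            {
                double[] members = points.Where((value, p) => assignment[p] == c).ToArray();
                if (members.Length > 0) centroids[c] = members.Average();
            }
        }

        // Collect the original timestamps per cluster, skipping empty clusters.
        return Enumerable.Range(0, k)
            .Select(c => times.Where((value, p) => assignment[p] == c).ToList())
            .Where(cluster => cluster.Count > 0)
            .ToList();
    }
}
```

A cluster with a very large member count over a short time span would then be a candidate for the "so and so ran an import" summary. Choosing k is the hard part, since the number of bursts is not known in advance.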
