## LINQ – IQueryable, Mathematical Methods, and .Take (Top N)

Last Friday, I discussed some of the transformation methods like Distinct, GroupBy, and Counting using LINQ. Today, let’s take a quick look at four of the mathematical functions that are pretty common when querying: Sum, Average, Min, and Max. We’ll also do a bit of logic processing using Contains.

In our example dataset, we have an Employees table with Name, Email, and most importantly for this example, Salary. Salary is a Nullable double. For those not familiar with the ORM of LINQ-to-SQL, a nullable type is used for any column/type that has the “Can be null” attribute within the database. It can be overridden in the .designer.cs file or using the GUI, if necessary, though I’d recommend just changing the data source if possible (ala: thinking ahead and not doing this months down the road with pre-existing data).

EmployeesDataContext db = new EmployeesDataContext();

First things first, we open up a data context connection.

Now, at this point, we could query Sum the salary information with the following code:

Console.WriteLine(“Sum of Salaries : {0}”,

db.Employees.Select(i => i.Salary).Sum());

However, if we plan to use that salary information (query) again, we have to keep repeating the Select statement—a DRY (don’t repeat yourself) violation.

We can create an IQueryable object, however, to store our query and apply additional queries on.

**IQueryable**<double?> salary = db.Employees.Select(i => i.Salary);

or, if we’re lazy, we** could** take advantage of the new ‘var’ and use that

var salary = db.Employees.Select(i => i.Salary);

Those are interchangeable. If you hover over the var, it displays that it’s an IQueryable type of Employee. I suppose it’s a coding preference and standardization. Whichever you pick, just stick to it.

__Mathematical Methods__

From here, we can continue our various methods. Each of these return a value based on your reference type. Since we’re using a double? (for precise salary information) that’s what we get back.

For **Sum**:

Console.WriteLine(“Sum of Salaries : {0}”,

salary.Sum());

For **Average**, we can also do a bit of selecting. For example, returning the employees with a salary greater or equal to the average salary.

Console.WriteLine(“Avg of Salaries : {0}”,

salary.Average());

var average = db.Employees.Average(i => i.Salary);

var wellPaidEmployees = db.Employees.Where(i => i.Salary >= average);

For **min** and **max**, quick methods to return the values.

Console.WriteLine(“Min of Salaries : {0}”,

salary.Min());

Console.WriteLine(“Max of Salaries : {0}”,

salary.Max());

Finally, a bit of applied logic using the .**Contains**, which we’re used to from our Generics.

Console.WriteLine(@”Results contain values >= 100000?

If so, how many? : {0} / {1}”,

salary.Contains(new double?(100000)),

salary.Count(i => i >= 100000));

I used a hardcoded Nullable double for the 100000, a constant would work just as well (better, actually).

A final note regarding mathematical methods, if you put the code against the debugger, you’ll see that you can’t use the LINQ-to-SQL visualizer on it. From what I understand, that’s a good thing because these methods are being applied to the var “salary” that we created—it’s not requerying each time and we’re saving trips to the database! You can get a feel for the calculations by placing the methods into your own vars (rather than inline computation.

var sum = salary.Sum();

var average = salary.Average();

var min = salary.Min();

var max = salary.Max();

__.Take() — TopN’s__

The last part of the equation is TopN, a very common database query routine. As far as I’ve found, there isn’t directly a “TopN” or “BottomN” method, but you can recreate it combining .OrderBy/.OrderByDescending and .Take(N).

For example, if we want to get the Top 2 of our entries (the two highest salaries), we’d want to order our results decending… and then “take” the first two. Makes sense.

var topTwoPaid = db.Employees.OrderByDescending(i => i.Salary).Take(2);

Take generates an IQueryable object, or var, that can be iterated through. If anyone’s found an “official” TopN method, lemme know. To get the Bottom N, you’d use OrderBy rather than OrderByDescending.

To verify, if we look at the generated tSQL:

SELECT **TOP 2** [t0].[EmployeeId], [t0].[Name],

[t0].[EmailAddress], [t0].[Salary]

FROM [dbo].[Employees] AS [t0]

ORDER BY [t0].[Salary] DESC

There we go!

Nice article about linq, it explains a lot. I’m wondering however, how you can select more than one mathematical expression from a single linqquery f.example a sum and an average

@Valdimar-

Once you have a LINQ query created, such as (using the example above):

var salary = db.Employees.Select(i => i.Salary);

Now, the salary object (a new .NET 3.5 var) contains the logic to select employee salaries. We have, at this point, simply told it we want:

SELECT Salary FROM Employees

Due to the delayed queries of LINQ, we haven’t actually queried anything or touched our database at this point–only when we actually NEED the data.

From there, LINQ implements numerous extension methods, such as .Sum and .Average.

Therefore, we can take our ‘salary’ object and call those extensions:

var sum = salary.Sum();

LINQ then generates the approprate SQL code, for a sum in this case, and populates the ‘sum’ object.

var average = salary.Average();

For this example, we’re still interested in just the salaries, but we’re averaging–.Average is another extension method that LINQ is familiar with and can output for us.

In this example, our LINQ expression simply specfies the data set which we wish to work with–not the actual mathematical methods.

HTH.

-dl

wow that was fast.

Thanks, that explains a lot