LINQ (Language Integrated Query) allows programmers to query different data sources like SQL databases, XML files, and object collections without knowing the query language of each data source. LINQ provides standard query syntax and methods that work across all data sources through LINQ providers. The main types of LINQ are LINQ to Objects for querying in-memory collections, LINQ to SQL for querying databases, and LINQ to XML for querying XML files. LINQ uses deferred execution, meaning the query is not executed until the results are iterated over or accessed. This provides efficiency by not executing the query until its results are needed.
o Query Expressions
o Using the Method Syntax
o Deferred Execution
o The from Clause
o The select Clause
o The Select Method
o The SelectMany Method
o The where Clause
o The orderby Clause
o The let Clause
o The group-by Clause
o Joining Data Sources
o The join Clause - Doing an Inner Join
o The join Clause - Doing a Group Join
o The join Clause - Doing a Left Outer Join
o More LINQ Examples
o LINQ Aggregate Methods
o LINQ to SQL
o Querying a Database with LINQ to SQL
o Modifying Database with LINQ to SQL
o LINQ to XML
o Creating an XML Document Using LINQ to XML

Language Integrated Query (LINQ)

Language Integrated Query (LINQ) was introduced in .NET Framework 3.5 to allow a programmer to query data from many kinds of data sources without knowing any external query language. Querying is the process of obtaining data from a data source. LINQ is integrated into both C# and VB, and special keywords and syntax for querying with LINQ were added to both languages.

Before the arrival of LINQ, programmers wrote a different kind of code for each kind of data source. For example, they had to write an SQL command to query an SQL database and use XPath to query XML files. With LINQ in the programmer's arsenal, querying different data sources requires only knowledge of the LINQ keywords and methods that were added in .NET 3.5.
Figure 1 - Different Flavors of LINQ

There are multiple flavors of LINQ. This is made possible by LINQ providers, as seen in Figure 1. The .NET Framework already includes some of these providers, such as LINQ to Objects. This section of the site will focus on LINQ to Objects, which is used to query a collection of objects in your code that implements the IEnumerable<T> interface. Examples of such objects are arrays, lists, or a custom collection that you created. There is also LINQ to SQL, which is specifically designed to make it easier to query SQL Server databases. For querying XML files, you can use LINQ to XML. You can also extend LINQ by creating your own provider if you want to support querying another type of data source. The querying techniques that will be taught in the following lessons can be applied to the different flavors of LINQ.

LINQ is made possible by the extension methods that are attached to the IEnumerable<T> interface. You can call these methods directly, but you need to know lambda expressions to do so. You can also use query expressions, whose syntax looks like SQL. Query expressions are the main tool you will use to query data using LINQ, although you can always call the extension methods with lambda expressions instead.

C# is an imperative language, which means that you write the step-by-step code that makes something happen. LINQ promotes declarative programming instead: you describe the result you want and leave the details of how to produce it to LINQ. Before LINQ, you could only use imperative programming to query results. For example, suppose that you want to get all the even numbers from an array. Without LINQ and using the imperative style of programming, your code will look like this:

    List<int> evenNumbers = new List<int>();
    int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    foreach (int num in numbers)
    {
        if (num % 2 == 0)
            evenNumbers.Add(num);
    }

Figure 2 - Using Imperative Style of Programming to Query Values

You instruct the computer to go through every value in a collection and keep each number that matches the condition in an if statement. Now take a look at the declarative version that uses the query expression syntax.

    int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

    var evenNumbers = from n in numbers
                      where n % 2 == 0
                      select n;

Figure 3 - Using a Query Expression to Query Values

Don't mind the new syntax yet, as it will be described in later lessons. You can see that the declarative version states your objective much more clearly.
Query Expressions

Query expressions are special statements used for querying a data source with LINQ. Underneath, LINQ is a set of extension methods that you call to get back the data you want. These methods are located in the System.Linq namespace, so you must import it when you want to use LINQ in your project. The compiler translates each query expression into the equivalent method calls. You will learn about using the method syntax for querying data in the next lesson.

Let's take a look at a first example of using a LINQ query expression to query values from a collection. Since we are using LINQ to Objects, a simple array will serve as the source.
    using System;
    using System.Linq;

    namespace LinqExample   // namespace name is illustrative
    {
        class Program
        {
            static void Main()
            {
                int[] numbers = { 1, 2, 3, 4, 5 };

                var result = from n in numbers
                             select n;

                foreach (var n in result)
                {
                    Console.Write(n + " ");
                }
            }
        }
    }

Example 1

    1 2 3 4 5

The using System.Linq directive imports the System.Linq namespace so that we can use LINQ in our program. The numbers declaration creates an array of 5 integers. The statement that assigns to result is the simplest query expression you can make. It is not very useful yet, but we will be studying more forms of query expressions. This query expression simply gets every number from the numbers array, which can then be accessed through the result variable. The structure of a basic query expression is as follows:

    var query = from rangeVar in dataSource
                <other operations>
                select <projection>;

Example 2 - Structure of a basic query expression

Note that you can write a query expression on a single line, but a good practice is to put each clause on its own line. Each line of the formatted query expression above is called a clause. There are seven types of clauses you can use in a query expression: the from, select, where, orderby, let, join, and group-by clauses. For now, we will only be looking at the from and select clauses.

Query expressions begin with a from clause. The from clause declares a range variable (rangeVar) which will temporarily hold a value from the dataSource, followed by the in contextual keyword and then the data source itself. This can be compared to the mechanism of the foreach loop, where an iteration variable holds each value retrieved from the source. The range variable in a from clause is different, though, as it only acts as a reference to each successive element of the data source. This is because of the deferred execution mechanism, which you will learn about later. The type of the range variable is inferred automatically from the type of the elements in the data source.

After a from clause, you can insert one or more where, orderby, let, or join clauses. You can even add more from clauses, which will be demonstrated in a later lesson.
At the end of a query expression is a select clause. Following the select keyword is a projection which determines the shape or type of each returned element. For example, if the value following the select keyword is of type int, then the result of the query will be a sequence of integers. You can also perform more kinds of projection and transformation, which will be demonstrated in a later lesson. Note that a query expression can also end with a group-by clause, but a separate lesson is dedicated to it.

So to wrap up, a typical query expression starts with a from clause with a range variable and a data source, followed by any number of from, where, orderby, let, or join clauses, and finally a select or group-by clause at the end.

If you know SQL, then the syntax of the query expression might look weird to you because the from clause is placed first and the select clause is placed last. This was done so that Visual Studio can offer IntelliSense by knowing in advance what type an item of the data source is.

The keywords used in a query expression, such as from and select, are examples of contextual keywords. They are only treated as keywords in specific places such as a query expression. For example, you can use the word select as a variable name as long as it is not inside a query expression. The complete list of contextual keywords is given in the C# language reference.

The result of the query is of type IEnumerable<T>. If you look at our example, the result was placed in a variable declared with var, which means the compiler uses type inference to detect the type of the queried data. You can also indicate the type of the query result explicitly, like this:

    IEnumerable<int> result = from n in numbers
                              select n;

but this requires you to know the type of the result in advance. Using var is usually more convenient, and it is required when the query projects an anonymous type, which has no name that you could write.
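As a small illustration of the two points above, the sketch below uses select as an ordinary variable name outside a query and declares a query result with an explicit IEnumerable<int> type. The class and method names (ContextualKeywordDemo, SelectAll) are made up for this example and are not part of the lesson's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContextualKeywordDemo
{
    // Returns every number unchanged. The explicit IEnumerable<int> type
    // is the same type that 'var' would have inferred for this query.
    public static IEnumerable<int> SelectAll(int[] numbers)
    {
        IEnumerable<int> result = from n in numbers
                                  select n;
        return result;
    }

    public static void Main()
    {
        // Outside a query expression, 'select' is an ordinary identifier.
        int select = 10;
        Console.WriteLine(select);

        foreach (int n in SelectAll(new[] { 1, 2, 3 }))
        {
            Console.Write(n + " ");
        }
    }
}
```

Here the word select is a keyword only inside SelectAll, where it appears within a query expression.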
The foreach loop at the end of Example 1 displays every value from the query. Note that we used var as the type of the iteration variable, which lets the compiler detect the type of every item in the query result.

LINQ has a feature called deferred execution. It means that the query expression or LINQ method will not execute until the program starts to read or access an item from the result of the query. The query expression actually just returns a computation. The actual sequence of data is retrieved only once the program asks for it. For example, the query expression is executed when you access the results using a foreach loop. You will learn more about deferred execution a little bit later.

We have successfully written our very first query expression, but as you can see, it does nothing except retrieve every item from the data source. Later lessons will show you more techniques such as filtering, ordering, joining, and grouping results. We will look at each of the seven query expression clauses in more depth.
Using the Method Syntax

LINQ is composed of extension methods that are attached to the IEnumerable<T> interface. These methods exist in the System.Linq namespace and are members of the Enumerable static class. If you recall, extension methods are special kinds of methods that are used to extend pre-existing types from the .NET class library or a class whose source code you don't have access to. For example, you can add a ToTitleCase() method to the System.String type, which you can then call on ordinary strings to change them to title case.

Let's examine the Select<TSource, TResult>() method from the System.Linq namespace. Looking at its declaration, you can see that it is attached to the IEnumerable<T> interface.

    public static IEnumerable<TResult> Select<TSource, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, TResult> selector)

The first parameter of an extension method determines which type to extend. It is preceded by the this keyword, followed by the type to extend and an instance name. You can also see that the return type of this method is IEnumerable<TResult>. This allows you to chain method calls, as you will see later.

I said in an earlier lesson that calling the LINQ methods directly requires you to use lambda expressions. Although using anonymous methods is okay, lambda expressions are much simpler and shorter and make your code more readable. I therefore assume that you have a good knowledge of lambda expressions before going on with this lesson. You will now be presented with another way to query data using LINQ, the method syntax, which simply means calling the LINQ methods directly.

The .NET Framework contains delegate types that can hold methods with different numbers of parameters and different return types. Looking at the definition of the Select() method, the second parameter is a generic delegate of type Func<TSource, TResult>.
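To make the extension-method mechanism concrete, here is a minimal sketch of the ToTitleCase() method mentioned above, together with a Func<string, int> delegate holding a lambda. The StringExtensions and ExtensionDemo class names are assumptions for this example, and the implementation simply delegates to TextInfo.ToTitleCase from System.Globalization, which is one possible way to write it.

```csharp
using System;
using System.Globalization;

public static class StringExtensions
{
    // 'this string value' makes the method callable as value.ToTitleCase().
    public static string ToTitleCase(this string value)
    {
        TextInfo textInfo = CultureInfo.InvariantCulture.TextInfo;
        // Lower-case first: TextInfo.ToTitleCase leaves all-uppercase words unchanged.
        return textInfo.ToTitleCase(value.ToLowerInvariant());
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        // Called like an instance method even though it is declared static.
        Console.WriteLine("language integrated query".ToTitleCase());

        // Func<string, int> holds any method that takes a string and returns an int.
        Func<string, int> length = s => s.Length;
        Console.WriteLine(length("LINQ"));
    }
}
```

The LINQ methods follow the same pattern: they are static methods of the Enumerable class whose first parameter is marked with this, and most of their remaining parameters are Func delegates.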
The delegate will be able to accept a method that has one parameter of type TSource and a return type of TResult. For example, Func<string, int> will be able to accept a method that has one string parameter and returns an int. Figure 1 shows some of the delegates you can use depending on the number of parameters of the method it will hold.

    Delegate                          Description
    Func<T, TResult>                  Encapsulates a method that has one parameter of type T and returns a value of type TResult.
    Func<T1, T2, TResult>             Encapsulates a method that has two parameters of types T1 and T2 and returns a value of type TResult.
    Func<T1, T2, T3, TResult>         Encapsulates a method that has three parameters and returns a value of type TResult.
    Func<T1, T2, T3, T4, TResult>     Encapsulates a method that has four parameters and returns a value of type TResult.

Figure 1

Based on the table, you can already see the pattern and guess the delegates for methods with more parameters. The largest Func delegate available accepts 16 parameters.

Going back to the Select() method, its second parameter accepts a reference to a method that has one parameter and returns a value. Let's look at how we can use the Select() method by passing a lambda expression as its argument.

    int[] numbers = { 1, 2, 3, 4, 5 };

    var result = numbers.Select(n => n);

    foreach (var n in result)
    {
        Console.Write(n + " ");
    }

Example 2

    1 2 3 4 5

Note that when you call an extension method on an instance, you do not pass its first parameter yourself; it only indicates which type is being extended. Therefore the second parameter, which accepts a lambda expression, becomes the only argument of the Select() call. Inside the Select() method, we used a lambda expression that accepts one integer parameter and returns an integer value. Again, if you don't know lambda expressions, then it might look weird to you. The lambda expression receives every number and returns it, and the returned value becomes part of the query result. The code merely queries every number without modifying it. Let's modify the lambda expression inside the Select() method to do something more useful.

    int[] numbers = { 1, 2, 3, 4, 5 };

    var result = numbers.Select(n => n + 1);

    foreach (var n in result)
    {
        Console.Write(n + " ");
    }

Example 3

    2 3 4 5 6

Our lambda expression now receives a value from the numbers array, adds 1 to it, and returns the new value to the query result.

Most methods in System.Linq return IEnumerable<T>. This allows you to chain calls of LINQ methods. For example, consider the line of code below (ignore the new methods for now). The important thing is that you can see how LINQ method calls can be chained.

    var result = numbers.Where(n => n > 3).OrderBy(n => n).Select(n => n);

This lesson only shows how to use the Select() method, but there are numerous LINQ methods that we will be looking at in the upcoming lessons. You will also see how to create more exciting results using the Select() method, but for now, the important thing is that you know how to use the method syntax for querying data sources using LINQ.

The method syntax is what the compiler actually works with. The query expression syntax is just a layer that simplifies calling these extension methods. For example, consider the query expression below:

    var result = from p in persons
                 where p.Age >= 18
                 orderby p.Name
                 select p.FirstName + " " + p.LastName;

It is translated at compile time into a series of calls to the corresponding methods from the System.Linq namespace. The actual code uses the method syntax as shown below:

    var result = persons.Where(p => p.Age >= 18)
                        .OrderBy(p => p.Name)
                        .Select(p => p.FirstName + " " + p.LastName);

You can see the chained method calls (I only aligned them for better readability). This is possible because most of the LINQ methods return IEnumerable<T>. After calling the Where() method, we used the OrderBy() method on the returned data, and then we used the Select() method on the data returned by OrderBy().

Now that you know two ways to query data using LINQ, query expressions and method syntax, the question is which one you should use.
I recommend using query expressions because they promote a declarative style of programming and are simpler and easier to read than method syntax. But it is also important that you know the method syntax, because that is what the compiler translates your query expressions into.
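The equivalence of the two syntaxes can be checked with a small program. This is only a sketch: the Person class below is a hypothetical stand-in with FirstName, LastName, and Age properties, and it orders by LastName rather than the Name property used in the text so that the class stays small.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class used only for this illustration.
public class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

public static class SyntaxComparison
{
    public static IEnumerable<string> WithQueryExpression(IEnumerable<Person> persons)
    {
        return from p in persons
               where p.Age >= 18
               orderby p.LastName
               select p.FirstName + " " + p.LastName;
    }

    public static IEnumerable<string> WithMethodSyntax(IEnumerable<Person> persons)
    {
        return persons.Where(p => p.Age >= 18)
                      .OrderBy(p => p.LastName)
                      .Select(p => p.FirstName + " " + p.LastName);
    }

    public static void Main()
    {
        var persons = new List<Person>
        {
            new Person { FirstName = "John", LastName = "Smith", Age = 30 },
            new Person { FirstName = "Mark", LastName = "Chandler", Age = 17 },
            new Person { FirstName = "Eric", LastName = "Watts", Age = 25 }
        };

        // Prints the adults ordered by last name: John Smith, then Eric Watts.
        foreach (string name in WithQueryExpression(persons))
        {
            Console.WriteLine(name);
        }

        // Both forms yield the same sequence, so this prints True.
        Console.WriteLine(
            WithQueryExpression(persons).SequenceEqual(WithMethodSyntax(persons)));
    }
}
```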
Deferred Execution

There is one thing that you must know before we continue to the next lessons: LINQ uses deferred execution when performing queries. This simply means that the actual query is not executed until you iterate over or access the values of the result. Consider the following example:
int[] numbers = { 1, 2, 3, 4, 5 };
int i = 3;
var result = from n in numbers where n <= i select n;
i = 4;
foreach (var n in result)
{
    Console.WriteLine(n);
}

Example 1

1
2
3
4

We haven't looked at the where operator up to now, so I will give a brief description of it for the sake of describing deferred execution. The where operator is used to filter the results of a query based on a specified condition. In the code above, the where clause instructs the query to only retrieve values which are less than or equal to whatever the value of i is. We will look at the where operator in more detail in a later lesson, but for now, let's concentrate on understanding deferred execution.

You can see that 3 is assigned to i when it is declared. You might therefore expect that when the query expression is reached, the where operator compares each item to 3 and includes it in the result if it is less than or equal to 3. You might also assume that the result variable now contains the values 1, 2, and 3, but actually it only contains a computation that describes how to get those values.

After the query expression, the value of i is changed to 4. Because of deferred execution, this directly affects the result of the query expression. As I said earlier, the query is only executed once the program asks for the first item of the result, and this can be done using a foreach loop. The foreach loop executes the query expression and retrieves its items one by one. Since we modified the value of i before that happened, the query uses the new value, and the foreach loop shows the values from 1 to 4 instead of 1 to 3.

This is one of the uses of deferred execution. The program is allowed to change the values a query depends on before actually executing it. Deferred execution is just the default behavior of LINQ, and you can also make the execution of the query immediate.
You can use another set of methods from the System.Linq namespace, which consists of ToArray(), ToList(), ToDictionary(), and ToLookup(). These are also extension methods for the IEnumerable<T> interface. For example, you can use the ToList() method to convert the result of the query expression into a List<T> collection. Example 2 shows how we can use ToList() to immediately execute the query expression.
int[] numbers = { 1, 2, 3, 4, 5 };
int i = 3;
var result = (from n in numbers where n <= i select n).ToList();
i = 4;
foreach (var n in result)
{
    Console.WriteLine(n);
}

Example 2

1
2
3

Since the query expression was immediately executed, the result variable already contains the results of the query as soon as the statement that calls ToList() completes. So even if we change the value of i afterwards, the contents of result do not change. This can be seen by iterating over each value using a foreach loop. The values shown will be 1 to 3 instead of 1 to 4.

Please note that deferred execution is not associated with the query expression itself but with the set of methods that the query expression represents. It is the extension methods of the System.Linq namespace that determine whether the query will use deferred execution or not. For example, the Select() method uses deferred execution while the ToList() method uses immediate execution. The query below uses the method syntax and the Select() method to query everything from an array.

    var result = numbers.Select(n => n);

Since the Select() method uses deferred execution, this will not execute until a foreach loop iterates over the result. On the other hand, if you call a method that uses immediate execution right after the Select() method, then the whole chain executes immediately.

    var result = numbers.Select(n => n).ToList();

Whichever method is called last determines whether deferred execution will be used. The following table shows the methods of System.Linq and tells whether calling them results in deferred or immediate execution.
    Method              Execution   Method              Execution
    Distinct            Deferred    Reverse             Deferred
    DefaultIfEmpty      Deferred    Select              Deferred
    ElementAt           Immediate   SelectMany          Deferred
    ElementAtOrDefault  Immediate   SequenceEqual       Immediate
    Except              Deferred    Single              Immediate
    First               Immediate   SingleOrDefault     Immediate
    FirstOrDefault      Immediate   Skip                Deferred
    GroupBy             Deferred    SkipWhile           Deferred
    GroupJoin           Deferred    Sum                 Immediate
    Intersect           Deferred    Take                Deferred
    Join                Deferred    TakeWhile           Deferred
    Last                Immediate   ThenBy              Deferred
    LastOrDefault       Immediate   ThenByDescending    Deferred
    LongCount           Immediate   ToArray             Immediate
    Max                 Immediate   ToDictionary        Immediate
    Min                 Immediate   ToList              Immediate
    OfType              Deferred    ToLookup            Immediate
    OrderBy             Deferred    Union               Deferred
    OrderByDescending   Deferred    Where               Deferred

Figure 1 - Methods Using Deferred Execution or Immediate Execution

If the table above is too hard to remember, then you can just remember one rule for determining whether a query is deferred. If a query is expected to return a single result, such as a single number or object, then it will be immediately executed. When a query returns an IEnumerable<T> result, then most of the time it will use deferred execution. One technique to see if a query returns an IEnumerable<T> result is to hover over the var keyword of the query expression while you are in the Visual Studio editor. Look at whether the help bubble shows that it is an IEnumerable<T> type. If so, then the query will use deferred execution.
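One way to observe the difference, offered here only as a sketch, is to count how often the lambda passed to Select() has run at different points. The ExecutionTimingDemo class and the CountSelectorCalls method are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExecutionTimingDemo
{
    // Returns how many times the selector had run at three checkpoints.
    public static int[] CountSelectorCalls()
    {
        int[] numbers = { 1, 2, 3 };
        int calls = 0;

        // Select() alone is deferred: the lambda has not run at this point.
        IEnumerable<int> deferred = numbers.Select(n => { calls++; return n * 2; });
        int afterSelect = calls;

        // ToList() is the last call in the chain, so the chain runs immediately.
        List<int> doubled = numbers.Select(n => { calls++; return n * 2; }).ToList();
        int afterToList = calls;

        // Sum() returns a single value, so it enumerates 'deferred' right away.
        int total = deferred.Sum();
        int afterSum = calls;

        Console.WriteLine("Items: {0}, total: {1}", doubled.Count, total);
        return new[] { afterSelect, afterToList, afterSum };
    }

    public static void Main()
    {
        // Prints "Items: 3, total: 12" followed by "0, 3, 6".
        Console.WriteLine(string.Join(", ", CountSelectorCalls()));
    }
}
```

The counter stays at 0 after the plain Select() call and only increases once ToList() or Sum() forces the sequence to be enumerated, which matches the rule of thumb above.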
The from Clause

The from clause defines the data source from which you will get data. It is responsible for iterating over each item of the data source. The from clause starts a query expression so that the rest of the query expression knows what kind of data it is working with. Because of this, Visual Studio can offer IntelliSense after the from clause, since it now knows the type of every item in the data source. The following shows the basic structure of a from clause.

    from dataType rangeVar in dataSource

It starts with the from keyword, which is a contextual keyword in C#. We then declare a range variable, which is an iteration variable that will hold every value from the data source. You can specify the data type of the range variable, which has a specific use as we will see later. The in keyword comes next, followed by the actual data source, which implements the IEnumerable or IEnumerable<T> interface. In a typical query, each value from the data source is placed in the range variable in turn. You can then perform further operations on this variable, such as testing whether it meets a certain condition. Again, remember deferred execution: LINQ will not execute the query until you request data from the result, by using a foreach loop or by retrieving a single item.

Here's an example of a from clause:

    from int number in numbers

The from clause above instructs the computer to retrieve every number in the numbers collection (assuming that the numbers collection is an array or another IEnumerable object). Note the data type that follows the from keyword. Most of the time, it is totally fine to omit it if the data source is strongly typed.

    from number in numbers

The compiler can infer the type of the range variable from the type of the collection. For example, if the collection is an IEnumerable<Person>, then the range variable will be of type Person.
But if you are querying a non-generic collection of objects such as an ArrayList, you need to explicitly indicate the type of every item in the data source. Consider the following example:

    ArrayList persons = new ArrayList();
    persons.Add(new Person("John", "Smith"));
    persons.Add(new Person("Mark", "Chandler"));
    persons.Add(new Person("Eric", "Watts"));

    var query = from p in persons
                select p;

You will receive the following error at compile time:

    Could not find an implementation of the query pattern for source type 'System.Collections.ArrayList'. 'Select' not found. Consider explicitly specifying the type of the range variable 'p'.

The compiler demands that you explicitly indicate the type of each item in the data source, since an ArrayList can contain objects of different types. To fix this, simply indicate the type that you are expecting for every item in the collection:

    from Person p in persons

The query then casts every item from the ArrayList to a Person object. If an item from the ArrayList cannot be cast to the type indicated in the query, then an InvalidCastException will be thrown.

The from clause was strategically placed at the beginning of a query expression so that Visual Studio will immediately know the type of data you are working with. As Figure 1 shows, right after the from clause you can already enjoy the benefits of the IntelliSense feature of Visual Studio.
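The behavior of an explicitly typed range variable can be sketched as follows. The Person class here is a minimal assumed version with a two-argument constructor like the one used above, and the OfType<Person>() call at the end is an alternative that is not covered in this lesson: it skips items of other types instead of throwing.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Minimal assumed Person class for this illustration.
public class Person
{
    public string FirstName { get; private set; }
    public string LastName { get; private set; }

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }
}

public static class ArrayListQueryDemo
{
    // The explicit type makes the compiler insert a cast to Person for every item.
    public static IEnumerable<string> FirstNames(ArrayList persons)
    {
        return from Person p in persons
               select p.FirstName;
    }

    public static void Main()
    {
        ArrayList persons = new ArrayList();
        persons.Add(new Person("John", "Smith"));
        persons.Add(new Person("Mark", "Chandler"));

        Console.WriteLine(string.Join(", ", FirstNames(persons))); // John, Mark

        // An item of another type makes the cast fail when the query is enumerated.
        persons.Add("not a person");
        try
        {
            Console.WriteLine(FirstNames(persons).Count());
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }

        // OfType<Person>() filters by type, so the string item is ignored.
        var safeNames = persons.OfType<Person>().Select(p => p.FirstName);
        Console.WriteLine(string.Join(", ", safeNames)); // John, Mark
    }
}
```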
Figure 1

Joining Multiple Tables Using Multiple from Clauses

If you have two data sources, you can combine them by using multiple from clauses. Consider this simple code that combines two sets of numbers.

    int[] numberSet1 = { 1, 2, 3, 4, 5 };
    int[] numberSet2 = { 6, 7, 8, 9, 10 };

    var query = from n1 in numberSet1
                from n2 in numberSet2
                select String.Format("({0},{1})", n1, n2);

    foreach (var n in query)
    {
        Console.WriteLine(n);
    }

Example 1

    (1,6)
    (1,7)
    (1,8)
    (1,9)
    (1,10)
    (2,6)
    (2,7)
    (2,8)
    (2,9)
    (2,10)
    (3,6)
    (3,7)
    (3,8)
    (3,9)
    (3,10)
    (4,6)
    (4,7)
    (4,8)
    (4,9)
    (4,10)
    (5,6)
    (5,7)
    (5,8)
    (5,9)
    (5,10)

We used two from clauses which retrieve data from two separate data sources. We then format the result in the select clause. As you can see in the output, every possible combination was added to the query result. This type of combination is called a cross join, because every item of the first source is paired with every item of the second. You can also use the join clause, which will be tackled in a separate lesson.

Note that a second or third from clause does not necessarily need to immediately follow the first from clause. You can create a query like this:

    from n1 in numberSet1
    where n1 > 2
    from n2 in numberSet2
    select String.Format("({0},{1})", n1, n2);

A single from clause does not have an equivalent query method in System.Linq. You can simply use the Select() and other methods. But for multiple from clauses, you can use the SelectMany() method, which will have its own dedicated lesson.

Creating Relationships on Classes

Multiple from clauses can also be used effectively for related classes, but such classes must be properly structured first. For example, a customer can have multiple orders, which can be considered a one-to-many relationship (one customer to many orders). Let's take a look at how we can properly define our Customer and Order classes.

    class Customer
    {
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public List<Order> Orders { get; set; }
    }

Example 2

The Customer class has only three properties to keep things simple. The third property defines its relationship with the Order class. It is a List of Order objects which represents the orders made by the customer. The Order class is shown in Example 3.

    class Order
    {
        public string Name { get; set; }
        public int Quantity { get; set; }
    }

Example 3

Now let's create some Customer objects and assign some orders to their Orders property.
    using System;
    using System.Collections.Generic;
    using System.Linq;

    class Program
    {
        static void Main(string[] args)
        {
            List<Customer> customers = new List<Customer>();

            Customer customer1 = new Customer()
            {
                FirstName = "John",
                LastName = "Smith",
                Orders = new List<Order>()
                {
                    new Order() { Name = "Pizza", Quantity = 5 },
                    new Order() { Name = "Chicken", Quantity = 3 },
                    new Order() { Name = "Salad", Quantity = 7 }
                }
            };

            Customer customer2 = new Customer()
            {
                FirstName = "Allan",
                LastName = "York",
                Orders = new List<Order>()
                {
                    new Order() { Name = "Orange Juice", Quantity = 2 },
                    new Order() { Name = "Soda", Quantity = 4 }
                }
            };

            Customer customer3 = new Customer()
            {
                FirstName = "Henry",
                LastName = "Helms",
                Orders = new List<Order>()
                {
                    new Order() { Name = "Spaghetti", Quantity = 7 },
                    new Order() { Name = "Nachos", Quantity = 13 }
                }
            };

            // Add the customers to the list that the query reads from.
            customers.Add(customer1);
            customers.Add(customer2);
            customers.Add(customer3);

            var results = from c in customers
                          from o in c.Orders
                          select new { c.FirstName, c.LastName, o.Name, o.Quantity };

            foreach (var result in results)
            {
                Console.WriteLine("Name: {0} {1}, Order: {2}, Quantity: {3}",
                    result.FirstName, result.LastName, result.Name, result.Quantity);
            }
        }
    }

Example 4

    Name: John Smith, Order: Pizza, Quantity: 5
    Name: John Smith, Order: Chicken, Quantity: 3
    Name: John Smith, Order: Salad, Quantity: 7
    Name: Allan York, Order: Orange Juice, Quantity: 2
    Name: Allan York, Order: Soda, Quantity: 4
    Name: Henry Helms, Order: Spaghetti, Quantity: 7
    Name: Henry Helms, Order: Nachos, Quantity: 13

As you can see in the output, each customer's orders are shown. We can organize the output better by using the group-by clause, which will be discussed later. In the first part of Example 4, we created three Customer objects, each having its Orders property initialized with some Order objects. We then added the three customers to the customers list.

The query expression uses two from clauses. The first from clause retrieves a Customer object from the customers list. The second from clause retrieves an Order from the Orders property of the retrieved Customer. The select clause uses projection to include the FirstName and LastName properties of the customer, and the Name and Quantity properties of the order. The foreach loop at the end prints the results.

Designing your classes like this is great because it makes it very simple to determine the relationships between objects and to reach the data you need. Database tables often have these kinds of relationships. Visual Studio has tools that can automatically generate classes that map to these tables, with their relationships to other tables exposed as properties. You will learn more about this when we reach LINQ to SQL.
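As a preview of the method-syntax counterpart of multiple from clauses mentioned earlier in this lesson, the sketch below writes the same query once with two from clauses and once with SelectMany(). The MultipleFromDemo class and its method names are invented for this example, and the SelectMany() overload shown (a collection selector plus a result selector) is only one of several ways to express the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultipleFromDemo
{
    public static IEnumerable<string> WithQueryExpression(int[] set1, int[] set2)
    {
        return from n1 in set1
               where n1 > 2
               from n2 in set2
               select string.Format("({0},{1})", n1, n2);
    }

    public static IEnumerable<string> WithSelectMany(int[] set1, int[] set2)
    {
        // The first lambda picks the inner sequence for each n1;
        // the second lambda builds one result per (n1, n2) pair.
        return set1.Where(n1 => n1 > 2)
                   .SelectMany(n1 => set2,
                               (n1, n2) => string.Format("({0},{1})", n1, n2));
    }

    public static void Main()
    {
        int[] numberSet1 = { 1, 2, 3 };
        int[] numberSet2 = { 6, 7 };

        // Only 3 passes the filter, so this prints (3,6) and (3,7).
        foreach (string pair in WithSelectMany(numberSet1, numberSet2))
        {
            Console.WriteLine(pair);
        }
    }
}
```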
The select Clause

The select clause in LINQ shapes the data to be included in the query result using projection. Projection is the process of transforming an object into a new form. The select clause of a LINQ query must be placed at the end of the query. You can manipulate the currently queried value in many ways, retrieve certain properties of it, or create an anonymous type based on the queried object's properties.

A typical LINQ query returns a collection that implements the IEnumerable<T> interface, where T is the resulting data type of the expression in the select clause. For example, if the select clause retrieves every person in a Person collection, then the returned collection implements the IEnumerable<Person> interface. If the select clause only selects the FirstName of every person, and the FirstName property is of type string, then the returned result implements the IEnumerable<string> interface. You can then use a foreach loop to start retrieving the results yielded by the query expression.

You can simply select the whole queried object by using the value of the range variable. This selects the queried object without any modifications.

int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var results = from number in numbers select number;
foreach (var n in results)
{
    Console.Write("{0, -2}", n);
}

1 2 3 4 5 6 7 8 9 10

You can use any expression that yields a result during selection. For example, you can query every number in an array of numbers and increment each of them by one.

int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var results = from number in numbers select number + 1;
foreach (var n in results)
{
    Console.Write("{0, -2}", n);
}

2 3 4 5 6 7 8 9 10 11

You can also select individual properties or a combination of properties. For example, you can select only the FirstName of every Person object from a List of Person objects.

List<Person> people = new List<Person>
{
    new Person("Johnny", "Smith"),
    new Person("Mark", "Lawrence"),
    new Person("Jessica", "Fisher"),
    new Person("Danny", "Mayer"),
    new Person("Raynold", "Alfonzo")
};
var firstNames = from person in people select person.FirstName;

Johnny
Mark
Jessica
Danny
Raynold

You can even combine different properties, or other external data, into the final value. For example, you can combine the FirstName property and the LastName property to select every person's full name.

var fullNames = from person in people select person.FirstName + " " + person.LastName;

Johnny Smith
Mark Lawrence
Jessica Fisher
Danny Mayer
Raynold Alfonzo

The above select clause selects the person's FirstName, followed by a space, and then his or her LastName.

You can use methods to manipulate the final value to be selected. For example, suppose you want to select every FirstName and convert it to all caps.

var namesInCaps = from person in people select person.FirstName.ToUpper();

JOHNNY
MARK
JESSICA
DANNY
RAYNOLD

You can even select values that are not part of the data source being queried. For example, we can have a collection of indices and select a value from another collection or array.

int[] indices = { 0, 1, 2, 3, 4 };
string[] words = { "Example1", "Example2", "Example3", "Example4", "Example5" };
var result = from index in indices select words[index];

Example1
Example2
Example3
Example4
Example5

We can create anonymous types during selection and include only certain properties of each object. For example, suppose we have a Person object that has FirstName, LastName, Age, and Gender properties, and we only want to retrieve the FirstName and LastName. We can create a new anonymous type in the select clause which only includes the FirstName and LastName properties.

var anonymous = from person in people select new { person.FirstName, person.LastName };
foreach (var p in anonymous)
{
    Console.WriteLine(p.FirstName + " " + p.LastName);
}

The type of the selected result is an anonymous type having the properties included in the select clause. You can even create new properties based on the properties of the queried object.

var results = from person in people select new { FN = person.FirstName, LN = person.LastName };
foreach (var p in results)
{
    Console.WriteLine(p.FN + " " + p.LN);
}

Johnny Smith
Mark Lawrence
Jessica Fisher
Danny Mayer
Raynold Alfonzo

The select clause of the query expression above assigns the values of the FirstName and LastName properties to new properties with different names. The types of these new properties are determined by the compiler, through type inference, from the values being assigned to them. These new properties become members of the created anonymous type.

var results = from person in people select new { Upper = person.FirstName.ToUpper(), Lower = person.FirstName.ToLower() };
foreach (var p in results)
{
    Console.WriteLine("Upper={0}, Lower={1}", p.Upper, p.Lower);
}

Upper=JOHNNY, Lower=johnny
Upper=MARK, Lower=mark
Upper=JESSICA, Lower=jessica
Upper=DANNY, Lower=danny
Upper=RAYNOLD, Lower=raynold

The select clause above assigns the FirstName converted to uppercase to a new property named Upper, and its lowercase version to a new property named Lower. Both are included in an anonymous type. We then used these new properties inside a foreach loop.

var results = from person in people select new { FullName = person.FirstName + " " + person.LastName };
foreach (var p in results)
{
    Console.WriteLine(p.FullName);
}

Johnny Smith
Mark Lawrence
Jessica Fisher
Danny Mayer
Raynold Alfonzo

The above query's select clause creates a new property named FullName and assigns it the combined value of each person's FirstName and LastName. The resulting anonymous type has a FullName property which holds the full name of the person.

var results = from person in people select new { person.Age, FullName = person.FirstName + " " + person.LastName };
foreach (var p in results)
{
    Console.WriteLine("FullName={0}, Age={1}", p.FullName, p.Age);
}

FullName=Johnny Smith, Age=22
FullName=Mark Lawrence, Age=24
FullName=Jessica Fisher, Age=19
FullName=Danny Mayer, Age=21
FullName=Raynold Alfonzo, Age=25

The select clause of the preceding query creates an anonymous type that combines an original property of each person (Age) with a new property, FullName, which contains the combined first name and last name of each person.
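The projections described in this section can be put together in one small program. The following is only a sketch: the Person class shown here (with FirstName, LastName, and Age properties) and the SelectDemo class are assumptions made for illustration, and are not necessarily the definitions used elsewhere in this lesson.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person class, assumed only for this sketch.
class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

static class SelectDemo
{
    // Sample data assumed for this sketch.
    public static List<Person> People()
    {
        return new List<Person>
        {
            new Person { FirstName = "Johnny", LastName = "Smith", Age = 22 },
            new Person { FirstName = "Mark", LastName = "Lawrence", Age = 24 }
        };
    }

    // Selecting a string expression yields an IEnumerable<string>.
    public static IEnumerable<string> FullNames(IEnumerable<Person> people)
    {
        return from person in people
               select person.FirstName + " " + person.LastName;
    }

    public static void Main()
    {
        List<Person> people = People();

        foreach (string name in FullNames(people))
            Console.WriteLine(name);

        // Projecting into an anonymous type; var is needed because the type has no name.
        var summaries = from person in people
                        select new { person.Age, Upper = person.FirstName.ToUpper() };

        foreach (var s in summaries)
            Console.WriteLine("Upper={0}, Age={1}", s.Upper, s.Age);
    }
}
```

Note that FullNames can declare its return type as IEnumerable<string> because the select expression is a string, while the anonymous-type query can only be stored in a var.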
The Select Method

The query method which is the direct equivalent of the select clause is the Select method. The Select method allows you to project the results of the query. We use it by passing a selector function, which can be supplied using a lambda expression. As a simple example, the following call retrieves every item in a collection. Assume that the people variable is a collection of Person objects.

var result = people.Select(p => p);

A lambda expression is passed as the selector for the method. The lambda expression has a single parameter p which holds each Person object in the collection in turn, and the right side of the lambda expression specifies what will be selected, in this case, the Person object itself.

The following are more varieties of using the Select method with different lambda expressions:

1. Select the FirstName of every person from people.

var firstNames = people.Select(p => p.FirstName);

2. Select the combined FirstName and LastName of each person to form their full name.

var fullNames = people.Select(p => p.FirstName + " " + p.LastName);

3. Create an anonymous type that contains a new property FullName which is the combined FirstName and LastName of each person.

var anonymous = people.Select(p => new { FullName = p.FirstName + " " + p.LastName });

Another overload of the Select method accepts a lambda expression which has two parameters. The first one is the queried element, and the second one is the index of that element in the data source.

var personsWithIndex = people.Select((p, i) => new { Person = p, Index = i });

The following is its equivalent LINQ query, assuming people is a List<Person> that contains no duplicate entries (IndexOf returns the position of the first match):

var personsWithIndex = from p in people select new { Person = p, Index = people.IndexOf(p) };

You can then print the values of each person including their index.

foreach (var p in personsWithIndex)
{
    Console.WriteLine(String.Format("[{0}] {1}", p.Index, p.Person.FirstName));
}

[0] John
[1] Mark
[2] Lisa

You can use the LINQ query syntax for cleaner, more readable code, or you can call the LINQ methods directly using the method syntax if you prefer to use lambda expressions.
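The difference between the indexed overload of Select and a query that calls IndexOf shows up when the source contains duplicates. The following sketch uses a list of strings as assumed sample data; the IndexedSelectDemo class and its Label method are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class IndexedSelectDemo
{
    // Pairs each name with its position using the (element, index) overload of Select.
    public static List<string> Label(IEnumerable<string> names)
    {
        return names.Select((name, i) => String.Format("[{0}] {1}", i, name)).ToList();
    }

    public static void Main()
    {
        // Sample data assumed for this sketch; note the duplicate entry.
        var names = new List<string> { "John", "Mark", "John" };

        // Prints [0] John, [1] Mark, [2] John.
        foreach (string line in Label(names))
            Console.WriteLine(line);

        // IndexOf returns the position of the first match,
        // so the second "John" is reported at index 0 rather than 2.
        var viaIndexOf = from n in names
                         select new { Name = n, Index = names.IndexOf(n) };

        foreach (var item in viaIndexOf)
            Console.WriteLine("[{0}] {1}", item.Index, item.Name);
    }
}
```

The indexed overload is also the cheaper choice, since it does not search the list again for every element.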
The SelectMany Method

When objects contain properties that hold a collection of objects, the Select method alone is not enough, because it returns a "collection of collections" rather than all the items from those collection properties. We need to use the SelectMany method. The functionality of the SelectMany method is similar to a query expression with multiple from clauses.

Let's say we have a class named Company, and it has a property named Employees which holds a collection of Employee objects. A collection of Company objects can then be created. By using the Select method, we can only query each company as seen below:

var result = companies.Select(c => c);

What if we want to retrieve all of the employees of every company and flatten the result into one list of items? Let's take a look at the following example:
using System;
using System.Collections.Generic;
using System.Linq;

class Company
{
    public string CompanyName { get; set; }
    public IEnumerable<Employee> Employees { get; set; }
}

class Employee
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

class Program
{
    public static void Main(string[] args)
    {
        //Create some sample companies with employees
        List<Company> companies = new List<Company>
        {
            new Company
            {
                CompanyName = "Company 1",
                Employees = new List<Employee>
                {
                    new Employee { FirstName = "John", LastName = "Smith" },
                    new Employee { FirstName = "Mark", LastName = "Halls" },
                    new Employee { FirstName = "Morgan", LastName = "Jones" }
                }
            },
            new Company
            {
                CompanyName = "Company 2",
                Employees = new List<Employee>
                {
                    new Employee { FirstName = "Winston", LastName = "Mars" },
                    new Employee { FirstName = "Homer", LastName = "Saxton" },
                    new Employee { FirstName = "Creg", LastName = "Lexon" }
                }
            },
            new Company
            {
                CompanyName = "Company 3",
                Employees = new List<Employee>
                {
                    new Employee { FirstName = "Ben", LastName = "Greenland" },
                    new Employee { FirstName = "Anthony", LastName = "Waterfield" }
                }
            }
        };

        //Get the FirstName of all employees
        var firstNames = companies.SelectMany(c => c.Employees).Select(e => e.FirstName);

        //Get the LastName of all employees
        var lastNames = companies.SelectMany(c => c.Employees, (c, e) => e.LastName);

        Console.WriteLine("FirstNames of all employees from the three companies.");
        foreach (var firstName in firstNames)
            Console.WriteLine(firstName);

        Console.WriteLine("\nLastNames of all the employees from the three companies.");
        foreach (var lastName in lastNames)
            Console.WriteLine(lastName);
    }
}

Example 1 - Using the SelectMany Method

FirstNames of all employees from the three companies.
John
Mark
Morgan
Winston
Homer
Creg
Ben
Anthony

LastNames of all the employees from the three companies.
Smith
Halls
Jones
Mars
Saxton
Lexon
Greenland
Waterfield

We have defined two classes for this example. The first one, named Company, contains two properties named CompanyName and Employees. The Employees property has a type of IEnumerable<Employee>, which simply means that it can hold a collection of Employee objects. The Employee class has two simple properties that describe a single employee.

In the Main method, we created a collection of Company objects and initialized it with three Company objects. Each Company object was given a value for its CompanyName and a collection of Employee objects for its Employees property. Note that we are using collection and object initializers here.

Lines 52 to 54 use the SelectMany method. Line 52 queries the FirstNames of all the employees from all the companies.

companies.SelectMany(c => c.Employees).Select(e => e.FirstName);

This version of the SelectMany method accepts only one argument. We pass a lambda specifying the property that SelectMany will select. As the name of the method implies, "select many" means the method selects a collection of objects for each source item. In our example above, we selected the Employees property of every company, which contains a collection of Employee objects. The SelectMany method then returns a single flattened IEnumerable<Employee>. This allows us to chain a call to the plain Select method, in which we selected the first name of every employee from the result of the SelectMany method.

There is another version of the SelectMany method which accepts two arguments, a collectionSelector and a resultSelector. If we rewrite line 52 using this version of SelectMany, it looks like this.

companies.SelectMany(c => c.Employees, (c, e) => e.FirstName);

Notice that we don't need to use the Select method to select a specific property of every Employee. The second argument handles that.
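As a sketch of why the two-argument version is useful, the result selector below combines a value from the company with a value from each of its employees, and a query expression with two from clauses produces the same flattened result. The trimmed-down Company and Employee classes, the sample data, and the SelectManyDemo class are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down classes assumed for this sketch.
class Employee
{
    public string FirstName { get; set; }
}

class Company
{
    public string CompanyName { get; set; }
    public IEnumerable<Employee> Employees { get; set; }
}

static class SelectManyDemo
{
    // Sample data assumed for this sketch.
    public static List<Company> Companies()
    {
        return new List<Company>
        {
            new Company
            {
                CompanyName = "Company 1",
                Employees = new List<Employee>
                {
                    new Employee { FirstName = "John" },
                    new Employee { FirstName = "Mark" }
                }
            },
            new Company
            {
                CompanyName = "Company 2",
                Employees = new List<Employee> { new Employee { FirstName = "Winston" } }
            }
        };
    }

    // Method syntax: the result selector receives the company (c) and one of its employees (e).
    public static List<string> WithMethod(IEnumerable<Company> companies)
    {
        return companies
            .SelectMany(c => c.Employees, (c, e) => c.CompanyName + ": " + e.FirstName)
            .ToList();
    }

    // Query syntax: two from clauses express the same flattening.
    public static List<string> WithQuery(IEnumerable<Company> companies)
    {
        return (from c in companies
                from e in c.Employees
                select c.CompanyName + ": " + e.FirstName).ToList();
    }

    public static void Main()
    {
        foreach (string line in WithMethod(Companies()))
            Console.WriteLine(line);
    }
}
```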
The second argument accepts a lambda with two parameters. The first is the current item from the original collection, and the second is the current item from the collection selected by the collectionSelector.

The where Clause

You can filter the results of a query by using the where clause in a query expression. Following the where keyword is the condition (or conditions) that an item must meet to be included in the results. A useful feature of LINQ is that you can use the methods available in the .NET Framework inside your condition. This lesson shows some examples of using the where clause to filter queried data.

You can use relational operators to compare a value or property of an item with another value. For example, suppose we want to retrieve the numbers which are greater than 5 from an array of numbers. We can do so using the following LINQ query.

int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var greaterThanFive = from number in numbers where number > 5 select number;

The where clause states that the value of a number must be greater than 5 for it to be selected and included. We can also use logical operators for more complex conditions such as the following:

var sixToTen = from number in numbers where number > 5 && number <= 10 select number;

The query above only retrieves values greater than 5 and less than or equal to 10, that is, the values 6 to 10.

If you have a collection of objects with several properties, you can also test whether those properties meet the required condition.

List<Person> people = GetPersonList();
var smiths = from person in people where person.LastName == "Smith" select person;

The query retrieves every person whose LastName is Smith. We can also use .NET methods in our condition. For instance, suppose we want to retrieve every person whose last name starts with the letter 'R'.

var startsWithR = from person in people where person.LastName.StartsWith("R") select person;

We used the StartsWith() method of the String class, which returns true if a string starts with the string passed as the argument.

Alternatively, we can use the Where() extension method from the System.Linq namespace. You pass a lambda expression that has one parameter representing each item from the data source, and whose body contains the boolean expression for the condition. The method returns the collection of objects that pass the required condition.

var greaterThanFive = numbers.Where( number => number > 5 );

The lambda expression's body is a single condition which tests whether the value of number is greater than 5. The following is another example which retrieves persons whose last name starts with 'R'.

var startsWithR = people.Where( person => person.LastName.StartsWith("R") );

You can use another overloaded version of the method which accepts a lambda expression having two parameters. The first parameter represents each object from the collection, and the second parameter represents the index of that item in the collection. The following call to the Where() method returns the numbers whose index in the collection is even.

var evenIndices = numbers.Where( (number, index) => index % 2 == 0 );

You can use the Where() method when you simply want to filter a collection using a specified condition.
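The filters discussed above can be tried out with a short program. This is a minimal sketch; the WhereDemo class and its method names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class WhereDemo
{
    // Query syntax: keeps the values greater than 5 and less than or equal to 10.
    public static List<int> SixToTen(IEnumerable<int> numbers)
    {
        return (from number in numbers
                where number > 5 && number <= 10
                select number).ToList();
    }

    // Method syntax with the indexed overload: keeps the items found at even positions.
    public static List<int> AtEvenIndices(IEnumerable<int> numbers)
    {
        return numbers.Where((number, index) => index % 2 == 0).ToList();
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

        Console.WriteLine(String.Join(" ", SixToTen(numbers)));      // 6 7 8 9 10
        Console.WriteLine(String.Join(" ", AtEvenIndices(numbers))); // 1 3 5 7 9
    }
}
```

Keep in mind that the indexed overload tests the position of an item (starting at 0), not its value, which is why the second result contains the odd numbers.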
The orderby Clause

LINQ makes it easy to sort or change the order of the results of a query. We can use the orderby clause and specify which value or property will be used as the key for sorting the results. For example, suppose we have an array of numbers that we want to query and then arrange the results from lowest to highest. We can use the following query expression:

int[] numbers = { 5, 1, 6, 2, 8, 10, 4, 3, 9, 7 };
var sortedNumbers = from number in numbers orderby number select number;

The query uses the orderby clause to order the results, and we used the number, that is, the actual object, as the key for ordering those results. When you use the actual object as the key, its default comparison behavior is used. For example, the above statement compares the values of the queried integers to test which is larger or smaller. The default behavior of the orderby clause is to sort the results from lowest to highest. We can explicitly tell the query to sort in ascending order by using the ascending keyword.

var sortedNumbers = from number in numbers orderby number ascending select number;

To reverse the order, that is, to sort values from highest to lowest, we can append the descending keyword in the orderby clause.

var sortedNumbers = from number in numbers orderby number descending select number;

In cases where complex types are involved, you can use their properties as the key for ordering the results of a query. For example, suppose we have a collection of Person objects which have FirstName, LastName, and Age properties. We can sort the results from the youngest to the oldest person using the following query.

var sortedByAge = from person in people orderby person.Age select person;

Again, we can use the descending keyword to reverse the order of the results. We can also sort the results alphabetically based on each person's last name.

var sortedByLastName = from person in people orderby person.LastName select person;

What if two persons have the same last name? How will items with the same key be sorted? We can specify multiple values in the orderby clause to handle cases like these.

var sortedByLnThenFn = from person in people orderby person.LastName, person.FirstName select person;

We separate each sorting key in the orderby clause with commas. The query above sorts the result based on every person's LastName, and if multiple persons have the same LastName, they are then sorted by FirstName. You can specify even more keys in orderby if, for example, two persons still have the same FirstName.

Note that ascending or descending only affects one key in the orderby clause. For example, in a query like this:

var orderByResults = from person in people orderby person.LastName, person.FirstName descending select person;

the descending keyword only affects the key preceding it, in this case, FirstName. LastName will be ordered in ascending order by default. You can explicitly specify the order to avoid confusion.

var orderByResults = from person in people orderby person.LastName ascending, person.FirstName descending select person;

A class can implement the IComparable<T> interface, which determines the default way its objects are sorted. The definition of the Person class is presented below, showing an implementation of the IComparable<T> interface.
class Person : IComparable<Person>
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }

    public Person(string fn, string ln, int age)
    {
        FirstName = fn;
        LastName = ln;
        Age = age;
    }

    public int CompareTo(Person other)
    {
        if (this.Age > other.Age)
            return 1;
        else if (this.Age < other.Age)
            return -1;
        else //if equal
            return 0;
    }
}

Example 1

Line 1 specifies that the class implements the IComparable<T> interface. We need to replace T with the type of the object to be compared, in this case, Person. Implementing that interface requires you to implement a method named CompareTo() which accepts the other object to be compared to this instance and returns an integer value which is the result of the comparison. Lines 14 to 22 define that method, and inside it, we test whether the Age of this instance is greater than, less than, or equal to the Age of the other Person. A value greater than 0 is returned if the value from this instance is greater than the value of the other. A value less than 0 is returned if the value of this instance is less than the other, and 0 is returned if both values are equal. We can simply use 1 for greater than, -1 for less than, and 0 for equal.

Now that the class implements the IComparable<T> interface, we can simplify the LINQ query. Since we used the Age property inside the CompareTo method, using the person itself as the ordering key in a LINQ query sorts the results based on each person's age.

List<Person> people = new List<Person>()
{
    new Person("Peter", "Redfield", 32),
    new Person("Marvin", "Monrow", 17),
    new Person("Aaron", "Striver", 25)
};
var defaultSort = from person in people orderby person select person;
foreach (var person in defaultSort)
{
    Console.WriteLine(String.Format("{0} {1} {2}", person.Age, person.FirstName, person.LastName));
}

17 Marvin Monrow
25 Aaron Striver
32 Peter Redfield

The OrderBy() and OrderByDescending() Methods

The OrderBy() and OrderByDescending() methods in System.Linq are the LINQ methods that correspond to the orderby clause. The OrderBy() method sorts the results in ascending order based on a specified key, and the OrderByDescending() method does the opposite and sorts the results in descending order. We pass a lambda expression specifying the key to be used for sorting.

int[] numbers = { 5, 1, 6, 2, 8, 10, 4, 3, 9, 7 };
List<Person> people = GetPersonList();
var orderByQuery1 = numbers.OrderBy( number => number );
var orderByQuery2 = numbers.OrderByDescending( number => number );
var orderByQuery3 = people.OrderBy( person => person.FirstName );
var orderByQuery4 = people.OrderByDescending( person => person.Age );

The ThenBy() and ThenByDescending() Methods

If you have multiple keys that you want to use, you can use the ThenBy() and ThenByDescending() methods. For example, the LINQ query:

var orderByQuery5 = from p in people orderby p.LastName, p.FirstName select p;

is equivalent to the following.

var orderByQuery5 = people.OrderBy(p => p.LastName).ThenBy(p => p.FirstName);

You can use the ThenByDescending() method if you want the subsequent keys to sort results in descending order.

var orderByQuery6 = people.OrderBy(p => p.LastName).ThenByDescending(p => p.FirstName);
var orderByQuery7 = people.OrderByDescending(p => p.LastName)
                          .ThenByDescending(p => p.FirstName);

Note that ThenBy() and ThenByDescending() are extension methods that operate on the IOrderedEnumerable<T> interface, and the OrderBy() methods and the orderby clause return a collection implementing this interface. Therefore, you must use OrderBy() or OrderByDescending() before calling ThenBy() or ThenByDescending().

Using an IComparer<T> Interface Object

An overloaded version of each of the four ordering methods accepts a comparer object as a second argument. We can create a comparer class which implements the IComparer<TKey> interface, where TKey is the type of the sorting key.

class FirstNameComparer : IComparer<Person>
{
    public int Compare(Person x, Person y)
    {
        return x.FirstName.CompareTo(y.FirstName);
    }
}

The interface requires you to implement its Compare() method, which is similar to the CompareTo() method of IComparable<T> described earlier, except that it accepts two arguments which are the objects to be compared. We simply use the CompareTo() method of the String class, which also returns an integer, to compare the FirstNames of the two persons. We can then pass an instance of this comparer as a second argument to OrderBy, OrderByDescending, ThenBy, or ThenByDescending.

var orderByQuery8 = people.OrderBy(person => person, new FirstNameComparer());

The second argument determines how the keys are compared when ordering the results of a query. The above method call uses the whole person as the key, and with the help of the FirstNameComparer object, the method orders the result by each person's FirstName property.

An alternative to the descending methods and the descending keyword is the Reverse() method.

var orderByQuery9 = (from person in people orderby person.FirstName select person).Reverse();
var orderByQuery10 = people.OrderBy(person => person.FirstName).Reverse();

In the first example, the whole query is enclosed in parentheses to treat it as one expression, and then we call the Reverse() method on the result of the query. The second example is the same as the first but uses the method syntax.
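The comparer overload and ThenByDescending() can be combined in a single query. The sketch below sorts by LastName while ignoring case, using the built-in StringComparer.OrdinalIgnoreCase (which implements IComparer<string>, matching the string key), and breaks ties by Age from oldest to youngest. The Person class, the sample data, and the OrderingDemo class are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person class, assumed only for this sketch.
class Person
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
}

static class OrderingDemo
{
    // Sample data assumed for this sketch; two last names differ only by case.
    public static List<Person> People()
    {
        return new List<Person>
        {
            new Person { FirstName = "Aaron", LastName = "Redfield", Age = 25 },
            new Person { FirstName = "Peter", LastName = "redfield", Age = 32 },
            new Person { FirstName = "Marvin", LastName = "Monrow", Age = 17 }
        };
    }

    // Primary key: LastName compared without regard to case.
    // Secondary key: Age, from oldest to youngest.
    public static List<string> SortedFirstNames(IEnumerable<Person> people)
    {
        return people
            .OrderBy(p => p.LastName, StringComparer.OrdinalIgnoreCase)
            .ThenByDescending(p => p.Age)
            .Select(p => p.FirstName)
            .ToList();
    }

    public static void Main()
    {
        // Prints Marvin, Peter, Aaron (one per line).
        foreach (string name in SortedFirstNames(People()))
            Console.WriteLine(name);
    }
}
```

Because the comparer is applied to the key, its type parameter must match the type returned by the key selector, which is string here and Person in the FirstNameComparer example.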
The let Clause

The let clause allows you to store the result of an expression inside a query expression and use it in the remaining part of the query. A let clause is useful when the query uses an expression multiple times, or if you simply want to create a short alias for a very long variable or property name.

List<Person> persons = new List<Person>
{
    new Person { FirstName="John", LastName="Smith" },
    new Person { FirstName="Henry", LastName="Boon" },
    new Person { FirstName="Jacob", LastName="Lesley" }
};

var query = from p in persons
            let fullName = p.FirstName + " " + p.LastName
            select new { FullName = fullName };

foreach (var person in query)
{
    Console.WriteLine(person.FullName);
}

Example 1

In the query expression in Example 1, the let clause evaluates an expression that combines a person's FirstName and LastName properties and stores the result in a variable for easy access. The rest of the query expression after the let clause can now use the variable containing the result of the expression. As you can see in the select clause of the query expression in Example 1, we used the newly created variable from the let clause to create a property named FullName during projection.

The let clause doesn't have an equivalent query method. The following rewrites the query expression above using the method syntax, with an intermediate anonymous type taking the place of the let variable.

var query = persons.Select(p => new { fullName = p.FirstName + " " + p.LastName })
                   .Select(p => new { FullName = p.fullName });

The following are more examples of using the let clause.

int[] numbers = { 1, 2, 3, 4, 5 };

//Query Expression
var query1 = from n in numbers
             let squared = n * n
             select squared;

//Method Syntax
var query2 = numbers.Select(n => new { squared = n * n })
                    .Select(n => n.squared);

Example 2

//Query Expression
var query1 = from p in persons
             let isTeenager = p.Age < 20 && p.Age > 12
             where isTeenager
             select new { p.FirstName, p.LastName };

//Method Syntax
var query2 = persons.Select(p => new { p, isTeenager = p.Age < 20 && p.Age > 12 })
                    .Where(x => x.isTeenager)
                    .Select(x => new { x.p.FirstName, x.p.LastName });

Example 3

The above query uses the variable created by the let clause as the condition of the where clause, since the expression in the let clause is a boolean expression. In the method syntax, the intermediate anonymous type carries the original person (p) along with the computed value so that the final Select can still reach FirstName and LastName.

The group-by Clause

The group-by clause is used to group items using a specified key. The group-by clause can be inserted between a from clause and a select clause, or you can put a group-by clause as the final clause of a query expression. The basic structure of a group-by clause is:

group item by key into groupVar

The group-by clause starts with the group contextual keyword followed by the item to be grouped. This is then followed by the by contextual keyword. After it is the key, which determines how the items from the data source will be grouped. For example, you can specify the city as the key to group every item by its respective city. Next is the into keyword, which specifies that the items will be placed in a separate collection (groupVar) that represents a group. The groupVar has a type of System.Linq.IGrouping<TKey, TElement>, which exposes a Key property of type TKey and contains the grouped elements of type TElement.

As an example, say you want to group some players into different teams. Example 1 first declares a simple class named Player (lines 6-10) which has the properties Name and Team.
using System; using System.Collections.Generic; using System.IO; using System.Linq;
public class Player { public string Name { get; set; } public string Team { get; set; } }
public class Program { public static void Main() { List<Player> players = new List<Player> { new Player { Name = "Johnny", Team= "Red Team" }, new Player { Name = "Ross", Team = "Blue Team" }, new Player { Name = "Eric", Team = "Black Team" }, new Player { Name = "Josh", Team = "White Team" }, new Player { Name = "Mandy", Team = "Blue Team" }, new Player { Name = "Flora", Team = "White Team" }, new Player { Name = "Garry", Team = "Red Team" }, new Player { Name = "Joseph", Team = "Blue Team"}, new Player { Name = "Murray", Team = "Black Team"}, new Player { Name = "Henry", Team = "Black Team"}, new Player { Name = "Watson", Team = "Red Team"}, new Player { Name = "Linda", Team = "White Team"} };
var groups = from p in players group p by p.Team into g select new { GroupName = g.Key, Members = g };
foreach (var g in groups) { Console.WriteLine("Members of {0}", g.GroupName);
        foreach (var member in g.Members)
        {
            Console.WriteLine("---{0}", member.Name);
        }
    }
}
}

Example 1

Members of Red Team
---Johnny
---Garry
---Watson
Members of Blue Team
---Ross
---Mandy
---Joseph
Members of Black Team
---Eric
---Murray
---Henry
Members of White Team
---Josh
---Flora
---Linda

We created a list of players (lines 16-30) and assigned each player a team. Our goal is to group the players based on the team they are assigned to. The query expression in lines 32-34 uses a group-by clause (line 33) to group the players by looking at their Team property. Players with the same Team value are grouped together into an IGrouping<string, Player> collection. Since we used Team, which is of type string, as the key, the Key property is of type string as well, and the contents of each group are of type Player. The select clause (line 34) then projects the result using an anonymous type, with the Key of every group assigned to the GroupName property and all the items in the group assigned to the Members property.

Lines 36-44 execute the query using a foreach loop. We need to use a nested foreach loop. The first foreach loop goes through every group and prints the name of the group. We used the GroupName property, which holds the Key property of each IGrouping object from the query result. Inside it, another foreach loop enumerates every item included in the group. Remember that we assigned the group itself to the Members property during the projection in the select clause of the query. We used the Members property as the source in the inner foreach loop to iterate through each member of the group.

Ending Query Expressions with group-by Clause

You can also use the group-by clause to end a query expression. Remember that a query expression can only be ended by either a select clause or a group-by clause.
The following is an example of a query expression with a group-by clause at its end. var groups = from p in players group p by p.Team;
foreach (var g in groups) { Console.WriteLine("Members of {0}", g.Key);
    foreach (var member in g)
    {
        Console.WriteLine("---{0}", member.Name);
    }
}

Figure 1

The query expression above produces the same output as Example 1. You will notice that we don't need to assign the players to a group variable. The type of the query result is IEnumerable<IGrouping<string, Player>>, which means that it is a collection of groups, each with a string key and containing Player objects. Inside the first foreach loop, we used the Key property of every group, which contains the name of the group since we used the Team property as the key in the group-by clause. The nested loop simply uses the group itself as the data source to iterate through each Player contained in that group. One thing to note about the group-by clause: any range variable declared before the group-by clause is out of scope after the group-by clause.

The GroupBy() Method

A group-by clause is translated into a call to the GroupBy() method, which is also an extension method of the IEnumerable<T> interface. The following example shows how you can write the query in Figure 1 using the GroupBy() method.

var groups = players.GroupBy(p => p.Team);

The GroupBy() method accepts a lambda expression that has one parameter which holds each value from the data source, and returns the key that will be used for grouping. The query in Example 1 can be written by following the call to the GroupBy() method with a call to the Select() method to project the data.

var groups = players.GroupBy(p => p.Team)
                    .Select(g => new { GroupName = g.Key, Members = g });

The Select() method takes every group yielded by the GroupBy() method and uses projection to create a clearer set of properties.
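Since each group is itself a collection, other LINQ methods such as Count() can be applied to it during projection. The following is a small sketch that reports the size of each team; the sample players and the GroupDemo class are assumptions made for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Player
{
    public string Name { get; set; }
    public string Team { get; set; }
}

static class GroupDemo
{
    // Sample data assumed for this sketch.
    public static List<Player> Players()
    {
        return new List<Player>
        {
            new Player { Name = "Johnny", Team = "Red Team" },
            new Player { Name = "Ross", Team = "Blue Team" },
            new Player { Name = "Garry", Team = "Red Team" }
        };
    }

    // Groups the players by team, then projects each group to "key: member count".
    public static List<string> TeamSizes(IEnumerable<Player> players)
    {
        return (from p in players
                group p by p.Team into g
                select g.Key + ": " + g.Count()).ToList();
    }

    public static void Main()
    {
        // Prints "Red Team: 2" and then "Blue Team: 1".
        foreach (string line in TeamSizes(Players()))
            Console.WriteLine(line);
    }
}
```

For in-memory collections, the groups are yielded in the order in which each key first appears in the source, which is why Red Team is listed first.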
Joining Data Sources

There are times when you want to combine values from different data sources into a single result. Of course, those data sources must be related in some way. In LINQ, you can use the join clause or the Join() method to join data sources that have properties or fields which can be tested for equality. With the join clause, or the corresponding Join() and GroupJoin() methods, you can do inner joins, group joins, or left outer joins. The concept of joins in LINQ can be compared to joins in SQL. If you know how to do joins in SQL, then you may find the following concepts very familiar. Joins can be hard to understand for a beginner, so I will try to explain every concept as clearly as possible. You will see how to use the join clause in the next lesson.

Consider an Authors database table containing the names of authors and their respective AuthorId. Another table named Books contains records of books with their titles and the AuthorId of the author who wrote them. One can join these two tables, which means that each record in the result of a query is a combination of values from both tables. A combined record, for example, will have the Name of the author and the Title of the book.

For two data sources to be joined together, each item or record must have a key that will be tested for equality. Only records with equal keys will be combined. In our Authors and Books example, both tables have an AuthorId field. An author has an AuthorId that uniquely identifies him, while a book has an AuthorId that determines which author wrote that book.

In a join, there is an inner data source and an outer data source. The inner data source contains the items that will be combined with items of the outer data source. Each inner item is matched against the outer items, and every matching pair is joined to create one new record. The following lessons discuss three types of joins.
Inner joins allow you to combine two data sources and create a rectangular result. During an inner join, outer items that have no corresponding inner item are not included in the result set. Inner joins are the simplest and easiest type of join. Another type of join is the group join, which produces a hierarchical result set. It groups related items from one source under an item from the other source. For example, you can place all the books written by a certain author into a group. A left outer join is similar to an inner join in that it also creates a rectangular result set, but it also includes outer items which have no corresponding inner item. You will learn more about each of these types of join in the following lessons.

Note that you can also do joins using multiple from clauses, but this requires you to structure your classes accordingly when defining them. For example, an Author can have a property named Books, which contains a collection of the Book objects that the author wrote. The join clause is most useful when the two classes have no such defined relationship. We just need to define the key property to be compared during the join operation.
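The note above about joining through multiple from clauses can be sketched as follows. The class names AuthorWithBooks and BookItem and the NavigationSketch helper are hypothetical and exist only for this illustration; they are not the Author and Book classes used in the lessons that follow.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical classes: the author holds its own collection of books,
// so no key comparison is needed to relate the two.
public class BookItem
{
    public string Title { get; set; }
}

public class AuthorWithBooks
{
    public string Name { get; set; }
    public List<BookItem> Books { get; set; } = new List<BookItem>();
}

public static class NavigationSketch
{
    // Produces one "Name - Title" string for every book of every author.
    public static List<string> AuthorTitlePairs(IEnumerable<AuthorWithBooks> authors)
    {
        var pairs = from a in authors
                    from b in a.Books   // second from walks the author's own books
                    select a.Name + " - " + b.Title;
        return pairs.ToList();
    }
}
```

An author whose Books collection is empty contributes nothing to the result, so this form behaves like an inner join.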
The join Clause - Doing an Inner Join

Inner joins are the simplest type of join. This type of join returns a flat or rectangular result. It means that when you look at the result of the query, it looks like a table in which every cell has a value. Suppose we have two database tables named Authors and Books. The Authors table contains data about different authors. The Books table contains data about different books along with the ID of the author who wrote each book. Joining the two tables using an inner join will yield a rectangular result as seen in Figure 1.

Author               Book
John Smith           Little Blue Riding Hood
John Smith           Snow Black
John Smith           Hanzel and Brittle
Harry Gold           My Rubber Duckie
Harry Gold           He Who Doesn't Know His Name
Ronald Schwimmer     The Three Little Piggy Banks

Figure 1 - A Rectangular Result

An author can have multiple books assigned to him. In Figure 1, you can see that John Smith wrote three books, Harry Gold wrote two, and Ronald Schwimmer wrote only one book.
Figure 2 - Inner Join

For every item in the inner data source, the join searches for the corresponding item in the outer data source and creates a combined item from that outer item and the inner item. After the first inner item finds its corresponding outer item, the second inner item is next, and it also searches for its corresponding item in the outer data source. This repeats until all the items in the inner data source have been processed. To determine whether two items correspond, both must have a member or property whose values are equal. As you can see in Figure 2, Outer Item 4 was not included in the results. Any outer item that has no corresponding inner item is not included in the results. The same goes for an inner item that has no corresponding outer item.

Let's take a look at the following example. The following code defines two classes, Author and Book.

class Author { public int AuthorId { get; set; } public string Name { get; set; } }
class Book { public int AuthorId { get; set; } public string Title { get; set; } }

As you can see in the above code, both Author and Book have an AuthorId property. This is the property that determines which Author object a Book object corresponds to. Note that the two properties don't need to have the same name, but they should have the same data type. Example 1 creates Author and Book objects. Each book has its AuthorId pointing to a specific Author.
class Program { public static void Main(string[] args) { Author[] authors = new Author[] { new Author() { AuthorId = 1, Name = "John Smith" }, new Author() { AuthorId = 2, Name = "Harry Gold" }, new Author() { AuthorId = 3, Name = "Ronald Schwimmer" }, new Author() { AuthorId = 4, Name = "Jerry Mawler" } };
Book[] books = new Book[] { new Book() { AuthorId = 1, Title = "Little Blue Riding Hood" }, new Book() { AuthorId = 3, Title = "The Three Little Piggy Banks" }, new Book() { AuthorId = 1, Title = "Snow Black" }, new Book() { AuthorId = 2, Title = "My Rubber Duckie" }, new Book() { AuthorId = 2, Title = "He Who Doesn't Know His Name" }, new Book() { AuthorId = 1, Title = "Hanzel and Brittle" } };
var result = from a in authors join b in books on a.AuthorId equals b.AuthorId select new { a.Name, b.Title };
foreach (var r in result) { Console.WriteLine("{0} - {1}", r.Name, r.Title); } } }

Example 1

John Smith - Little Blue Riding Hood
John Smith - Snow Black
John Smith - Hanzel and Brittle
Harry Gold - My Rubber Duckie
Harry Gold - He Who Doesn't Know His Name
Ronald Schwimmer - The Three Little Piggy Banks

The query expression in lines 23-25 uses a join clause. The query expression first gets an Author object from the outer data source, authors. Then, in the join clause, we retrieve a Book object from the inner source, books, and test whether the AuthorId of the retrieved author is equal to the book's AuthorId. Note that the equals keyword is used instead of ==. The join clause can only test for equality; you cannot use operators such as > or < to compare keys. The dedicated equals keyword serves as a reminder that only equality comparison is allowed. If the current book's AuthorId is equal to an author's AuthorId, then we proceed to the select clause, where the Name of the author and the Title of the book are projected. When execution of the query begins, each book is matched against the authors, and every book with a corresponding author is included in the results. Note that Jerry Mawler was not included in the results because he did not write any book, so there is no item in the inner source that can be joined to him. Likewise, any book whose AuthorId does not exist in authors will not be included.

The join clause, like any other clause, is translated during compilation into a method call, in this case a call to the Join method. The query expression in Example 1 can be translated into the following call to the Join method:

var result = authors.Join(books,
                          author => author.AuthorId,
                          book => book.AuthorId,
                          (author, book) => new { author.Name, book.Title });

This version of the Join method performs an inner join. The method accepts four parameters. The first parameter accepts the inner data source, in this case, the books collection.
The second parameter accepts a lambda expression that describes the outer key selector. It has one parameter, which receives every item from the outer source, and returns the key that will be used for joining (we used the AuthorId as the key). The third parameter accepts a lambda expression with one parameter, which receives every item from the inner data source, and returns the key to be compared to the outer source's key. Finally, the fourth parameter accepts a lambda expression with two parameters, the outer item and the inner item that have matching keys. You can then use projection to create a type holding the results from the two items.
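The four parameters described above can be made explicit with named arguments. The following is a minimal sketch that reuses the Author and Book classes from this lesson; the InnerJoinSketch class and its method name are hypothetical, while inner, outerKeySelector, innerKeySelector and resultSelector are the parameter names of Enumerable.Join.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Author
{
    public int AuthorId { get; set; }
    public string Name { get; set; }
}

public class Book
{
    public int AuthorId { get; set; }
    public string Title { get; set; }
}

public static class InnerJoinSketch
{
    // Returns one "Name - Title" string per matching author/book pair.
    public static List<string> AuthorsWithBooks(
        IEnumerable<Author> authors, IEnumerable<Book> books)
    {
        return authors.Join(
                inner: books,                                 // inner data source
                outerKeySelector: author => author.AuthorId,  // key of each author
                innerKeySelector: book => book.AuthorId,      // key of each book
                resultSelector: (author, book) => author.Name + " - " + book.Title)
            .ToList();
    }
}
```

Authors without books and books without a known author are both dropped, which is the behavior described for Jerry Mawler above.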
The join Clause - Doing a Group Join With the join clause, you can also do group joins. A group join groups the items from the inner data source by their corresponding item from the outer data source. For example, all the books written by John Smith will be grouped together and all the books written by Harry Gold will have a separate group. The diagram below shows how group join works.
Figure 1

All the inner items which share a common key that matches a key in the outer data source are grouped together to form one group. As you can see, the result of the group join is a collection of groups, one for each key. Again, for example, the key could be the author of the book. You can group a collection of books by author, and the result will be a collection of books grouped by author. Any outer item that has no matching inner items produces an empty group, which is still included in the result.

Let's take a look at an example of doing a group join. We will define two classes named Author and Book.

class Author { public int AuthorId { get; set; } public string Name { get; set; } }
class Book { public int AuthorId { get; set; } public string Title { get; set; } } The following code contains a query expression that uses a group join using the join clause.
Author[] authors = new Author[] { new Author() { AuthorId = 1, Name = "John Smith" }, new Author() { AuthorId = 2, Name = "Harry Gold" }, new Author() { AuthorId = 3, Name = "Ronald Schwimmer" }, new Author() { AuthorId = 4, Name = "Jerry Mawler" } };
Book[] books = new Book[] { new Book() { AuthorId = 1, Title = "Little Blue Riding Hood" }, new Book() { AuthorId = 3, Title = "The Three Little Piggy Banks" }, new Book() { AuthorId = 1, Title = "Snow Black" }, new Book() { AuthorId = 2, Title = "My Rubber Duckie" }, new Book() { AuthorId = 2, Title = "He Who Doesn't Know His Name" }, new Book() { AuthorId = 1, Title = "Hanzel and Brittle" } };
var result = from a in authors join b in books on a.AuthorId equals b.AuthorId into booksByAuthor select new { Author = a.Name, Books = booksByAuthor };
foreach (var r in result) { Console.WriteLine("Books written by {0}:", r.Author);
foreach (var b in r.Books) { Console.WriteLine("---{0}", b.Title); } }

Example 1

Books written by John Smith:
---Little Blue Riding Hood
---Snow Black
---Hanzel and Brittle
Books written by Harry Gold:
---My Rubber Duckie
---He Who Doesn't Know His Name
Books written by Ronald Schwimmer:
---The Three Little Piggy Banks
Books written by Jerry Mawler:

Take a look at the join clause in line 20. The join clause joins a book from the books data source to the authors data source where the AuthorId of the book is equal to the AuthorId of an author. The into keyword, followed by a grouping variable, signifies the group join. All the inner items whose key corresponds to an outer item's key are grouped together and stored in the grouping variable. The select clause in line 21 projects the result into a new type with an Author property and a Books property that is assigned the grouping variable. The result of the query expression is a collection of groups of books.

The nested foreach loop in lines 23 to 31 shows the results of the query. Inside the first foreach loop, the name of the author is shown. After that, an inner foreach loop iterates through each Book in the author's Books property. Remember that this property contains the collection of books grouped together for a particular author. As you can see in the output, Jerry Mawler has written no books, so his Books property is empty and no books are shown.

The GroupJoin method is the method equivalent of a join clause that uses the into keyword. The equivalent query using the GroupJoin method is shown below:

var result = authors.GroupJoin(books,
                               author => author.AuthorId,
                               book => book.AuthorId,
                               (author, booksByAuthor) => new { Author = author.Name, Books = booksByAuthor });

The first parameter is the inner data source that will be joined to the outer data source. The second parameter is a lambda expression that selects the outer key to be used for joining.
The third parameter determines the inner key and the final parameter is used to create the group and project each result.
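Because each group is an ordinary sequence, the final parameter can also summarize it instead of returning the books themselves. The following is a minimal sketch that counts the books of each author; the GroupJoinSketch class and the BookCounts method are hypothetical names, and the Author and Book classes repeat the definitions from this lesson.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Author
{
    public int AuthorId { get; set; }
    public string Name { get; set; }
}

public class Book
{
    public int AuthorId { get; set; }
    public string Title { get; set; }
}

public static class GroupJoinSketch
{
    // Maps every author name to the number of books joined to that author.
    public static Dictionary<string, int> BookCounts(
        IEnumerable<Author> authors, IEnumerable<Book> books)
    {
        return authors
            .GroupJoin(books,
                       author => author.AuthorId,
                       book => book.AuthorId,
                       // booksByAuthor is the (possibly empty) group for this author
                       (author, booksByAuthor) =>
                           new { author.Name, Count = booksByAuthor.Count() })
            .ToDictionary(x => x.Name, x => x.Count);
    }
}
```

An author with no books still appears in the dictionary, with a count of 0, because a group join keeps outer items that have an empty group. The sketch assumes author names are unique, since ToDictionary throws on duplicate keys.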
The join Clause - Doing a Left Outer Join

Using LINQ's join clause, you can also perform a left outer join. Like an inner join, a left outer join returns a flat result. An inner join omits any item that has no corresponding item in the other data source. For example, if an author wrote no book, then he is omitted from the result of the query. A left outer join also includes the outer items that have no corresponding partner. This is made possible by the DefaultIfEmpty method.
Figure 1 - Left Outer Join Let's take a look at an example of doing a left outer join.
Author[] authors = new Author[] { new Author() { AuthorId = 1, Name = "John Smith" }, new Author() { AuthorId = 2, Name = "Harry Gold" }, new Author() { AuthorId = 3, Name = "Ronald Schwimmer" }, new Author() { AuthorId = 4, Name = "Jerry Mawler" } };
Book[] books = new Book[] { new Book() { AuthorId = 1, Title = "Little Blue Riding Hood" }, new Book() { AuthorId = 3, Title = "The Three Little Piggy Banks" }, new Book() { AuthorId = 1, Title = "Snow Black" }, new Book() { AuthorId = 2, Title = "My Rubber Duckie" }, new Book() { AuthorId = 2, Title = "He Who Doesn't Know His Name" }, new Book() { AuthorId = 1, Title = "Hanzel and Brittle" } };
var result = from a in authors join b in books on a.AuthorId equals b.AuthorId into booksByAuthors from x in booksByAuthors.DefaultIfEmpty(new Book {AuthorId=0,Title="None"}) select new { Author = a.Name, x.Title };
Console.WriteLine("{0, -20} {1}", "Author", "Book"); foreach (var r in result) { Console.WriteLine("{0, -20} {1}", r.Author, r.Title); }

Example 1

Author               Book
John Smith           Little Blue Riding Hood
John Smith           Snow Black
John Smith           Hanzel and Brittle
Harry Gold           My Rubber Duckie
Harry Gold           He Who Doesn't Know His Name
Ronald Schwimmer     The Three Little Piggy Banks
Jerry Mawler         None

To do a left outer join using the join clause, you first need to group join the two data sources. You then perform another query using the created groups as the data source. You need to call the DefaultIfEmpty method on each group so that whenever a group contains no items, a specified default value is provided instead. As you can see in the query expression in lines 19 to 22, the first two lines perform a group join by querying every author object and joining every book object whose AuthorId property is equal to the queried author's AuthorId property. The next line performs another query using the grouped result of the first query as the data source. Notice that we call the DefaultIfEmpty method of the group to yield a default value if the group is empty. The DefaultIfEmpty method accepts one argument, which is an instance of the same type as the items of the group. Since each group in our query contains Book items, we created a new instance of the Book class and specified some default values for its properties using object initializer syntax.

Based on our data sources, Jerry Mawler (AuthorId 4) has no corresponding book in the books data source. If we had simply used an inner join, Jerry Mawler would be missing from the results, but since we used a left outer join, Jerry Mawler was included and, as you can see in the output of Example 1, the default value we specified was shown as his book. There is no direct equivalent of a left outer join when you want to use the method syntax.
Doing a left outer join using the method syntax requires the combination of the GroupJoin method and the SelectMany method.

var result = authors.GroupJoin(books,
                               author => author.AuthorId,
                               book => book.AuthorId,
                               (author, booksByAuthor) => new { Author = author, Books = booksByAuthor })
                    .SelectMany(x => x.Books.DefaultIfEmpty(new Book { AuthorId = 0, Title = "None" }),
                                (x, y) => new { Author = x.Author.Name, y.Title });

The GroupJoin method groups the books by author and projects a new type with an Author property assigned the author and a Books property assigned the group of books he wrote. We then chained a call to the SelectMany method. The SelectMany method here accepts two parameters. The first is the collection selector, which selects the items of each result yielded by the GroupJoin method. Notice that we call the DefaultIfEmpty method so that if the Books property of an item is empty, the specified default Book is used instead. The second parameter is the result selector, through which you project the final result of the query.
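As a variation on the call above, the parameterless DefaultIfEmpty() overload can be used, in which case an empty group yields null (the default value of a reference type) and the result selector has to handle it. The following is a minimal sketch under that assumption; the LeftJoinSketch class and the Rows method are hypothetical names, and the Author and Book classes repeat the definitions from these lessons.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Author
{
    public int AuthorId { get; set; }
    public string Name { get; set; }
}

public class Book
{
    public int AuthorId { get; set; }
    public string Title { get; set; }
}

public static class LeftJoinSketch
{
    // Returns one "Name - Title" row per book, plus a "Name - None" row
    // for every author that has no book at all.
    public static List<string> Rows(
        IEnumerable<Author> authors, IEnumerable<Book> books)
    {
        return authors
            .GroupJoin(books,
                       a => a.AuthorId,
                       b => b.AuthorId,
                       (a, booksByAuthor) => new { Author = a, Books = booksByAuthor })
            .SelectMany(
                // An empty group becomes a sequence holding a single null.
                x => x.Books.DefaultIfEmpty(),
                // The null check replaces the placeholder Book object.
                (x, book) => x.Author.Name + " - " + (book == null ? "None" : book.Title))
            .ToList();
    }
}
```

The placeholder object used in the lesson and the null check used here lead to the same rows. The null check avoids creating a dummy Book, at the cost of having to remember the check wherever the joined item is used.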
More LINQ Examples

Now that you have learned some basic LINQ querying, including how to select, filter, and order the results of a query, let's take a look at more examples that combine the concepts of the past lessons and introduce some new LINQ features as well. The examples will make you more familiar with further techniques of using LINQ. For each example, I will show the LINQ query expression followed by the corresponding calls to LINQ methods.

var query1 = from p in people
             orderby p.LastName
             where p.Age >= 18
             select new { p.FirstName, LN = p.LastName };

var query1 = people.OrderBy(p => p.LastName)
                   .Where(p => p.Age >= 18)
                   .Select(p => new { p.FirstName, LN = p.LastName });

The two queries above are equivalent, and they both combine selecting, filtering, and ordering results. Take note of the second version, where we used the actual LINQ methods. The dot operator is used immediately after the end of the previous method call. The methods or members you can call after the dot operator depend on the value returned by the previous method. Since these LINQ methods return a result implementing IEnumerable<T>, you can chain method calls one after the other. The final result stored in the query variable is the result of the last LINQ method call, in the case above, the Select method. The second version can also be called the dot notation style.

var query2 = from p in people
             orderby p.LastName, p.FirstName
             where p.Age >= 18 && p.LastName.StartsWith("A")
             select new { FullName = p.FirstName + " " + p.LastName };

var query2 = people.OrderBy(p => p.LastName).ThenBy(p => p.FirstName)
                   .Where(p => p.Age >= 18 && p.LastName.StartsWith("A"))
                   .Select(p => new { FullName = p.FirstName + " " + p.LastName });

The query orders the persons in the people collection by LastName and then by FirstName, and keeps those whose Age is greater than or equal to 18 and whose LastName starts with A.
The query then selects an anonymous type containing the FullName of each person that met the condition in the where clause.

We can use the let clause to define another range variable inside a query and assign it an expression or a property of the original range variable. The following shows an example of using the let clause.

var query2 = from p in people
             let lastName = p.LastName
             where lastName.StartsWith("A")
             select lastName;

var query2 = people.Select(p => new { p, lastName = p.LastName })
                   .Where(pln => pln.lastName.StartsWith("A"))
                   .Select(pln => pln.lastName);

We defined a new range variable and assigned it the LastName property of the original range variable. We can now use the new range variable in the clauses that follow. The second version, which uses the dot notation style, shows how to achieve the same functionality. The let clause in a LINQ query corresponds to a call to the Select method that carries both the original item and the new value. We called the Select method at the very beginning so that the methods that follow can see the new value.

We can use multiple data sources and compare their values against each other. The following uses two from clauses that retrieve values from two integer arrays.

var query3 = from x in n1
             from y in n2
             where x == y
             select x;

var query3 = n1.Intersect(n2);

The query first retrieves a value from n1. The retrieved value is then compared to each value in n2, thanks to the second from clause and the where clause. The where clause states that a value of x is kept only if it is equal to one of the elements of n2. This is called an intersection, where only values that exist in both sets are returned in the result set. We can simply use the Intersect method to do the same thing. Note that the two forms agree only when neither array contains duplicate values: Intersect returns each common value once, while the query expression yields one result for every matching pair.

Another example is an object that has a property containing a collection of more objects. Imagine our Person class having a property named Siblings, which is a list of the siblings (List<Person>) of a person.
We can select every sibling of a person and even specify a condition for the selection.

var query4 = from p in people
             from s in p.Siblings
             where s.Age < p.Age
             select new { FullName = p.FirstName + " " + p.LastName,
                          SiblingName = s.FirstName + " " + s.LastName };

var query4 = people.SelectMany(p => p.Siblings.Where(s => s.Age < p.Age),
                               (p, s) => new { FullName = p.FirstName + " " + p.LastName,
                                               SiblingName = s.FirstName + " " + s.LastName });

The first from clause retrieves a Person from the list of Person objects. The second from clause retrieves every Person object in the current Person's Siblings property. The first from clause only continues to the next Person object after the second from clause has finished retrieving all of the items from the Siblings property. The where and select clauses in the query also execute for every item retrieved by the second from clause. The second version of the query uses the SelectMany method. Its first lambda expression receives every Person from people, accesses that person's Siblings property, and uses the Where method to keep only the siblings who are younger than that person. Its second lambda expression receives each person together with one matching sibling and projects the same anonymous type as the select clause of the query expression.
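The two multiple-from examples above can be checked with a small sketch. The Person class here is a reduced, assumed version with only the members the sketch needs, and the SelectManySketch class and its method names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Reduced, assumed version of the Person class used in these lessons.
public class Person
{
    public string FirstName { get; set; }
    public int Age { get; set; }
    public List<Person> Siblings { get; set; } = new List<Person>();
}

public static class SelectManySketch
{
    // Pairs every person with each sibling who is younger than that person.
    public static List<string> YoungerSiblings(IEnumerable<Person> people)
    {
        return people
            .SelectMany(
                p => p.Siblings.Where(s => s.Age < p.Age),   // collection selector
                (p, s) => p.FirstName + " > " + s.FirstName) // result selector
            .ToList();
    }

    // Two from clauses with a where clause: one result per matching pair,
    // so duplicate values in the inputs produce duplicate results.
    public static List<int> NestedFrom(int[] n1, int[] n2)
    {
        return (from x in n1
                from y in n2
                where x == y
                select x).ToList();
    }

    // Intersect: every common value is returned once.
    public static List<int> Common(int[] n1, int[] n2)
    {
        return n1.Intersect(n2).ToList();
    }
}
```

With n1 = { 1, 2, 2, 3 } and n2 = { 2, 2, 4 }, NestedFrom yields the value 2 four times (two occurrences in n1 times two in n2), whereas Common yields a single 2. Appending Distinct() to the query expression is one way to make the two forms agree.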
LINQ Aggregate Methods

LINQ has aggregate operators or methods that allow you to perform mathematical calculations based on the values of all the elements in a collection. Here we will learn the different aggregate operators that simplify the most common tasks involving sets or collections of values. Examples of what aggregate operators can do include getting the sum, average, minimum, maximum, and number of elements of a collection.

The Count and LongCount Methods

We can determine the number of elements a collection or array has by using the Count and LongCount methods. The Count method counts the number of elements of a collection and returns it as an int. For collections whose number of elements may exceed the range of an int, we can use the LongCount method instead, which returns a long. The following example shows the use of the Count operator.

int[] numbers = { 7, 2, 6, 1, 7, 4, 2, 5, 1, 2, 6 };
Console.WriteLine("numbers has {0} elements.", numbers.Count());

An overload of both Count and LongCount allows you to pass a predicate that determines which elements to include in the count. Consider the following example:

Console.Write("Number of even numbers in the collection: {0}", numbers.Count(n => n % 2 == 0));

The lambda expression passed to the overloaded Count method specifies the condition that an element must meet to be included in the count.

The Min and Max Methods

The Min and Max methods return the minimum and maximum values of a collection, respectively. The parameterless versions of these methods are very simple and return the expected results.

int[] numbers = { 7, 2, 6, 1, 7, 4, 2, 5, 1, 2, 6 };
Console.WriteLine("Min = {0}\nMax = {1}", numbers.Min(), numbers.Max());

For complex objects that do not implement IComparable, that is, objects with no default behavior when being compared, we can use an overload of the Min() and Max() methods which accepts a selector function that specifies the key or expression to be used.

List<Person> people = GetPersonList(); //Assume the method returns a collection of Person objects
Console.WriteLine("Youngest Age is {0}", people.Min( p=>p.Age )); Console.WriteLine("Oldest Age is {0}", people.Max( p=>p.Age )); We passed a lambda expression specifying that we should use the Age property of each person as a key and therefore, the Min and Max methods returned the minimum and maximum age of all the persons in the collection. The Average Method The Average method is obviously used for determining the average of numeric values. The average is the sum of all the values of every element divided by the number of elements. int[] numbers = { 7, 2, 6, 1, 7, 4, 2, 5, 1, 2, 6 };
Console.WriteLine("Average = {0}", numbers.Average());

Like the Min and Max methods, the Average method has an overloaded version which accepts a selector function that determines which key to use when obtaining the average over complex objects such as our example Person class.

List<Person> people = GetPersonList();
Console.WriteLine("Average age of all the persons"); Console.WriteLine("Average = {0}", people.Average( p => p.Age ));

The lambda expression states that we use the Age of every person as the key when computing the average. Therefore, the Average method returns the average Age of all the persons in the collection.

The Sum Method

The Sum method is also a very simple operator: it returns the sum of the values of all the elements in a collection.

int[] numbers = { 7, 2, 6, 1, 7, 4, 2, 5, 1, 2, 6 };
Console.WriteLine("Sum = {0}", numbers.Sum());

Again, we can use an overloaded version which accepts a selector function that determines the key to sum for a complex object such as Person.

List<Person> people = GetPersonList();
Console.WriteLine("Total age of all the persons"); Console.WriteLine("Total = {0}", people.Sum(p => p.Age));

The expression inside the Sum method instructs the method to add the ages of all the persons together.

The Aggregate Method

The Aggregate method is a more flexible version of the Sum method. Instead of limiting you to just summing up the values in a collection, it lets you define a function with two parameters, the first being the first number or the previous result, and the second being the next number to take part in the calculation.

int[] numbers = { 1, 2, 3, 4, 5 };
Console.WriteLine( numbers.Aggregate( (n1, n2) => n1 + n2 ) );

The Aggregate method receives a two-parameter lambda expression, and inside its body we add the two numbers. Initially, n1 contains the first value of the array and n2 the second value. They are added, and the sum becomes the new value of n1. The next (third) value is then passed as n2 and added to n1, and the result again becomes the new n1. This repeats until n2 has taken the value of the last element. The result of the above call to Aggregate is the same as that of the Sum method, since we add all the numbers. A great thing about the Aggregate method is that we can choose the operator used in the calculation. Below is a modification of the previous code where Aggregate returns the overall product of the collection using the multiplication operator.

int[] numbers = { 1, 2, 3, 4, 5 };
Console.WriteLine( numbers.Aggregate( (n1, n2) => n1 * n2 ) ); You can even create complex expressions like this. int[] numbers = { 1, 2, 3, 4, 5 };
Console.WriteLine( numbers.Aggregate( (n1, n2) => (n1 / 2) + (n2 / 2) ) );

The Aggregate method above halves both the running result and the next value before adding them to each other. Because the elements are of type int, the divisions are integer divisions, so any remainder is discarded at each step. Another overload of the Aggregate method accepts a seed, which is simply the value that takes the first slot in the calculation.

int[] numbers = { 1, 2, 3, 4, 5 };
Console.WriteLine( numbers.Aggregate( 1, (n1 , n2) => n1 + n2 ));

Notice that Aggregate now has two parameters, the first being the seed and the second being the function that specifies how to process each pair of values. Since we specified a seed of 1, the calculation starts at 1 plus whatever the first value of the array is. Since the first value of the array in the code above is 1 and the specified seed is 1, the first calculation will be 1 + 1 (seed + first). A third overload of the Aggregate method accepts a third argument, a result selector that transforms the final accumulated value, for example to format it.

Console.WriteLine( numbers.Aggregate( 1, (n1, n2) => n1 + n2, n => String.Format("{0:C}", n)));

The third parameter is another lambda expression, which here formats the result as currency. These aggregate methods are a great help for programmers, as they save development time and are really easy to use once you have mastered them.
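The Aggregate overloads discussed above can be compared side by side in a short sketch. The AggregateSketch class and its member names are hypothetical. The last method is an addition not covered in the lesson text: it shows that the seed also fixes the type of the running result, which may differ from the element type.

```csharp
using System.Linq;

public static class AggregateSketch
{
    public static readonly int[] Numbers = { 1, 2, 3, 4, 5 };

    // No seed: the first element is the initial running result.
    public static int Sum()
    {
        return Numbers.Aggregate((n1, n2) => n1 + n2);
    }

    public static int Product()
    {
        return Numbers.Aggregate((n1, n2) => n1 * n2);
    }

    // Integer division: each step discards the remainder.
    public static int Halved()
    {
        return Numbers.Aggregate((n1, n2) => (n1 / 2) + (n2 / 2));
    }

    // Seed overload: the seed takes the first slot of the calculation.
    public static int SeededSum(int seed)
    {
        return Numbers.Aggregate(seed, (n1, n2) => n1 + n2);
    }

    // Result selector overload: transforms the final running result.
    public static string LabeledSum(int seed)
    {
        return Numbers.Aggregate(seed, (n1, n2) => n1 + n2, n => "Total: " + n);
    }

    // The seed's type (int) differs from the element type (string).
    public static int TotalLength(string[] words)
    {
        return words.Aggregate(0, (length, word) => length + word.Length);
    }
}
```

For the array { 1, 2, 3, 4, 5 }, Sum() returns 15, Product() returns 120, Halved() returns 3, SeededSum(1) returns 16 and LabeledSum(1) returns "Total: 16". The versions without a seed throw an InvalidOperationException on an empty sequence, while the seeded versions simply return the seed.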
LINQ to SQL

LINQ to SQL is a powerful tool which allows developers to access databases as objects in C#. With LINQ to SQL, you are able to use LINQ query operators and methods instead of writing SQL. LINQ to SQL has an API for connecting to and manipulating a database. LINQ queries and calls to the API methods are translated to SQL commands, which are executed to apply changes or to retrieve query results from the database. LINQ to SQL is a more modern way of accessing a database using C# and .NET. Please note that LINQ to SQL is only applicable when you are using SQL Server as your database.

We will be using the Northwind database as our sample database. If you are using the free Visual C# Express, then we need to access the actual database file, which has a .mdf file extension. If you installed the Northwind database correctly, the file can be found in C:\SQL Server 2000 Sample Databases. If the file extension is not visible, go to Control Panel, choose Folder Options, click the View tab, uncheck "Hide extensions for known file types", and then click OK. We will be using the Northwind.mdf database file. Database files with the .mdf extension are created whenever you create a database in SQL Server. SQL Server Express 2008 has a default folder for storing database files, located at C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Data. So if you executed the scripts for creating the Northwind database as instructed in the first lessons, then you can also find a copy of Northwind.mdf there. We can proceed once you have the Northwind.mdf file.

Visual Studio has a great tool for generating LINQ to SQL classes, the Object Relational Designer. You can simply drag and drop tables there, and Visual Studio will automatically create the necessary classes that correspond to each table, and to the rows of each table, of the specified database. You will then see the tables with properties corresponding to their columns or fields.
Arrows will also be visible representing relationships between tables.
Figure 1 - Object Relational Designer showing tables, fields and relationships.

Classes will be created that represent the rows of each table. For example, we have an Employees table. Visual Studio will automatically singularize the table's name, and a class named Employee will be created that represents each row or record of that table. A corresponding class of type Table<TEntity> of the System.Data.Linq namespace will be created for every table included in the LINQ to SQL Designer. TEntity is replaced with the class of the rows it contains. For example, the Employees table will have a corresponding class of Table<Employee>. An object of that class will then be created, containing a collection of objects, one for each of its rows or records. Table<TEntity> implements the IQueryable<TEntity> interface of the System.Linq namespace. When a LINQ query targets an object that implements this interface and obtains results from a database, the results are automatically stored in the corresponding LINQ to SQL classes.

For related tables which are connected to each other via foreign keys, Visual Studio creates, for each foreign key a table has, a corresponding property whose type and name match the row class of the table the foreign key points to. Additionally, for a table whose primary key is used as a foreign key by other tables, a property is also created for each of those tables. These additional properties allow you to reach the related table's rows from a row of the current table. For example, two tables named Employees and Companies both have CompanyID fields. The CompanyID is the primary key of the Companies table, and the CompanyID field of the Employees table is a foreign key pointing to the CompanyID of the Companies table. When Visual Studio creates the corresponding row class for each table, it will also consider the foreign keys.
The Employee class for the Employees table will have an additional Company property since one of its columns points to the Companies table. The Company class of the Companies table will have an additional Employee property because the Employees table points to the Categories table. LINQ to SQL also creates a DataContext class which inherits from the System.Data.Linq.DataContext. This class will be responsible for connecting the program and the database. The objects created for each table that you include in the LINQ to SQL designer becomes a property of this class. Visual Studio will automatically create the DataContext in a format <Database>DataContext where <Database> is the name of the database. For example, using our Northwind database, a NorthwindDataContext will be created with properties corresponding to each tables we have included. These properties contains collections of objects representing the rows of each table. For example, our NorthwindDataContext class will have a Employeesproperty which corresponds to the Employees table. This property is a collection of Employee objects representing each rows of the table. The next lesson will show you an example of using LINQ to SQL and connecting your application to the Northwind database using this technology.
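To make the shape of those relationship properties concrete, here is a minimal sketch using plain hand-written classes. It is only an illustration: the real generated classes are attribute-mapped and track changes, and the names AssociationSketch, BuildSample, and the sample values are assumptions made for this example, not designer output.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the row class of the Companies table.
public class Company
{
    public int CompanyID { get; set; }            // primary key
    public string Name { get; set; } = "";
    // Because Employees.CompanyID points here, the parent side
    // exposes the collection of related Employee rows.
    public List<Employee> Employees { get; } = new List<Employee>();
}

// Simplified stand-in for the row class of the Employees table.
public class Employee
{
    public int EmployeeID { get; set; }           // primary key
    public string FirstName { get; set; } = "";
    public int CompanyID { get; set; }            // foreign key column
    public Company Company { get; set; }          // navigation to the parent row
}

public static class AssociationSketch
{
    // Builds one hypothetical company with two employees and wires
    // both directions of the relationship by hand.
    public static Company BuildSample()
    {
        var company = new Company { CompanyID = 1, Name = "Contoso" };
        foreach (var firstName in new[] { "Nancy", "Andrew" })
        {
            var employee = new Employee
            {
                EmployeeID = company.Employees.Count + 1,
                FirstName = firstName,
                CompanyID = company.CompanyID,
                Company = company
            };
            company.Employees.Add(employee);
        }
        return company;
    }
}
```

With such properties in place, code can move from an employee to its company (employee.Company.Name) or from a company to its employees (company.Employees) without writing a join.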
Querying a Database with LINQ to SQL
We will be creating a Windows Forms Application that allows you to query and view records from a particular table using LINQ to SQL classes, SQL Server 2008, and the Northwind sample database. You will learn how to use the Object Relational Designer to generate LINQ to SQL classes and how to use them in your code.
Creating LINQ to SQL Classes
Create a new Windows Forms Application and name it LinqToSqlDemo. Once the project is created, we need to add a LINQ to SQL file. Click the Add New Item button in the toolbar and find LINQ to SQL Classes in the list of templates. Name it Northwind and click the Add button.
Once you click the Add button, you will land on the Object Relational Designer, which is empty for now.
The Toolbox now also contains components used for creating classes and adding relationships. But since we will generate a class from an existing table in a database, we won't be using the components in the Toolbox. A DBML (Database Markup Language) file with the extension .dbml will also be created and shown in the Solution Explorer. Expanding that node will show two more files containing the code for the layout and the actual classes that will be generated. Double-clicking the DBML file will also bring you to the Object Relational Designer.
We need to use the Database Explorer window in Visual C# Express. If you are using the full version of Visual Studio, you need to open the Server Explorer window instead. If it is not visible, go to View > Other Windows > Database Explorer. Open the Database Explorer window and click the Connect to Database icon.
You will be presented with the Choose Data Source dialog, which asks which type of data source to use for the connection. Choose Microsoft SQL Server Database File. Checking the check box makes the selected type of data source the default whenever you add another connection.
You will be presented with another window asking for the type of data source and the location of the database file. You can also specify which SQL Server account to use, but if you are using an administrator Windows user account, you can simply leave the default option. You can also click the Advanced button to edit more advanced settings for the connection.
Click the Browse button and browse for the Northwind.mdf file. If you have installed it already, it will be located at C:\SQL Server 2000 Sample Databases. Choose the file and click Open. Be sure that the file is not in use by other programs. We then need to test the connection. Click the Test Connection button and, if everything is working properly, you will receive the following message.
The Northwind.mdf file will now appear as a child node of Data Connections in the Database Explorer window. Expand the Northwind.mdf node to see folders representing the different components of the database. Expand the Tables folder to see the tables of the Northwind database. We need to drag tables from the Database Explorer window to the Object Relational Designer's surface. For this lesson, drag the Employees table to the Object Relational Designer.
Visual Studio will ask whether to copy the Northwind.mdf database file, since it detects that the file is located outside your project folder. Clicking Yes will copy the Northwind.mdf file from its original location to your project folder. Also note that every time you run your program, the database file will be copied to the output directory. You will learn later how to modify this behavior.
After you click Yes, the Object Relational Designer will show a class diagram representing a generated class that will hold the values of each row in the Employees table. The name of the class is a singularized version of the table's name. A property with an appropriate type is created for every column in the dragged table. You will see these properties in the Object Relational Designer. If a property conflicts with the name of the class, it will be numbered. For example, if the class's name is Employee and the table has a column named Employee as well, then the column's corresponding property will be named Employee1.
As soon as you drag a table to the Object Relational Designer, the DataContext class for the corresponding database will be created. Since we used the Northwind database, the generated DataContext class will be named NorthwindDataContext. Clicking a blank space in the Object Relational Designer allows you to edit the properties of the DataContext class using the Properties window. You can also change the properties of the created row class and of its members, but leaving the default names and settings for the classes is recommended. If you are curious about the generated classes and want to take a look at their implementation, go to Solution Explorer and expand the node for the created DBML file. You will be presented with two files. Double-click the one with the .designer.cs extension. You will then see how the classes for your tables and the DataContext were defined. You should always save the DBML file before using it in your application.
Using LINQ to SQL Classes
Once the required LINQ to SQL classes have been successfully generated, we can use them in our application. For our GUI, we will be using a DataGridView control to display the queried records. Head back to the Windows Forms Designer. Drag a DataGridView control from the Toolbox's Data category to the form. Set the DataGridView's Dock property to Fill so it will take up all the space of the form. Then resize the form to a larger size so it will properly show all the records that we will query.
We will be using the following code:
using System;
using System.Linq;
using System.Windows.Forms;

namespace LinqToSqlDemo
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            NorthwindDataContext database = new NorthwindDataContext();

            var employees = from employee in database.Employees
                            select new
                            {
                                employee.EmployeeID,
                                employee.FirstName,
                                employee.LastName,
                                employee.BirthDate,
                                employee.Address,
                                employee.Country
                            };

            dataGridView1.DataSource = employees;
        }
    }
}
Double-click the form's title bar in the Designer to generate a handler for the form's Load event. Add the code in lines 16-29. The code at line 16 creates a new NorthwindDataContext object. This will be used to access the tables of the database and the rows each table contains. Lines 18-27 contain a LINQ query that accesses the NorthwindDataContext's Employees property, which holds a record for each employee. The select clause of the query selects only some of the properties of every employee. Line 29 assigns the result of the query to the DataGridView's DataSource property. When you run the program, you will see all the records from the Employees table.
Lines 18-27 are a simple LINQ query that retrieves several properties of every employee in the Employees table. You can perform different LINQ queries that suit your needs. For example, we can modify the LINQ query in lines 18-27 to show only employees who live in the USA.
var employees = from employee in database.Employees
                where employee.Country == "USA"
                select new
                {
                    employee.EmployeeID,
                    employee.FirstName,
                    employee.LastName,
                    employee.BirthDate,
                    employee.Address,
                    employee.Country
                };
You can provide controls, for example a combo box containing different countries, and modify the query based on the country selected in the combo box.
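The combo box idea could be sketched roughly as follows. This is a minimal illustration, not the tutorial's code: it filters an in-memory list so that it runs without a database, and the names EmployeeRow, CountryFilterSketch, ByCountry, and the sample rows are all assumptions. In the real form, the source would be database.Employees and the country would come from the combo box's selected item.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the generated Employee row class.
public class EmployeeRow
{
    public int EmployeeID { get; set; }
    public string FirstName { get; set; } = "";
    public string Country { get; set; } = "";
}

public static class CountryFilterSketch
{
    // Returns only the employees whose Country equals the selected value.
    // The query is deferred: it runs when the result is enumerated.
    public static IEnumerable<EmployeeRow> ByCountry(
        IEnumerable<EmployeeRow> source, string selectedCountry)
    {
        return from employee in source
               where employee.Country == selectedCountry
               select employee;
    }

    // A few made-up rows standing in for the Employees table.
    public static List<EmployeeRow> SampleData()
    {
        return new List<EmployeeRow>
        {
            new EmployeeRow { EmployeeID = 1, FirstName = "Nancy",  Country = "USA" },
            new EmployeeRow { EmployeeID = 2, FirstName = "Andrew", Country = "USA" },
            new EmployeeRow { EmployeeID = 3, FirstName = "Steven", Country = "UK" }
        };
    }
}
```

A handler for the combo box's selection change would then call something like ByCountry(source, selectedCountry) and assign the result to the grid's DataSource again.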
Modifying Database with LINQ to SQL
Mapping the database tables and their records to corresponding LINQ to SQL classes makes it even easier to manipulate databases. Once the LINQ to SQL classes are generated, you can make modifications directly on objects of those classes. For adding, the Table<TEntity> class offers the InsertOnSubmit method, to which you pass the new object of the row class to add. For deleting, we can use the DeleteOnSubmit method and pass the object to delete. To update a record, we can directly modify the properties of the object representing it. None of these operations immediately affects the actual tables and records in the database. We need to call the SubmitChanges method of the DataContext class first. To access an element of the property representing the table, we can use the ElementAt method, which accepts an integer index and returns the corresponding record object. Our example application will allow the user to check the details of every record and to add, delete, and update records using LINQ to SQL and the methods offered by the DataContext and Table<TEntity> classes.
For the example that we will create, we will be using a sample database containing a single table named Persons which contains some records. Once you have downloaded the rar file containing the sample database, open it and extract the database file inside it to a location that you can easily find. We will create an application that shows one person at a time and allows us to move through every record using navigation buttons. The application will also allow the user to add, delete, or update records. You will see how easily this can be done using LINQ to SQL classes.
Create a new Windows Forms Application and name the project LinqToSqlDemo2. Create a LINQ to SQL Classes file and name it Sample.dbml. Go to the Database Explorer and click the Connect to Database button. Choose Microsoft SQL Server Database File and click OK, then browse for the location of the Sample.mdf file you have downloaded. It will now show up in the Database Explorer as a separate node. Open the nodes and, inside the Tables node, drag the Persons table to the Object Relational Designer. Click Yes to accept the option to copy the database file to your project folder. You will now be presented with a class inside the Object Relational Designer named Person.
As you can see, it has only a few properties. We will now create the GUI that we will use to present the details of each record and also to add or delete records. Add the necessary controls and their corresponding text as shown in the GUI below. The numbers indicate the corresponding names to be used by the controls.
Number  Name
1       firstButton
2       prevButton
3       nextButton
4       lastButton
5       idTextBox
6       firstNameTextBox
7       lastNameTextBox
8       ageTextBox
9       addButton
10      deleteButton
11      updateButton
Set the idTextBox's ReadOnly property to true so it can't be modified, as it will show a primary key value. You can also set the StartPosition property of the form to CenterScreen. The buttons at the top will be used to move to the first, previous, next, or last record in the Persons table. The text boxes will be used to display the values of every field of the current person. The buttons at the bottom are used to add, delete, and update records. Clicking the addButton will clear the text boxes so they can accept new values from the user to be added to the table. Clicking the deleteButton will delete the record currently being shown. Clicking the updateButton will update the record being shown if some of its details were modified. We will be using the following code for our application:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;

namespace LinqToSqlDemo2
{
    public partial class Form1 : Form
    {
        private int currentIndex;
        private int minIndex;
        private int maxIndex;
        private bool addPending;
        private SampleDataContext database;
        private IEnumerable<Person> persons;

        public Form1()
        {
            InitializeComponent();

            database = new SampleDataContext();
            persons = from p in database.Persons
                      select p;
            // ...

                    //Add new Person
                    database.Persons.InsertOnSubmit(newPerson);
                    database.SubmitChanges();
                    maxIndex++;
                    currentIndex = maxIndex;
                    DisableButtons();
                    MessageBox.Show("Successfully added to database.", "Success",
                        MessageBoxButtons.OK, MessageBoxIcon.Information);
                    addButton.Text = "Add";
                    addPending = false;
                }
                catch
                {
                    MessageBox.Show("Failed to add new record to database. Make sure " +
                                    "that every field is not empty and in a correct " +
                                    "format", "Failed",
                                    MessageBoxButtons.OK, MessageBoxIcon.Error);
                }
            }
        }
        // ...

        private void ClearFields()
        {
            idTextBox.Text = String.Empty;
            firstNameTextBox.Text = String.Empty;
            lastNameTextBox.Text = String.Empty;
            ageTextBox.Text = String.Empty;
            firstNameTextBox.Focus();
        }
    }
}
Example 1
Lines 10-15 declare some required variables that we will use throughout our program. Line 10 declares a variable that will hold the current index of the person to show. Lines 11-12 declare variables that will hold the minimum and maximum possible indices so we can avoid out-of-range errors and disable specific navigation buttons. Line 13 will be used by the addButton later, as we will see. Line 14 declares a SampleDataContext object, which is the corresponding DataContext of the Sample database. We will use this object to call methods for adding, deleting, updating, and retrieving records from the tables of the Sample database. Line 15 declares an object of type IEnumerable<Person> which will hold the query for all the person records in the database. Recall that the result of a LINQ query implements IEnumerable<T>, so we can simply use this type in the declaration in line 15. This way we don't have to write the query again every time we need the records; we can simply use this object throughout our program. Keep in mind that, because of deferred execution, the query itself still runs each time the object is enumerated.
We will first discuss the utility methods that will be used by the application. The ClearFields method (lines 205-212) simply clears every text field and sets the focus to the firstNameTextBox. The DisableButtons method (lines 172-203) is used to disable navigation buttons once the currentIndex reaches the minimum or maximum bound. This prevents the user from moving when no more elements are available to show. It also checks whether there is 1 or 0 records left so it can disable all the navigation buttons. The ShowPersonInfo method (lines 153-169) accepts an index and retrieves a Person object using the specified index.
Lines 155-161 first check whether the Persons property contains no records, using the Count method. If so, we show an error message and return to the caller to prevent the rest of the method from executing. In line 163, we use the ElementAt method of the Persons property to retrieve the right object, passing the index as an argument. We then display the properties of the retrieved Person object in their corresponding text boxes.
Now let's go inside Form1's constructor (lines 17-33). Line 21 creates an instance of the SampleDataContext so it can be used to perform operations on the database. Lines 22-23 are a simple LINQ query that selects all the persons from the Persons property of the SampleDataContext instance, which contains every person record. We then set the currentIndex to 0 to indicate that the program should initially show the first record. Lines 27-28 set minIndex and maxIndex, which hold the minimum and maximum indices respectively. We simply assign the value 0 to minIndex. The maxIndex is calculated by obtaining the number of person records in the Persons property and subtracting 1, because indices are 0-based. We call the DisableButtons method that we created to disable the buttons that the user won't need for now. We also set addPending to false. This will be used by the handler of the addButton later.
Go back to the designer and double-click the form's title bar to generate an event handler for its Load event (lines 35-38). Inside the handler, we call ShowPersonInfo and pass the current value of currentIndex, which is 0, to show the first record in the text boxes.
We will now add the Click event handlers for the navigation buttons. In the Designer, double-click the firstButton. Use the code in lines 42-44 for the event handler. The first line calls ShowPersonInfo, passing the minIndex value to show the very first record. The value of currentIndex is then set back to the value of minIndex. We call DisableButtons to disable the firstButton and prevButton. The event handler for lastButton (lines 47-52) is the same as the firstButton's, except that it shows the details of the last record using the maxIndex variable. The Click event handlers of prevButton and nextButton (lines 54-64) are also nearly identical. They both call the ShowPersonInfo method and pass the currentIndex to show the record at that index. We also increment or decrement the currentIndex right inside the method call to adjust its value.
The event handler for addButton (lines 66-106) has the following functionality. The addButton has two states. The first state is when addPending is set to false. Clicking the button in this state will first clear the text boxes (line 70). It will then calculate the next possible PersonID to assign to the record that is about to be added (line 71). The calculated new ID is displayed in the appropriate text box. The Text of the addButton is changed to "Done" to indicate that it is waiting for the user to finish providing values for each field of the new person. The addPending variable is then set to true to move the button to its second state. The second state of the addButton is when it is waiting for the user to finish providing values. When the user clicks the addButton again while in this state, it will execute the commands to add the new record to the database. Note that everything used to add the new record to the database is enclosed in a try block so we can catch exceptions that might occur, including a FormatException or a ChangeConflictException, which the SubmitChanges method throws when an update conflicts with changes already made in the database. Lines 81-85 create a new Person object and assign the values of the text boxes to its properties. Line 88 uses the Table<TEntity>.InsertOnSubmit method and passes the newly created Person object so it will be added to the database when the DataContext.SubmitChanges method is executed. The SubmitChanges method is called in line 89 to submit the change that was made, that is, the adding of a new record to the Persons table. Line 90 increments maxIndex by one, since the number of records was increased by 1. We set the currentIndex to the value of maxIndex, call the DisableButtons method, and show a success message saying that the new record was added to the database. We change the button's caption back to "Add" and set addPending to false so it can accept new records again once clicked. The catch block simply shows an error message telling the user that a problem occurred while adding the new record to the database.
The handler for the deleteButton (lines 108-131) also encloses its code in a try block so that exceptions do not go uncaught. Line 112 uses the DeleteOnSubmit method, which accepts the object to delete from the table. To retrieve the right Person object to delete, we use the ElementAt method and pass the currentIndex to it. Since a change was made, we call the SubmitChanges method to send the change to the actual database table. We decrement the maxIndex by 1. We also adjust the currentIndex to an appropriate value. Decreasing the maxIndex while the currentIndex stays the same can make currentIndex greater than maxIndex. Therefore, we also decrement the value of the currentIndex to match the maxIndex. The person right before the deleted person will then be displayed.
The updateButton is used to update the values of the currently displayed person. When a person's details are displayed, you can change the values in the text boxes and press the updateButton to update the corresponding record in the database. The handler for the updateButton (lines 133-151) creates a Person variable that holds a reference to the currently displayed Person. The properties of this Person are then changed using the new values that the user may have provided. We then immediately call the SubmitChanges method to send the changes and update the corresponding record in the database table. Note that there is no method such as UpdateOnSubmit, which would have been similar to the two methods we have seen. You simply modify the values of the properties and call the SubmitChanges method.
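The navigation-button rule described for DisableButtons can be sketched as a small pure function, separate from any form. This is not the tutorial's DisableButtons method (whose listing is not reproduced here); the names NavigationState, NavigationSketch, and Compute are assumptions made for this illustration, and the sketch groups firstButton with prevButton and nextButton with lastButton.

```csharp
using System;

// Which groups of navigation buttons should be enabled.
public sealed class NavigationState
{
    public bool BackEnabled { get; }      // firstButton and prevButton
    public bool ForwardEnabled { get; }   // nextButton and lastButton

    public NavigationState(bool backEnabled, bool forwardEnabled)
    {
        BackEnabled = backEnabled;
        ForwardEnabled = forwardEnabled;
    }
}

public static class NavigationSketch
{
    // minIndex is the first valid index (0) and maxIndex is count - 1,
    // so maxIndex <= minIndex means there are 0 or 1 records.
    public static NavigationState Compute(int currentIndex, int minIndex, int maxIndex)
    {
        bool severalRecords = maxIndex > minIndex;
        bool atFirst = currentIndex <= minIndex;
        bool atLast = currentIndex >= maxIndex;

        return new NavigationState(
            backEnabled: severalRecords && !atFirst,
            forwardEnabled: severalRecords && !atLast);
    }
}
```

In a form, the two flags would be assigned to the Enabled property of the corresponding buttons after every change to currentIndex or maxIndex.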
LINQ to XML
The .NET Framework provides several techniques for accessing and manipulating an XML file. In the past, you could use the XmlReader or the different XML Document Object Model classes such as XmlDocument, XmlElement, and XmlComment. You could also use XML querying languages such as XPath or XQuery to select specific elements in an XML document. The problem is that you have to know these languages, which really have nothing to do with C#. That's where LINQ to XML comes in. With LINQ to XML, you can use the LINQ operators to query, manipulate, and create XML documents. If you want to learn the old, non-LINQ way of manipulating and creating XML documents, this site already offers some tutorials for those. Let's take a look at a simple example of using LINQ to XML on a preexisting XML file containing some data.
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Persons>
  <Person name="John Smith">
    <Age>30</Age>
    <Gender>Male</Gender>
  </Person>
  <Person name="Mike Folley">
    <Age>25</Age>
    <Gender>Male</Gender>
  </Person>
  <Person name="Lisa Carter">
    <Age>22</Age>
    <Gender>Female</Gender>
  </Person>
  <Person name="Jerry Frost">
    <Age>27</Age>
    <Gender>Male</Gender>
  </Person>
  <Person name="Adam Wong">
    <Age>35</Age>
    <Gender>Male</Gender>
  </Person>
</Persons>
Example 1 - A Sample XML File
Open Visual Studio, create a new Console Application project, and name it LinqToXmlPractice. Right-click the project in the Solution Explorer and add a new item. Choose XML File from the list of templates and name it sample.xml. Replace the contents of that file with the markup from Example 1. Our sample.xml file contains the XML markup that we will try to query. If you are unfamiliar with XML, there are many good tutorials on the internet that will teach you its concepts. There is also a quick introduction to XML on this site. The XML file contains one root element named Persons. It contains multiple child elements named Person. Each child element has a name attribute and contains yet another set of child elements, the Age and Gender elements. Now let's write a LINQ query to retrieve, let's say, the names of every person in the XML document. Open Program.cs and add a using statement for the System.Xml.Linq namespace.
using System.Xml.Linq;
Inside the Main method, write the following:
XDocument doc = XDocument.Load(@"..\..\sample.xml");

var names = from d in doc.Root.Elements("Person")
            select d.Attribute("name").Value;

Console.WriteLine("Names of every person from the XML file.");

foreach (var name in names)
{
    Console.WriteLine("{0}", name);
}
Example 2 - Using LINQ to query XML elements
Names of every person from the XML file.
John Smith
Mike Folley
Lisa Carter
Jerry Frost
Adam Wong
A full explanation of the code will be given in the upcoming lessons. But for now, you can see how easy it is to get any data you want from the elements of an XML document.
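As a further illustration of querying the same markup, the following sketch filters and orders the Person elements by their Age child element. It parses the XML from a string so that it runs without a file on disk; the names XmlQuerySketch, SampleXml, and NamesOlderThan are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class XmlQuerySketch
{
    // The same data as the sample XML file, inlined as a string.
    public const string SampleXml =
        "<Persons>" +
        "<Person name='John Smith'><Age>30</Age><Gender>Male</Gender></Person>" +
        "<Person name='Mike Folley'><Age>25</Age><Gender>Male</Gender></Person>" +
        "<Person name='Lisa Carter'><Age>22</Age><Gender>Female</Gender></Person>" +
        "<Person name='Jerry Frost'><Age>27</Age><Gender>Male</Gender></Person>" +
        "<Person name='Adam Wong'><Age>35</Age><Gender>Male</Gender></Person>" +
        "</Persons>";

    // Returns the names of persons older than minAge, youngest first.
    public static List<string> NamesOlderThan(string xml, int minAge)
    {
        XDocument doc = XDocument.Parse(xml);

        var names = from person in doc.Root.Elements("Person")
                    // The explicit cast converts the element's text to an int.
                    let age = (int)person.Element("Age")
                    where age > minAge
                    orderby age
                    select (string)person.Attribute("name");

        return names.ToList();
    }
}
```

Calling XmlQuerySketch.NamesOlderThan(XmlQuerySketch.SampleXml, 26) yields Jerry Frost, John Smith, and Adam Wong, in that order.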
Creating an XML Document Using LINQ to XML
If you have used the XML Document Object Model classes to create XML documents from scratch, it will be easy for you to migrate to a brand new set of classes which allow you to create XML elements in a much more natural and easier way. These classes are located inside the System.Xml.Linq namespace. The classic way of loading an XML document is by using the XmlDocument class from the System.Xml namespace. Its Load method is an instance method, so you create the document first.
XmlDocument doc = new XmlDocument();
doc.Load("myXmlFile.xml");
Loading is just as simple with the newer XDocument class, whose Load method is static.
XDocument doc = XDocument.Load("myXmlFile.xml");
So what do these new classes offer besides shorter names than their older versions? The answer is functional construction. With functional construction, you don't need to declare the XML DOM elements one by one. You can simply use the overloaded constructors of each X class to suit your needs. This is clearer when you look at an example. Using the old XML DOM classes of System.Xml, the following code creates a simple XML file.
//A new XML Document
XmlDocument doc = new XmlDocument();

//Xml Declaration
XmlDeclaration declaration = doc.CreateXmlDeclaration("1.0", "utf-8", "yes");
//Attach declaration to the document
doc.AppendChild(declaration);

//Create a comment
XmlComment comment = doc.CreateComment("This is a comment");
//Attach comment to the document
doc.AppendChild(comment);

//Create root element
XmlElement root = doc.CreateElement("Persons");
//Attach the root node to the document
doc.AppendChild(root);

//Create a Person child element
XmlElement person1 = doc.CreateElement("Person");
//Add an attribute name with value John Smith
person1.SetAttribute("name", "John Smith");
//Create Age element
XmlElement person1Age = doc.CreateElement("Age");
person1Age.InnerText = "30";
//Create Gender element
XmlElement person1Gender = doc.CreateElement("Gender");
person1Gender.InnerText = "Male";

//Attach Age and Gender element to the Person element
person1.AppendChild(person1Age);
person1.AppendChild(person1Gender);

//Attach Person child element to the root Persons element
doc.DocumentElement.AppendChild(person1);

//Create another Person child element
XmlElement person2 = doc.CreateElement("Person");
//Add attribute name with value Mike Folley
person2.SetAttribute("name", "Mike Folley");
//Create Age element
XmlElement person2Age = doc.CreateElement("Age");
person2Age.InnerText = "25";
//Create Gender element
XmlElement person2Gender = doc.CreateElement("Gender");
person2Gender.InnerText = "Male";

//Attach Age and Gender element to the Person element
person2.AppendChild(person2Age);
person2.AppendChild(person2Gender);

//Attach second Person child element to the root Persons element
doc.DocumentElement.AppendChild(person2);

//Save the constructed XML into an XML file
doc.Save(@"C:\sample1.xml");
Example 1 - Using XML DOM classes to create a file
The above code will produce the following XML:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<!--This is a comment-->
<Persons>
  <Person name="John Smith">
    <Age>30</Age>
    <Gender>Male</Gender>
  </Person>
  <Person name="Mike Folley">
    <Age>25</Age>
    <Gender>Male</Gender>
  </Person>
</Persons>
Now let's take a look at using the new LINQ to XML classes to create the very same XML markup.
XDocument doc = new XDocument(
    new XDeclaration("1.0", "utf-8", "yes"),
    new XComment("This is a comment"),
    new XElement("Persons",
        new XElement("Person",
            new XAttribute("name", "John Smith"),
            new XElement("Age", new XText("30")),
            new XElement("Gender", new XText("Male"))),
        new XElement("Person",
            new XAttribute("name", "Mike Folley"),
            new XElement("Age", new XText("25")),
            new XElement("Gender", new XText("Male")))));

doc.Save(@"C:\sample2.xml");
Example 2 - Using LINQ to XML classes to create an XML document
As you can see, using the XML DOM classes to construct an XML document requires you to declare each component and attach each element to its proper parent one by one. Using the LINQ to XML classes, every part of the XML document you create is instantiated inside the constructor of its parent. This method of creating XML markup is called functional construction. Each LINQ to XML class's constructor accepts arguments that may serve as child elements or values of the element it represents. For example, XElement has the following overload of its constructor:
XElement(XName name, params object[] content)
The first parameter accepts the name of the element. You can simply pass a string as the name and it will automatically be converted to an XName instance. The second parameter is special because it allows you to pass any number of different kinds of objects, such as an XText, an XAttribute, or an XElement that will be a child element of the current XElement. Other LINQ to XML classes, such as XComment and XText, accept only one argument: a string that represents the text they will render. The following table shows some of the LINQ to XML classes you can use, together with the part of an XML document they represent.
Class         Description
XDocument     Represents an XML document
XDeclaration  Represents an XML declaration
XElement      Represents an XML element
XAttribute    Represents an attribute of an XML element
XComment      Represents a comment
XText         Represents the inner text of an XML element
Figure 1 - LINQ to XML Classes
For example, if you want to create an element named Person with a name attribute of value John Smith, here's the code to do that.
var personElement = new XElement("Person",
                        new XAttribute("name", "John Smith"));
If you want to add a child element for that person, simply add another argument to the constructor of XElement.
var personElement = new XElement("Person",
                        new XAttribute("name", "John Smith"),
                        new XElement("Age", new XText("30")));
The child element has the name Age, and the next argument to its constructor is its inner text, represented by the XText class. You can also see in Example 2 how every XElement argument was indented. This is to make the code look more like the way we indent XML elements, so you can already see what the XML output will look like simply by looking at the code. At the end of Example 2 is a call to the XDocument.Save method. This method allows you to save the constructed XML, which is currently in memory, to an external XML file. The method accepts a string argument which represents the path of the file. If the file does not exist, it will be created. If the file is already there, it will be overwritten. There are other overloads of this method, but for now, this version is sufficient to save the constructed XML to a file.
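Because the content parameter accepts any object, including a sequence of elements, functional construction also combines naturally with a LINQ query. The sketch below builds the Persons element from an in-memory collection instead of writing each Person by hand. It is only an illustration under assumed names: FunctionalConstructionSketch, BuildPersons, and the tuple fields are not part of the tutorial, and the values are passed directly rather than wrapped in XText.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FunctionalConstructionSketch
{
    // Builds a <Persons> element with one <Person> child per input item.
    public static XElement BuildPersons(
        IEnumerable<(string Name, int Age, string Gender)> people)
    {
        return new XElement("Persons",
            // The query result is a sequence of XElement objects;
            // XElement adds each item of the sequence as a child.
            from p in people
            select new XElement("Person",
                new XAttribute("name", p.Name),
                new XElement("Age", p.Age),
                new XElement("Gender", p.Gender)));
    }
}
```

For example, passing the two persons from Example 2 (John Smith, 30, Male and Mike Folley, 25, Male) produces the same Persons element as the hand-written constructor calls, and the result can be placed inside an XDocument and saved with the Save method.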