The Microsoft Developer Tools Roadmap 2003-2005 describes the innovations and
enhancements within the "Whidbey" release of Visual Studio and the .NET Framework.
Within this is an overview of the data access support in ADO.NET, covered by
the System.Data and System.Xml
namespaces. This white paper details the innovations and enhancements in the
data access layer in "Whidbey" when retrieving data from data sources, particularly
SQL Server. It covers the growth of ADO.NET in "Whidbey" to encompass the relational,
object and XML data models, and explains typical usage scenarios for each with
development guidelines and recommendations.
Data Is Relational Is Objects Is XML Is Data (ROX)
A driving requirement in the v1.0 release of ADO.NET was the close integration
between relational data and XML, epitomized by the flexibility of the Dataset
class to both read and write XML into relational table structures and with the
ability to persist its internal structure as an XML schema. Also, through the
ExecuteXmlReader method on the SqlCommand
class you are able to issue a FOR XML SQL command against SQL Server to return
XML to the client, utilizing the server's support for generating XML. In the "Whidbey"
release of ADO.NET, this close integration between differing data representations
is further expanded to include objects, i.e., the ability to work with your data
as objects that encapsulate behavior and persist themselves to SQL Server. Furthermore,
the XML support has been extended in ADO.NET by providing XML views over relational
tables to shape your data according to an XML schema. Breaking down the boundaries
between these differing data models so that developers are able to provide layers
of abstraction that suit their application needs is a primary goal of the data
access support in "Whidbey". Data access is no longer simply confined to the relational
table and row model, but is also exposed as objects and XML. Hence the observation
that data is relational is objects is XML is data.
The Big Picture
First, let's provide a 10,000 feet overview of the data access support in "Whidbey".
Relational Data Access in "Whidbey"
The relational APIs in ADO.NET were the primary data access mechanism to SQL
Server and other relational database in the v1.0 release of the .NET Framework.
The key feature improvements in the "Whidbey" release are strictly incremental
on these existing APIs to address performance, usability and to expose functionality
in the latest release of SQL Server, code named "Yukon". Performance is addressed
through asynchronous data access and the ability to perform batch updates, rather
than an update per row as currently performed. Improved usability is provided
through paging support. "Yukon" features are support for User Defined Types
(UDTs) that extend the database scalar type values, i.e., you can define your
own Point class that encapsulates methods and data,
Multiple Active Result Sets (MARS) that provide more than one simultaneous pending
request per connection and support for the new XML data type that provides the
ability to store XML documents in relational tables. There are other features
that are detailed later in this document. In essence the relational APIs provide
the fastest access and greatest control over access to data stored in SQL Server
and other relational databases.
Object Data Access in "Whidbey"
New to ADO.NET in "Whidbey" is Objectspaces, an object persistence data access
layer, built upon SQL Server. This provides persistence for your object model
through an API with declarative mapping to a SQL Server database, providing
object-oriented query services. By working with an object model developers are
abstracted away from the underlying relational table structure thereby allowing
them to concentrate on the application's business rules that are represented
by the VB or C# classes. This is achieved via a mapping layer that maps relational
structure and types to CLR classes and types thereby providing seamless integration
between the programming language and the data storage. The OO paradigm of encapsulation
of behavior as methods and data as properties; that is, the success of the majority
of modern programming languages, coupled with the CLR type system is the essence
and foundation of this data access technology.
XML Data Access and System.Xml in "Whidbey"
SQL Server 2000 has built in support for XML today via SQLXML, a set of native
APIs that expose SQL Server tables as XML using XPath as the query language.
This SQLXML technology has now been integrated into ADO.NET as a fully managed
implementation. This provides streaming and shaping of XML via W3C XML schemas
to and from SQL Server by defining an XML view over the relational tables. In
the same way that Objectspaces provides an abstraction layer over SQL Server
for objects, SQLXML provides an abstraction layer for XML, both of which share
the same underlying mapping technology. The W3C XQuery language is used to query
the data as XML and schemas are used to shape the data. In the case of SQLXML
the mapping layer, or XML Views, map relational structure and types to hierarchal
XML and schema types, thereby providing seamless integration between the XML
data model and the data storage. This enables you to take advantage of layered
XML services, such as schema validation and XSLT transformations, and most important,
share your relational data with business partners as XML documents and Web Services.
Within the System.Xml namespace, there are a number
of enhancements for usability, performance and standards. For usability, the
XmlReader, XmlWriter and XPathNavigator
classes now expose CLR type accessors providing the ability to read and write
CLR types to and from XML documents. A new XmlFactory
class provides a one-stop-shop to create these classes. For increased usability,
the XPathDocument class has been enhanced to become
a better DOM and is set to replace the XmlDocument
class as the preferred XML store. Finally, performance has improved throughout
System.Xml with notable exceptions being 2x performance
improvement for stream-based parsing and writing of XML with the XmlTextReader
and XmlTextWriter and a 4x performance improvement
for XSLT.
Data Access Design Approaches
There is no single answer as to which data access design approach should be used,
since the decisions are always multi-faceted and driven by the types of applications
that you are building. However, the following are guidelines and recommendations
when considering how to retrieve data in the "Whidbey" release of ADO.NET with
the respective strengths and typical usage scenarios. In reality due to the close
integration throughout ADO.NET, combined with the flexibility that they provide,
you will most likely use a combination of these data access technologies to develop
your solution.
ADO.NET Relational APIs
Use the relational APIs if you need the fastest possible data access, finest
grain of control using database features and the richest design time experience.
Strengths
- Highest performance and explicit fine-grained control
- Exposes database functionality
- Best design time experience with Visual Studio
- Operates over multiple RBDMS, e.g., SQL Server and Oracle
Usage Scenarios
- You are comfortable with the relational model
- You require maximum control/performance/functionality
- You are using data bound controls in Windows Forms and ASP.NET applications
- You are creating dynamic reporting style applications
"Whidbey" Features
- Asynchronous Data Access – With true asynchronous data access consistent
with the .NET asynchronous design patterns, this provides for responsive applications.
For client applications, this prevents blocking on the UI thread and for server
applications, this avoids blocking on threads.
- Multiple Active Results support for Yukon – SQL Server has historically
limited a connection to a single active set of results, meaning that once
a command was executed, no additional work could be performed on the connection
until all of the results were retrieved. Users commonly employed server-side
cursors to work around this limitation, resulting in poorly performing and
scaling applications. Multiple Active Results support in Yukon allows results
to be retrieved without tying up the connection and without resorting to less
performance and scalable cursor solutions. Coupled with asynchronous support,
this allows applications to efficiently fire off multiple simultaneous commands
on a single connection and retrieve results as they are available.
- Batch Updates – Currently the DataAdapter Update
method does a round trip per row. With support for batch updates in "Whidbey",
you can set the batch size on the adapter thereby providing less rounds trips
to the server and significant improvements on update processing.
- Paging – By providing paging support in the API, you no longer have
to write the logic to retrieve the next set of results based upon a SQL query.
By providing the starting row and the number of rows to the ExecutePageReader
method, you are able to page results sets back to the client. The one caveat
is that there are no consistency guarantees and rows can be skipped or repeated
if the underlying data changes.
- Common Provider Model – A set of base classes that provide a common
API that enables developers to write data access code against any "Whidbey"
managed provider.
- BulkCopy into SQL Server – In the "Whidbey" release, SqlClient supports
new BulkCopy support for efficiently loading data from a DataSet,
array of DataRows, or any DataReader
into SQL Server.
- SqlNotification support for SQL Server "Yukon" – Traditionally, as
soon as data is retrieved from a database, it can become invalid due to changes
made by other processes to the underlying store. With SqlNotifications,
you can find out when a change to the underlying store would affect the results
of your query, and take appropriate action such as discarding, updating or
marking results as no longer current.
- Support for new SQL Server "Yukon" Datatypes – SQL Server "Yukon"
supports a set of new datatypes, including the ability to define columns typed
as UDT (user-defined types) or XML datatypes. Using SqlClient, UDT result
columns and parameter values can be passed as standard CLR objects. XML typed
columns are retrieved as XmlReaders, which can be used with other components
within System.Xml; e.g., to load data into an XPathNavigator.
- DataSet Remoting – One of the common observations in v1.0 was that
Datasets were relatively slow to remote using CLR
remoting when compared to other binary serialized representations. In "Whidbey",
a binary serialization format is now used for CLR remoting of DataSets,
which not only reduces the parsing time but also reduces the memory usage
and size of data transferred.
- Loading a DataSet from a DataReader – In ADO.NET version 1.0, DataSets
could be filled from a command using a DataAdapter,
loaded from XML, or populated by creating rows and filling in values. In "Whidbey",
DataSets support loading data from any DataReader.
Additionally, you can obtain a DataReader from the
DataSet. This use of DataReader
as a common streaming mechanism to/from the DataSet
enables pluggability of data components similar to how the XmlReader
enables pluggability of XML components.
- DataSet XML Support – The DataSet in "Whidbey"
has enhanced support for loading and saving XML. The DataSet
now supports loading XML with multiple in-line schemas interleaved with data,
loading schemas where the same element name exists in multiple different namespaces,
and loading/persisting XML from individual DataTables.
ADO.NET with Objectspaces
Use ObjectSpaces if you have a strong business logic layer.
Strengths
- Business level objects encapsulating behavior and data
- Relational mapping via metadata
- Decoupled from database schema
Usage Scenarios
- You are using an object-oriented design and development process
- You want to completely separate your business logic from your data access
logic
- You are working with an existing object model
- You model your data as objects
Objectspaces is ideal when you want to preserve your own object model creating
an object-orientated business logic layer over SQL Server with no need to inherit
from base classes and with no generated code to manage. The object to database
mapping layer isolates the object model from the relational data model and being
declarative can be re-used across multiple applications.
Most important, Objectspaces provides an object query capability through a
language called OPath that allows developers to be more productive by formulating
queries in terms of objects. With the ability to perform identity and change
tracking, the update operations are seamlessly performed to SQL Server by generating
SQL statements based upon the mapping metadata.
ADO.NET with SQLXML and System.Xml
Use SQLXML when you want Business to Business (B2B) interoperability by streaming,
shaping and aggregating XML via an XML schema over SQL Server.
Strengths
- Interoperability. Data interchange format for the Web – B2B, A2A
- Easily supports sparse (semi-structured) data – documents
- Standards support such as XQuery and XML Schema
- Relational mapping via metadata
- Decoupled from database schema
Usage Scenarios
- You have multiple data sources, not just relational data, e.g., aggregating
data from XML Web Services and XML documents
- Your applications evolve with a business schema as the contract, such as
vertical industry schemas, e.g., XBRL, RIXML, FinXML
- You are using data bound controls for XML data in Windows Forms and ASP.NET
applications
The SQLXML programming model provides the ability to query data sources exposed
as XML, which can then be manipulated with the XML programming model or shared
as a Web Service and then the changes can be updated back to the SQL Server.
The combination of XML Views and XQuery over SQL Server constitutes SQLXML in
ADO.NET, which shapes relational data as XML. Most important, XML documents
retrieved from business partners can be updated directly to SQL Server. This
provides the flexibility in that the document can contain missing or additional
data and the update will always "just work" without requiring any pre-processing,
providing it is mapped in the XML View. Equally the System.Xml
namespace provides an XML programming model for streaming, storing, querying
and transforming XML that works in conjunction with the SQLXML support in ADO.NET.
Design Considerations
By the virtue of their data model, the relational and object APIs (Objectspaces)
represent structured data and that means data that is surfaced from and sent to
relational databases or other structured data stores is defined by a tabular schema.
The guidelines as to when to use the relational APIs (DataSet
and DataReader) as opposed to Objectspaces typically
depend on the experience of the personnel in your organization.
If you have invested in developers who understand the SQL query language and
DBAs who are skilled in database design and you need the fastest possible access
and complete of control over your queries, then you are most likely to adopt
the relational API. Equally at the other end of the spectrum, the ADO.NET relational
APIs are ideally suited for building dynamic reporting style applications where
data is retrieved into a disconnected DataSet and bound
to a set of Window Forms or ASP.NET controls using the rich design time support
built into Visual Studio.
If on the other hand you are an organization that has invested in object-oriented
programmers who are skilled in the ability to model business rules as interactions
between objects with behavior in place of DBAs, then Objectspaces is your most
likely choice. Here a lead designer or system architect uses design-time tools
in Visual Studio to design the mapping from a relational table structure to
an object structure and sets in place a framework of classes for the application
development. A team of developers, who do not necessarily understand the intricacies
of the SQL query language, are then able to retrieve and update data implementing
the business rules via the object layer. Writing business rules encapsulated
with the data is the strength of Objectspaces. Whether using an object or a
relational API, in both cases the UI is either ASP.NET or Windows Forms utilizing
the data binding support.
XML is the data interchange format on the Web. The XML APIs with SQLXML come
into play when you need to send data stored in SQL Server to another application
with the flexibility and interoperability that the XML data model provides.
Documents provide a more flexible data structure by supporting sparse (semi-structure)
as well as structured data, e.g., a medical document may contain only some of
the patient information with additional notes that contain special text formatting.
The developers are skilled in XML query languages such as XPath to query the
data and in the authoring or use of XML schemas to determine the business to
business contract between the parties. However, the data sources are not restricted
to just SQL Server, with the ability to aggregate data from other sources such
as XML Web Services and XML documents. In other words, the common representation
in a Web-connected environment across all these data stores is XML. Relational
databases are just another data store in the web of data exchange with the ability
to surface their data as XML through SQLXML. The fact is that this is just data
as XML messages on the wire but, with explicit boundaries, is the basis of Service
Orientated Architectures (SOA).
|