Below are the topics I am going to share about SQL Server Architecture:
- Database Transactions
- SQL Server Network Interface
- Client - Server Communication
- Server Engines
- Life cycle of a select(read) query
- Life cycle of an update(write) query
- Recovery models and intervals
- Dirty page, checkpoint and lazy writer
1. Database Transactions
A transaction is a unit of work in a database that typically contains several commands that read from and write to the database.
Types of transactions
Implicit
Implicit transactions are used automatically by SQL Server to guarantee the ACID properties of single commands.
Explicit
Explicit transactions are started by using the BEGIN TRANSACTION T-SQL command and are stopped by using the COMMIT TRANSACTION or ROLLBACK TRANSACTION commands.
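As a minimal sketch of an explicit transaction (the table dbo.Accounts and its columns are made up for illustration, not part of the original post):

BEGIN TRY
    BEGIN TRANSACTION;
    -- Two writes that must succeed or fail together (atomicity)
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountId = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountId = 2;
    COMMIT TRANSACTION;   -- make the changes permanent (durability)
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;  -- undo everything since BEGIN TRANSACTION
END CATCH;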
ACID Properties
Atomicity
Atomicity means that all the effects of the transaction must complete successfully or the changes are rolled back.
Consistency
The consistency requirement ensures that the transaction cannot break the integrity rules of the database; it must leave the database in a consistent state.
Isolation
Isolation refers to keeping the changes of incomplete transactions running at the same time separate from one another.
Durability
Once a transaction is committed, it must persist even if there is a system failure; that is, it must be durable.
2. SQL Server Network Interface (SNI)
Shared memory:
Simple and fast, shared memory is the default protocol used to connect from a client running on the same computer as SQL Server.
TCP/IP:
TCP/IP is the most commonly used access protocol for SQL Server. A default instance listens on TCP port 1433; for a named instance, the SQL Browser service (listening on UDP port 1434) directs the client to the right port.
Named Pipes:
Named Pipes was developed for local area networks (LANs). Named Pipes uses TCP port 445.
VIA:
Virtual Interface Adapter is a protocol that enables high-performance communications between two systems. It requires specialized hardware at both ends and a dedicated connection.
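A rough way to check which protocol each connection ended up using is the sys.dm_exec_connections DMV; a minimal sketch:

-- net_transport shows Shared memory, TCP or Named pipe for each connection
SELECT session_id,
       net_transport,    -- protocol in use
       local_tcp_port,   -- populated only for TCP connections
       auth_scheme
FROM sys.dm_exec_connections;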
3. Client - Server Communication
TDS (Tabular Data Stream):
TDS is a Microsoft-proprietary protocol. Once a connection has been made using a network protocol such as TCP/IP, a link is established to the relevant TDS endpoint, which then acts as the communication point between the client and the server. The SELECT statement is sent to the SQL Server as a TDS message across a TCP/IP connection.
Protocol Layer:
The protocol layer is responsible for packaging up results and status messages to send back to the client as TDS messages. Our SELECT statement is marked in the TDS packet as a message of type “SQL Command,” so it is passed on to the next component, the Command Parser, to begin the path toward execution.
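To see the TDS endpoints that SQL Server creates for each protocol, the sys.endpoints catalog view can be queried; a minimal sketch:

-- Each network protocol gets its own TSQL (TDS) endpoint
SELECT name, protocol_desc, type_desc, state_desc
FROM sys.endpoints;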
4. Server Engines
- Relational Engine
- Storage Engine
Relational Engine:
Command Parser
The Command Parser’s role is to handle T-SQL language events. It first checks the syntax and returns any errors back to the protocol layer to send to the client.
Optimizer
The Optimizer is the most prized possession of the SQL Server team and one of the most complex and secretive parts of the product. It is what’s known as a “cost-based” optimizer, which means that it evaluates multiple ways to execute a query and then picks the method that it deems will have the lowest cost to execute.
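One way to see the cost the Optimizer assigned to a plan is SET SHOWPLAN_ALL, which returns the estimated plan (including a TotalSubtreeCost column) instead of executing the query. The table dbo.Orders is a hypothetical example:

SET SHOWPLAN_ALL ON;
GO
-- The rows returned describe the estimated plan, including TotalSubtreeCost,
-- rather than the query results
SELECT * FROM dbo.Orders WHERE OrderId = 42;
GO
SET SHOWPLAN_ALL OFF;
GO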
Query Executor
It executes the query plan by working through each step it contains and interacting with the Storage Engine to retrieve or modify data.
Storage Engine:
Access Methods
Access Methods is a collection of code that provides the storage structures for data and indexes, as well as the interface through which data is retrieved and modified. It passes the request to the Buffer Manager.
Transaction Manager
The Transaction Manager has two components:
- Lock Manager
- Log Manager
The Lock Manager is responsible for providing concurrency to the data, and it delivers the configured level of isolation by using locks. The Log Manager is responsible for writing the details of changes to the transaction log, as described in the life cycle of an update query below.
Buffer Manager - Buffer Pool
The buffer pool is the largest consumer of memory in SQL Server. The buffer pool contains all the different caches in SQL Server, including the plan cache and the data cache.
Plan cache
The plan cache, part of SQL Server’s buffer pool, is used to store execution plans in case they are needed later.
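A common way to peek into the plan cache is to join sys.dm_exec_cached_plans with sys.dm_exec_sql_text; a minimal sketch:

-- How often each cached plan has been reused, and the text it belongs to
SELECT cp.usecounts,
       cp.objtype,
       cp.size_in_bytes,
       st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st;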
Data cache
The sys.dm_os_buffer_descriptors DMV can be used to see how much space each database is using in the data cache.
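A sketch of that kind of query (the grouping and formatting are mine, but the DMV and columns are standard):

-- Space used in the data cache per database, in MB (8 KB per page)
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*) * 8 / 1024  AS cache_size_mb
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY cache_size_mb DESC;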
5. Life cycle of a select(read) query:
- The SQL Server Network Interface (SNI) on the client establishes a connection to the SNI on the SQL Server using a network protocol such as TCP/IP.
- It then creates a connection to a TDS endpoint over the TCP/IP connection and sends the SELECT statement to SQL Server as a TDS message.
- The SNI on the SQL Server unpacks the TDS message, reads the SELECT statement, and passes a “SQL Command” to the Command Parser.
- The Command Parser checks the plan cache in the buffer pool for an existing, usable query plan. If it doesn’t find one, it creates a query tree based on the SELECT statement and passes it to the Optimizer to generate a query plan.
- The Optimizer generates a “trivial” plan in the pre-optimization phase.
- The query plan is then passed to the Query Executor for execution.
- At execution time, the Query Executor determines that data needs to be read to complete the query plan, so it passes the request to the Access Methods in the Storage Engine via an OLE DB interface.
- The Access Methods need to read a page from the database to complete the request from the Query Executor, so they ask the Buffer Manager to provision the data page.
- The Buffer Manager checks the data cache to see if it already has the page. If it is not in cache, it pulls the page from disk, puts it in cache, and passes it back to the Access Methods (the sketch after this list shows one way to observe this).
- Finally, the Access Methods pass the result set back to the Relational Engine to send to the client.
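A simple way to watch the Buffer Manager at work is SET STATISTICS IO: on the first run of a query you typically see physical reads (pages pulled from disk into cache), and on the second run only logical reads served from the data cache. The table dbo.Orders is hypothetical:

SET STATISTICS IO ON;

-- First execution: expect physical reads as pages come off disk into cache
SELECT * FROM dbo.Orders WHERE CustomerId = 7;

-- Second execution: expect logical reads only, served from the data cache
SELECT * FROM dbo.Orders WHERE CustomerId = 7;

SET STATISTICS IO OFF;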
6. Life cycle of an update(write) query:
1. For an update query the process is exactly the same as for the SELECT statement until the request reaches the Access Methods.
2. Here the Access Methods need to make a data modification, so before the I/O request is passed on, the details of the change need to be persisted to disk.
3. The Access Methods code requests that the changes it wants to make are logged, and the Log Manager writes the changes to the transaction log. This process is known as Write-Ahead Logging. The update operation has now been logged.
4. The actual data modification can only be performed when confirmation is received that the operation has been physically written to the transaction log.
5. Once the Access Methods receive confirmation, they pass the modification request on to the Buffer Manager to complete.
6. The page that needs to be modified is already in cache, so all the Buffer Manager needs to do is modify the page as requested by the Access Methods.
7. The page is modified in the cache, and confirmation is sent back to the Access Methods and finally to the client.
The important point here is that the UPDATE statement has changed the data in the data cache, not in the actual database file on disk. This is done for performance reasons, and the page is now called a dirty page because it is different in memory than it is on disk.
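The same undocumented fn_dblog function used later in this post can be used to see the write-ahead log records an update produces; a rough sketch (operation names can vary by version):

-- Log records in the current database; an UPDATE typically shows up as
-- LOP_BEGIN_XACT, LOP_MODIFY_ROW and LOP_COMMIT_XACT records
SELECT [Current LSN], [Operation], [Transaction ID], [Transaction Name]
FROM ::fn_dblog(null, null)
WHERE [Operation] IN ('LOP_BEGIN_XACT', 'LOP_MODIFY_ROW', 'LOP_COMMIT_XACT');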
7. Recovery models and intervals
In the previous section, on the life cycle of an UPDATE query, we saw Write-Ahead Logging, by which SQL Server maintains the durability of any changes. Modifications are written to the transaction log first and are then actioned in memory. The process of bringing the data files up to date with the transaction log (rolling committed changes forward and uncommitted changes back) is known as recovery.
Recovery Models:
SQL Server has three database recovery models:
1. Full
2. Bulk-logged
3. Simple
The recovery model we select affects the way our transaction log is used and how big it grows, our backup strategy, and our restore options.
Full
All database operations are fully logged in the transaction log, and you must have a backup strategy that includes full backups and transaction log backups. Starting with SQL Server 2005, full backups don’t truncate the transaction log. This is so that the sequence of transaction log backups is not broken, which gives us an extra recovery option if our full backup is damaged.
Databases that require the highest level of recoverability should use this recovery model.
Bulk-Logged
This recovery model is intended to be used temporarily to improve the performance of certain bulk operations by minimally logging them; all other operations are fully logged just like the full recovery model. This can improve performance because only the information required to roll back the transaction is logged. Redo information is not logged, which means that point-in-time recovery is not possible.
These bulk operations include:
- BULK INSERT
- Using the bcp executable
- SELECT INTO
- CREATE INDEX
- ALTER INDEX REBUILD
- DROP INDEX
Simple
In the simple recovery model, all committed transactions are truncated from the transaction log every time a checkpoint occurs. This keeps the size of the log to a minimum and means that transaction log backups are not necessary. If the potential to lose all the changes since the last full or differential backup still meets the business requirements, then simple recovery might be the way to go.
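To check and change the recovery model with T-SQL (the database name AdventureWorks is just a placeholder):

-- Current recovery model of every database
SELECT name, recovery_model_desc
FROM sys.databases;

-- Switch a database between the three models
ALTER DATABASE AdventureWorks SET RECOVERY FULL;
-- ALTER DATABASE AdventureWorks SET RECOVERY BULK_LOGGED;
-- ALTER DATABASE AdventureWorks SET RECOVERY SIMPLE;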
Recovery Interval
Recovery interval is a server configuration option used to influence the time between checkpoints, and therefore the time it takes to recover a database on startup, which is where the option gets its name. By default the recovery interval is set to 0, which allows SQL Server to choose an appropriate interval; a value greater than 0 represents the number of minutes we want to allow between checkpoints.
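A sketch of changing it with sp_configure (it is an advanced option, so 'show advanced options' has to be enabled first):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- Allow roughly 5 minutes between checkpoints instead of letting SQL Server decide
EXEC sp_configure 'recovery interval (min)', 5;
RECONFIGURE WITH OVERRIDE;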
8. Dirty page, checkpoint and lazy writer
Dirty Pages
When a page is read from disk into memory it is regarded as a clean page because it is exactly the same as its counterpart on the disk. However, once the page has been modified in memory it is marked as a dirty page.
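Dirty pages can be counted per database from sys.dm_os_buffer_descriptors using its is_modified flag; a minimal sketch:

-- Pages in the data cache that differ from their on-disk counterparts
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*)             AS dirty_pages
FROM sys.dm_os_buffer_descriptors
WHERE is_modified = 1
GROUP BY database_id
ORDER BY dirty_pages DESC;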
Checkpoint
A checkpoint is a point in time created by the checkpoint process at which SQL Server can be sure that any committed transactions have had all their changes written to disk. This checkpoint then becomes the marker from which database recovery can start.
Checkpoints occur at the database level.
To find when a checkpoint occurred, use the undocumented function fn_dblog:
select * from ::fn_dblog(null,null) WHERE [Operation] like '%CKPT'
Enabling trace flag 3502 will also write information to the error log about when a checkpoint started and in which database:
DBCC TRACEON(3502, -1)
Related links:
http://msdn.microsoft.com/en-us/library/ms188748.aspx
http://support.microsoft.com/kb/815436
Lazywriter
The lazywriter is a thread that periodically checks the size of the free buffer list. When it is low, it scans the whole data cache to age out any pages that haven’t been used for a while. If it finds any dirty pages that haven’t been used for a while, they are flushed to disk before being marked as free in memory.
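Lazywriter activity can be observed through the 'Lazy writes/sec' counter exposed in sys.dm_os_performance_counters (the object name varies with the instance name, hence the LIKE):

-- Raw (cumulative) value behind the 'Lazy writes/sec' Buffer Manager counter
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name = 'Lazy writes/sec'
  AND object_name LIKE '%Buffer Manager%';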
Difference between Checkpoint and Lazywriter
A checkpoint writes dirty pages to disk so that database recovery has a known starting point; its purpose is to limit recovery time, and it does not free the pages from the data cache. The lazywriter writes dirty pages to disk only so that it can mark unused pages as free; its purpose is to keep the free buffer list at a healthy size, that is, to manage memory.