Q1. Tell me something about the SQL Server Architecture?
SQL Server is divided into two main engines: Relational
Engine and Storage Engine.
Relational Engine components: Cmd Parser, Optimizer, Query
Executor
Storage Engine components: Access Methods code, Buffer
Manager, Transaction Manager
Q2. What is Relational Engine and its Role?
The Relational Engine is also sometimes called the query
processor because its primary function is query optimization and execution.
The main responsibilities of the relational engine are:
Parsing the SQL statements.
The parser scans an SQL statement and breaks it down into
the logical units, such as keywords, parameters, operators, and identifiers.
The parser also breaks down the overall SQL statement into a series of smaller
logical operations.
Optimizing the execution plans.
Typically, there are many ways that the server could use
data from the source tables to build the result set. The query optimizer
determines what these various series of steps are, estimates the cost of each
series (primarily in terms of file I/O), and chooses the series of steps that
has the lowest cost. It then combines the specific steps with the query tree to
produce an optimized execution plan.
Executing the series of logical operations defined in the
execution plan.
After the query optimizer has defined the logical operations
required to complete a statement, the relational engine steps through these
operations in the sequence specified in the optimized execution plan.
Processing Data Definition Language (DDL) and other
statements.
These statements are not the typical SELECT, INSERT, UPDATE,
or DELETE statements; these statements have special processing needs. Examples
are the SET statements to set connection options, and the CREATE statements to
create objects in a database.
Formatting results.
The relational engine formats the results returned to the
client. The results are formatted as either a traditional, tabular result set
or as an XML document. The results are then encapsulated in one or more TDS
packets and returned to the application.
Q3. What is Storage Engine and its Role?
The Storage Engine is responsible for managing all I/O to
the data. The main responsibilities of the storage engine include:
·
Managing the files on which the database is
stored and managing the use of space in the files.
·
Building and reading the physical pages used to
store data.
·
Managing the data buffers and all I/O to the
physical files.
·
Controlling concurrency. Managing transactions
and using locking to control concurrent user access to rows in the database.
·
Logging and recovery.
·
Implementing utility functions such as the
BACKUP, RESTORE, and DBCC statements and bulk copy.
Q4. What is SNI
Protocol Layer?
SQL Server Network Interface (SNI) is a protocol layer that
establishes the network connection between the client and the server. It
consists of a set of APIs that are used by both the database engine and the SQL
Server Native Client (SNAC). SQL Server has support for the following
protocols:
·
Shared memory
·
TCP/IP
·
Named Pipes
·
VIA — Virtual Interface Adapter
Q5. What are Tabular Data Stream (TDS) Endpoints?
TDS is a Microsoft-proprietary protocol originally designed
by Sybase that is used to interact with a database server. Once a connection
has been made using a network protocol such as TCP/IP, a link is established to
the relevant TDS endpoint that then acts as the communication point between the
client and the server.
Q6. What is a Command Parser?
The Command Parser’s role is to handle T-SQL language
events. It first checks the syntax and returns any errors back to the protocol
layer to send to the client. If the syntax is valid, then the next step is to
generate a query plan or find an existing plan. A query plan contains the
details about how SQL Server is going to execute a piece of code. It is
commonly referred to as an execution plan.
To check for a query plan, the Command Parser generates a
hash of the T-SQL and checks it against the plan cache to determine
whether a suitable plan already exists. The plan cache is an area in the buffer
pool used to cache query plans. If it finds a match, then the plan is read from
cache and passed on to the Query Executor for execution. Otherwise an Execution
plan is created by the optimizer.
Q7.What is an Execution Plan?
An execution plan is composed of primitive operations.
Examples of primitive operations are: reading a table completely, using an
index, performing a nested loop or a hash join. All primitive operations have
an output: their result set. Some, like the nested loop, have one input. Other,
like the hash join, has two inputs. Each input should be connected to the
output of another primitive operation. That’s why an execution plan can be
sketched as a tree: information flows from leaves to the root.
Q8. What is a Plan Cache?
Plan cache is the part of SQL Server’s buffer pool, is used
to store execution plans in case they are needed later when the same type of
scripts are submitted by the users.
Q9. What is the role of an Optimizer?
The Optimizer is one of the important assets of a database
engine. This is the component on which a particular RDBMS stands off. The
primary function of the optimizer is to generate execution plan.
Q10. What is Query Executor?
The Query Executor’s job is self-explanatory; it executes
the query. To be more specific, it executes the query plan by working through
each step it contains and interacting with the Storage Engine to retrieve or
modify data.
Q11. What are Access methods and its roles?
Access Methods is a collection of code that provides the
storage structures for your data and indexes, as well as the interface through
which data is retrieved and modified. It contains all the code to retrieve data
but it doesn’t actually perform the operation itself; it passes the request to
the Buffer Manager.
Suppose our SELECT statement needs to read just a few rows
that are all on a single page. The Access Methods code will ask the Buffer
Manager to retrieve the page so that it can prepare an OLE DB rowset to pass
back to the Relational Engine.
Q12. What is a Buffer Manager?
The buffer management component consists of two mechanisms:
the buffer manager to access and update database pages, and the buffer cache
(also called the buffer pool), to reduce database file I/O.
The Buffer Manager, as its name suggests, manages the buffer
pool, which represents the majority of SQL Server’s memory usage. If you need
to read some rows from a page, the Buffer Manager checks the data cache in the
buffer pool to see if it already has the page cached in memory. If the page is
already cached, then the results are passed back to the Access Methods.
If the page isn’t already in cache, then the Buffer Manager
gets the page from the database on disk, puts it in the data cache, and passes
the results to the Access Methods.
Q13. What is a Buffer pool? What is the importance of
Data cache?
Buffer Pool consist of various type of cache like data
cache, plan cache, log cache etc. Here data cache is the very important part of
buffer pool which is used to store the various types of pages to serve
particular query. Suppose if we run a particular select query on a table to
show all data rows of that table. Then all the data pages of that table will be
required to fulfill the requirement of this query. Here first all data pages
will move from disk to buffer pool. This operation of reading data pages from
disk to memory is known as physical IO. But if we running the same query again
then there is no need to read data pages from disk to buffer pool because all
the data pages are already in buffer pool. This operation is known as Logical
IO.
Q14. What is the Data cache?
The data cache is usually the largest part of the buffer
pool; therefore, it’s the largest memory consumer within SQL Server. It is here
that every data page that is read from disk is written to before being used.
The sys.dm_os_buffer_descriptors DMV contains one row for
every data page currently held in cache. You can use this script to see how
much space each database is using in the data cache:
SELECT count(*)*8/1024 AS 'Cached Size (MB)' ,CASE database_id WHEN 32767 THEN 'ResourceDb' ELSE
db_name(database_id) END AS 'Database'
FROM sys.dm_os_buffer_descriptors
GROUP BY db_name(database_id),database_id ORDER BY 'Cached
Size (MB)' DESC
Q15. What is a Transaction manager and its role?
Transaction Manager interacts with the Access Methods and
has two components thorugh which it works on the transactions.
Lock Manager: It is responsible for providing
concurrency to the data, and it delivers the configured level of isolation by
using locks.
Log Manager: It writes the changes to the transaction
log. Writing to the transaction log is the only part of a data modification
transaction that always needs a physical write to disk because SQL Server
depends on being able to reread that change in the event of system failure
Q16. What is Write Ahead Logging?
At the time a modification is made to a page in the buffer,
a log record is built in the log cache recording the modification. This log
record must be written to disk before the associated
dirty page is flushed from the buffer cache to disk. SQL
Server has logic that prevents a dirty page from being flushed before the
associated log record. Because log records are always written ahead of the
associated data pages, the process is called a write-ahead logging.
Q17. What are dirty pages?
When a page is read from disk into memory it is regarded as
a clean page because it’s exactly the same as its counterpart on the disk.
However, once the page has been modified in memory it is marked as a dirty
page.
A dirty page is simply a page that has changed in memory
since it was loaded from disk and is now different from the on-disk page.
Q18. Which DMV can be used to check how many dirty pages
exists in the memory for each database?
SELECT db_name(database_id) AS 'Database',count(page_id) AS
'Dirty Pages' FROM sys.dm_os_buffer_descriptors WHERE is_modified =1
GROUP BY db_name(database_id)ORDER BY count(page_id) DESC
Q19. How is the dirty page written
to disk?
Dirty pages are written to disk on the following events.
Lazy writing is a process to move pages containing changes
from the buffer onto disk. This clears the buffers for us by other pages.
Checkpoint writes all dirty pages to disk. SQL Server
periodically commits a CHECKPOINT to ensure all dirty pages are flushed to
disk.
Explicitly issuing a CHECKPOINT will force a checkpoint
Examples of events causing a CHECKPOINT
•
net stop mssqlserver
•
SHUTDOWN
•
ALTER DATABASE adding a file
Eager writing – Nonlogged bcp, SELECT INTO,
WRITETEXT,UPDATETEXT,BULK INSERT are examples of non-logged operations. To
speed up the tasks , eager writing manages
page creation and page writing in parallel. The requestor does not need
to wait for all the page creation to occur prior to writing pages
Q20. What is a check point?
A checkpoint is a point in time created by the checkpoint
process at which SQL Server can be sure that any committed transactions have
had all their changes written to disk. This checkpoint then becomes the marker
from which database recovery can start. The checkpoint process ensures that any
dirty pages associated with a committed transaction are flushed to disk.
Q21. What is the frequency of checkpoint in an ideal
scenario?
The Database Engine supports several types of checkpoints:
automatic, indirect, manual, and internal. The following table summarizes the
types of checkpoints.
a. Automatic
Transact-SQL Interface
EXEC sp_configure'recovery interval','seconds'
Description: Issued automatically in the background to meet
the upper time limit suggested by the recovery interval server configuration
option. Automatic checkpoints run to completion. Automatic checkpoints are
throttled based on the number of outstanding writes and whether the Database
Engine detects an increase in write latency above 20 milliseconds.
b. Indirect
Transact-SQL Interface
ALTER DATABASE … SET TARGET_RECOVERY_TIME
=target_recovery_time{ SECONDS | MINUTES }
Description Issued in the background to meet a user-specified
target recovery time for a given database. The default target recovery time is
0, which causes automatic checkpoint heuristics to be used on the database. If
you have used ALTER DATABASE to set TARGET_RECOVERY_TIME to >0, this value
is used, rather than the recovery interval specified for the server instance.
c. Manual
Transact-SQL Interface
CHECKPOINT [ checkpoint_duration ]
Description Issued when you execute a Transact-SQL
CHECKPOINT command. The manual checkpoint occurs in the current database for
your connection. By default, manual checkpoints run to completion. Throttling
works the same way as for automatic checkpoints. Optionally, the
checkpoint_duration parameter specifies a requested amount of time, in seconds,
for the checkpoint to complete.
d. Internal
Transact-SQL Interface
None.
Description Issued by various server operations such as
backup and database-snapshot creation to guarantee that disk images match the
current state of the log.
22. What is
LazyWriter?
Lazywriter also flushes dirty pages to disk. SQL Server
constantly monitors memory usage to assess resource contention (or
availability); It’s job is to make sure that there is a certain amount of free
space available at all times. As part of this process, when it notices any such
resource contention, it triggers LazyWriter to free up some pages in memory by
writing out dirty pages to disk. It employs Least Recently Used (LRU) algorithm
to decide which pages are to be flushed to the disk.
23. What is log flush?
Log Flush also writes pages to disk. The difference here is
that it writes pages from Log Cache into the Transactional log file (LDF). Once
a transaction completes, LogFlush writes those pages (from Log Cache) to LDF
file on disk.
Each and every transaction that results in data page
changes, also incurs some Log Cache changes. At the end of each transaction
(commit), these changes from Log Cache are flushed down to the physical file
(LDF).
24. What is the difference between check point lazy
writer?
Checkpoint
|
Lazy writer
|
Checkpoint is used by sql engine to keep database recovery time in
check
|
Lazy writer is used by SQL engine only to make sure there is enough
memory left in sql buffer pool to accommodate new pages
|
Check point always mark entry in T-log before it executes either sql
engine or manually
|
Lazy writer doesn’t mark any entry in T-log
|
To check occurrence of checkpoint , we can use below queryselect *
from ::fn_dblog(null,null) WHERE [Operation] like ‘%CKPT’
|
To check occurrence of lazy writer we can use performance monitor
|
SQL Server Buffer Manager Lazy writes/sec
|
|
Checkpoint only check if page is dirty or not
|
Lazy writer clears any page from memory when it satisfies all of 3
conditions. 1.Memory is required by any object and available memory is full
2.Cost factor of page is zero
3.Page is not currently reference by any connection
|
Checkpoint is affected by two parameters
1.Checkpoint duration: is how long the checkpoint can run for.
2.Recovery interval: affects how often it runs.
|
Lazy writer is affected by
1.Memory pressure
2.Reference counter of page in memory
|
Check point should not be very low , it can cause increasing recovery
time of database
|
No. of times lazy writer is executing per second should always be low
else it will show memory pressure
|
Checkpoint will run as per defined frequency
|
No memory pressure, no lazy writer
|
Checkpoint tries to write as many pages as fast as possible
|
Lazy writer tries to write as few as necessary
|
checkpoint process does not put the buffer page back on the free list
|
Lazy writer scans the buffer cache and reclaim unused pages and put it
n free list
|
We can find last run entry of checkpoint in Boot page
|
Lazy writer doesn’t update boot page
|
Checkpoint can be executed by user manually or by SQL engine
|
Lazy writer cant be controlled by user
|
It keeps no. of dirty pages in memory to minimum
|
It helps to reduce paging
|
Auto frequency can be controlled using recovery interval in
sp_configure
|
Works only @ memory pressure , It uses clock algorithm for cleaning
buffer cache
|
It will be automatically executed before every sql statement which
requires consistent view of database to perform task like (Alter, backup,
checkdb, snapshot …..)
|
It kicks pages out of memory when reference counter of page reaches to
zero
|
Command : Checkpoint
|
No command available
|
It comes in picture to find min lsn whenever t-log truncates
|
No entry in T-log
|
Checkpoint is affected by Database recovery model
|
Lazy writer doesn’t get impacted with recovery model of database
|
To get checkpoint entry in error log DBCC TRACEON(3502, -1)
|
Not Applied
|
Members of the SYSADMIN, DB_OWNER and DB_BACKUPOPERATOR can execute
checkpoint manually
|
Not Applied
|
25. What are ghost records in SQL server?
When a record is deleted from a clustered index data page or
non-clustered index leaf page or a versioned heap page or a forwarded record is
recalled, the record is logically removed by marking them as deleted but not
physically removed from the page immediately. Pages which are marked as deleted
but actually not deleted physically are called Ghost Records.
26. Which process removes the records which are marked as
Ghost Records?
Ghostcleanuptask: SQL Server Ghostcleanuptask thread
physically removes the records which are marked as deleted.
27. How Ghost cleanup task works?
Ghostcleanuptask thread wakes up every 10 seconds.
Sweepdatabases one by one starting from master.
Skip the database if it is not able to take ashared lock for
database (LCK_M_S) or database is not in Open read/write state.
Scans the PFS pages of the current database to get the pages
which has ghost records.
PFS Page:A PFS page occurs once in 8088 pages. SSQL Server
will attempt to place a PFS page on the first page of every PFS
interval(8088Pages). The only time a PFS page is not the first page in its
interval is in the first interval for a file.
In this case, the file header page is first, and the PFS
page is second. (Page ID starts from 0 so the first PFS page is at Page ID 1)
Remove the records which are marked as deleted (ghosted)
physically
28. What is the different protocol supported by SQL
server, explain each of these?
Shared memory — Simple and fast, shared memory is the
default protocol used to connect from a client running on the same computer as
SQL Server. It can only be used locally, has no configurable properties, and is
always tried first when connecting from the local machine.
TCP/IP — This is the most commonly used access
protocol for SQL Server. It enables you to connect to SQL Server by specifying
an IP address and a port number. Typically, this happens automatically when you
specify an instance to connect to. Your internal name resolution system
resolves the hostname part of the instance name to an IP address, and either
you connect to the default TCP port number 1433 for default instances or the
SQL Browser service will find the right port for a named instance using UDP port
1434.
Named Pipes — TCP/IP and Named Pipes are comparable
protocols in the architectures in which they can be used. Named Pipes was
developed for local area networks (LANs) but it can be inefficient across
slower networks such as wide area networks (WANs).
VIA — Virtual Interface Adapter is a protocol that
enables high-performance communications between two systems. It requires
specialized hardware at both ends and a dedicated connection.
29. What is HOT ADD CPU term in SQL server?
‘Hot ADD’ means being able to plug in a CPU while the
machine is running and then reconfigure SQL Server to make use of the CPU
ONLINE! (i.e. no application downtime required at all)
There are a few restrictions:
Need a 64-bit system that support hot-add CPU (obviously :-))
Need Enterprise Edition of SQL Server 2008
Need Windows Server Datacenter or Enterprise Edition
30. What is MaxDOP term in SQL server?
When SQL Server runs on a computer with more than one
processor or CPU, it detects the best degree of parallelism that is the number
of processors employed to run a single statement, for each query that has a
parallel execution plan. You can use the max degree of parallelism option to
limit the number of processors to use for parallel plan execution and to
prevent run-away queries from impacting SQL Server performance by using all
available CPUs.
Q. What is a Batch, Task, Windows Thread, Fiber, Worker
Thread in SQL Server OS architecture?
Batch
An SQL batch is a set of one or more Transact-SQL statements
sent from a client to an instance of SQL Server for execution. It represents a
unit of work submitted to the Database Engine by users.
Task
A task represents a unit of work that is scheduled by SQL
server. A batch can map to one or more tasks. For example, a parallel query
will be executed by multiple tasks.
Windows Thread
A windows thread represents an independent execution
mechanism.
Fiber
A fiber is lightweight thread that queries fewer resources
than a windows thread and can switch context when in user mode. One Windows
thread can be mapped to many fibers.
Worker Thread
The worker thread represents a logical thread in SQL Server
that is internally mapped (1:1) to either a windows thread or, if lightweight
pooling is turned ON, to a fiber. The mapping is maintained until worker thread
is deallocated either because of memory pressure, or if it has been idle for
long time. The association task to a worker thread is maintained for the life
of the task.
No comments:
Post a Comment