Data File and Transaction Log File Architecture
1) Explain the SQL Server Database Data File structure?
SQL server divides the data file into 8 KB pages and page is
the smallest unit of any IO operation. SQL server refer the first page in the
data file as page number 0.In all the data file first 9 pages (till page number 8) are in same order
as shown below. In the primary data file, the 10th page (Page number 9) will be
the boot page which stores the metadata about the database.
2) What is a Data Page?
Data rows with all data, except text, ntext, image,
nvarchar(max), varchar(max), varbinary(max), and xml data.
3) What is an Index Page?
Contains information related to Indexes. This holds index
records in the upper levels of a clustered index and all levels of
non-clustered indexes.
4) What is a Page Header?
Page number 0- the file header (page type 15). It holds the
file header information. There will be only one header page for each data file
and that reside in the 0th position. The header page store the file level
information like file size, minimum size, max size and file growth.
5) What is PFS?
PFS page is the second page (Page Number 1) in the data file
followed by file header (Page Number 0). PFS pages are used to track page level
allocation. PFS page repeat after every 8088 pages
6) What is a GAM Page?
GAM pages records what extents have been allocated for any
use. GAM has bit for every extent. If the bit is 1, the corresponding extent is
free, if the bit is 0, the corresponding extent is in use as uniform or mixed
extent.
7) How many GAM pages will be there in a 7GB data file?
A GAM page can hold information of around 64000 extents.
That is, a GAM page can hold information of (64000X8X8)/1024 = 4000 MB
approximately. In short, a data file of size 7 GB will have two GAM pages.
8) What is SGAM page?
SGAM (Shared Global Allocation Map) page (Page type 8). SGAM
pages record what extents are currently being used as mixed extent and also
have at least one unused page. SGAM has bit for every extent. If the bit is 1,
the corresponding extent is used as a mixed extent and has at least one page
free to allocate. If the bit is 0, the extent is either not used as a mixed
extent or it is mixed extent and with all its pages being used.
9) How many SGAM pages will be there in a 7GB data file?
A SGAM page can hold information of 64000 extents. That is,
a SGAM page can hold information of (64000X8X8)/1024 = 4000 MB. In short, a
data file of size 7 GB will have two SGAM page.
10) What is a DCM and its significance?
SQL Server uses DCM pages to track extent modified after the
last full backup operation. DCM pages track the extents modified after the full
backup. DCM has a bit for every extent it tracks. If the bit is 1, the
corresponding extent is modified after the last full backup, if the bit is 0,
there is no change in the corresponding extent after the last full backup
11) How SQL Server engine will decide which extents need
to be added into a differential backup?
A differential backup process scans through DCP page to
identify the extents which are modified after the last full backup and add
those extents in the Differential backup.
12) What is a BCM and its significance?
This tracks the extents that have been modified by bulk
logged operations since the last BACKUP LOG statement. If the bit for an extent
is 1, the extent has been modified by a bulk logged operation after the last
BACKUP LOG statement. If the bit is 0, the extent has not been modified by bulk
logged operations.
Although BCM pages appear in all databases, they are only
relevant when the database is using the bulk-logged recovery model. In this
recovery model, when a BACKUP LOG is performed, the backup process scans the
BCMs for extents that have been modified. It then includes those extents in the
log backup.
13) What is Boot Page?
There is one special data page that exists only once per
database. It is the database boot page. The database boot page is always stored
in page 9 of file 1, the first file in the primary file group.
The database boot page contains a broad variety of data
about the database itself. Here you can find the internal database version, the
compatibility level, the database name and the date of the last log backup.
14) How many types of Allocation units are there?
There are three types of allocation units.
IN_ROW_DATA (default): If a table is relatively simple in
desin (meaning record size is less than 8000 and no LOB data types are
present), all records are stored in a pages refered as IN_ROW_DATA pages.
ROW_OVERFLOW_DATA: assume that a table is created with
record size 12000 bytes having 4 varchar data types of size 4000 bytes.
Whenever user inserts a record with size greater than 8000 (page size is 8K),
then the excess data is moved to ROW_OVERFLOW_DATA pages. In simple terms,
ROW_OVERFLOW_DATA pages will come in to
picture only when the row size exceed page maximum limit.
LOB_DATA: LOB data like text not stored along with data
pages. LOB data is stored in pecial pages called LOB_DATA pages. 16 byte
pointer in data page will be used to refer LOB_DATA page.
15) What is an IAM page?
IAM stands for Index Allocation Map: To catalog all the
pages that belong to a single allocation unit, SQL Server utilizes a special
type of pages, the Index Allocation Map or IAM pages. Each allocation unit has
at least one IAM Page; depending on the size of the table there might be many.
The page type of an IAM page is 10.
16) What is a ROW OFFSET and its importance?
Page header stored system information about the page. After
the page header data row is stored serially. Row offset information is stored
at the end of the page, after the data rows. Every data row has a row offset.
Row offset stores information about how far the row is from the start of the
page.
17) What are LOB pages?
SQL Server stores all data in 8192-byte sized blocks called
pages. Several types of pages are in use within a typical database. One
particularly interesting group is formed by the type-3 pages or Large Object
Pages.
LOB_DATA Allocation Units
Most data types in SQL Server take up no more than 8000
bytes of storage. However, there are a few data types, which allow for larger
pieces of information to be stored. Examples include the VARCHAR(MAX),
VARBINARY(MAX) or XML data types.
Normal data pages that belong to a table are grouped in
IN_ROW_DATA allocation units. However, if a value that is larger than 8000
bytes needs to be stored, SQL Server does not attempt to store it in those data
pages anymore. It does not even store those values in the same allocation unit.
Instead, Large Object data or LOB data is stored in special LOB_DATA allocation
units.
18) What is a PAGE SPLIT?
Page splits are performed when there is no space in a data
page to accommodate more inserts or updates. In a page spilt, SQL server
removes some data from the current data page and moves it to another data page.
This is done in order to make space for the new rows.
19) How Page Splits impacts the database performance?
Too many page splits can decrease the performance of the SQL
server because of the large number of I/O operations.
Following remedies can be taken to avoid too many page
splits:-
·
Rebuild indexes frequently to empty the fill
factor space for more data.
·
Increased the Fill factor value after observing
the page split behavior
20) How can we monitor or detect Page Splits?
Number of page splits can be observed by using the
Performance Monitor and watch the SQLServer:Access Methods:Page Splits/sec
counter
21) What is a Fill Factor?
The fill-factor option is provided for fine-tuning index
data storage and performance. When an index is created or rebuilt, the
fill-factor value determines the percentage of space on each leaf-level page to
be filled with data, reserving the remainder on each page as free space for
future growth. For example, specifying a fill-factor value of 80 means that 20
percent of each leaf-level page will be left empty, providing space for index
expansion as data is added to the underlying table. The empty space is reserved
between the index rows rather than at the end of the index.
22) What is a Logical Read?
Logical read indicates total number of data pages that are
found in the data cache and read from Data cache without reading anything from
Disk.
23) What is a Physical read?
Physical read indicates total number of data pages that are
read from disk. In case no data in data cache, the physical read will be equal
to number of logical read. And usually it happens for first query request. And
for subsequent same query request the number will be substantially decreased
because the data pages have been in data cache.
24) How can we view the content of a page?
To see the row data stored in that page , we have to use the
DBCC PAGE command. The syntax of DBCC PAGE
dbcc page ( {'dbname' | dbid}, filenum, pagenum [,
printopt={0|1|2|3} ]);
Print opt:
·
0 – print just the page header
·
1 – page header plus per-row hex dumps and a
dump of the page slot array
·
2 – page header plus whole page hex dump
·
3 – page header plus detailed per-row
interpretation
DBCC TRACEON(3604) GO
DBCC
page('test',1,114,3)
25) What is a
Transaction log file?
A transaction log is a physical file in which SQL server
stores the details of all transactions and data modifications performed on the
database. In the event of disaster, that
causes SQL server to shutdown unexpectedly (Power failure/hardware failure),
the transaction log is used to bring the database in a consistent state while
restarting the server. On restarting the server, database goes through the
recovery process.
During this recovery process, the transaction log is used to
make sure that all committed transactions are written to respective data file
(rolled forward) and rollback the uncommitted transaction.
Logically transaction log is a set of log records. Each
record is identified by a log sequence number (LSN). The new log records are
always written at the logical end of log file with a LSN which is greater than
the previous one
26) What are Virtual Log files? How these are created?
SQL Server Database Engine divides each physical log file
internally into a number of virtual log files. Virtual log files have no fixed
size, and there is no fixed number of virtual log files for a physical log
file. The Database Engine chooses the size of the virtual log files dynamically
while it is creating or extending log files. The Database Engine tries to
maintain a small number of virtual files. The size of the virtual files after a
log file has been extended is the sum of the size of the existing log and the
size of the new file increment. The size or number of virtual log files cannot
be configured or set by administrators.
27) How VLFs are created SQL Server in a Transaction log
file?
SQL server decides the size and number of VLF in a log file
based on the size of the log file growth as given below.
Growth upto 64 MB
= 4 VLF
From 64 MB to 1 GB
= 8 VLF
Larger than 1 GB
= 16 VLF
28) What is the way to check Virtual Log file details? DBCC loginfo(‘mydb’)
29) What is Log Sequence No?
Every record in the SQL Server transaction log is uniquely
identified by a log sequence number (LSN). LSNs are ordered such that if LSN2
is greater than LSN1, the change described by the log record referred to by
LSN2 occurred after the change described by the log record LSN.
30) What is a Log Truncation?
Log truncation is essential to keep the log from filling.
Log truncation deletes inactive virtual log files from the logical transaction
log of a SQL Server database, freeing space in the logical log for reuse by the
physical transaction log. If a transaction log were never truncated, it would
eventually fill all the disk space that is allocated to its physical log files.
However, before the log can be truncated, a checkpoint operation must occur. A
checkpoint writes the current in-memory modified pages (known as dirty pages)
and transaction log information from memory to disk. When the checkpoint is
performed, the inactive portion of the transaction log is marked as reusable.
Thereafter, the inactive portion can be freed by log truncation.
31) How Does Log Truncation occurs?
Log truncation occurs automatically after the following
events, except when delayed for some reason:
·
Under the simple recovery model, after a checkpoint.
·
Under the full recovery model or bulk-logged
recovery model, after a log backup, if a checkpoint has occurred since the
previous backup.
No comments:
Post a Comment