- Notes - |+| FYI on this Guide. (') Single Quotes should only be used for 'Values', like in WHERE , VALUES, or SET statements Remove (') if dealing with Number Datatypes |+| ({)open, (})close Squiggly Bracets are references to {CustomNames} for Tables, Stored Procedures, etc. This is only for ease of reading, and SHOULD NOT be used in SQL, as it will break Syntax |+| -- Double Dashes are used for --Comments, just like SQL. Use and <3 them. I recommend reading this Guide in ConTEXT programming-friendly Editor. I use the Foxpro markup for color-coated reading Bliss - Data models: - Flat File: - Simple Data model - Hierarchial: - 1st Complex model - used on Mainframes, IE IBM's IMS - Relational - Mainsteam Data Model - Object Orientated - Use of Inheritance, etc. Applied to Hierarchial or Relational data model - High-End DB Mang. Sys. must offer: - Client-Server toplogy - Transaction processing - Large DB capasity (TB+) - Flat File DBs - Business rules must be enforced outside DB (IE Application) - Metadata = Data about Data - Schema is complete documentation of DB Metadata - Hierarchial DBs - Cons: - Must know entire hierarchy to access correctly - Can run poorly due to Pointer useage - Can be Difficult to maintain and update data for - Relational DBs - Tables called 'relations' - Term taken from Mathematical set theory - Term implies Table Relations, not relations between tables - Tables intended to house discrete entities in App System (IE have data nomalized) - Formal or Informal links between columns - ANSI-99: Compliance command standard. - In use for SQL 2000+ products from MS. - Also in use by most Relational DB Vendors - Client / Server Relationships: - 2-tier: - Processing and Data Logic done on server, while Client handles presenting data IE formatting. - 3-tier - Data Management done @ DB Server, Logic and formatting done on App Server, and there is also a Client Interface - N-tier: - Similar to 3-tier, but with the capability for multiple DB Servers or App Servers to distribute loads, before delivering to Client Interface. - Open-Source DBMS: - MySQL - PostgreSQL - High-end DBMS - MS SQL Sever - Oracle - IBM DB2 - Procedural Extensions: - T-SQL for MS SQL - PL/SQL for Oracle - Intended to eliminate use of proprietary extensions (IE xPlatform compliancy) - Structure Query Language (SQL) is Non-Procedural - Core SQL is standardized, Procedural Extensions are the Proprietary part - 3 Categories: - DDL (Data Definition Language) CREATE, DROP, ALTER - DCL (Data Control Language) GRANT, DENY, REVOKE - DML (Data Manipulation Language) SELECT, INSERT, UPDATE, DELETE DB Lifecycle - DB Design on Paper - App Dev uses Paper guide for DB Objects via DDL - App Development - After objects made, Add, Modfule and Delete processes made via DML - Administration - DB Security done @ Design time, and controlled via DCL - Result sets only reside in memory. - IE why making a Stored Proc or View helps for frequent joins, etc. - Surrogate keys - columns added to a table to uniquely identify individual rows via integer or other, simpler value. - Data APIs - Application programming interfaces - ODBC (Open Database Connectivity) - JDBC (Java Open Database Connectivity) - ADO (ActiveX Data Objects) - Network Protocols: - TCP/IP - IPX/SPX - Named Pipes - SQL99 Datatypes (Samples below) - CHARACTER(n) - (n) = length / Allowed characters |*| Common Datatype - VARCHAR(n) - NUMERIC(w,x) - w = digits to left of decimal, x = digits to right of decimal - DECIMAL(w,x) - INTEGER - Whole Number |*| Common Datatype - SMALLINT - FLOAT(x) - DATE - Y,M,D |*| Common Datatype - TIME - H,M,S - TIMESTAMP - Y,M,D,H,M,S - BINARY or BLOB - Binary large object for Non-character data (Pics, Audio, etc) - BOOLEAN - TRUE, FALSE - UNICODE - Stores data w/ 2 bytes - May allow NULL on most, if not all datatypes - DBMS vary in handling of NULL in some functions like SUM, MIN, AVG - To Auto Increment in MSSQL, Identity field handles - Can set seed (start value) and increment value - To Auto Incrment in Oracle, you must use Sequences, that must be set independent of a table - Composite or Multi-valued Dependency - 2 or more columns needed to make uniqueness for each record - Relationships - 1:M (One to Many) are most common types - 1:1 (One to One) Can be used to save space, especially for multiple entities in same table. Index links to details in alt table - M:N (Many to Many) Work in Logical Data Models, but must be implemented as 2 (1:M) relationships - Datatypes control type of data for Columns - E/R Diagram: Shows entities in DB and relationships they between entities - Metadata DB locations: - Master - Info about DB Server instance and all DBs on Server - Model - Template DB for any new Database made on server - TempDB - Holds Temp Tables - MSDB - Holds info on SQL Server Agent service - Extensions - .mdf - Primary Data File - .ndf - Secondary Data File - .ldf - Transaction log file - SQL Data Tool names and platforms - Query Analyzer - MS SQL - MS Query - MS Access - SQL Plus - Oracle - Data Blade - Informix - Sample T-SQL to make a New DB CREATE DATABASE Yar ON PRIMARY (NAME=Yar_Data, FILENAME='C:\DB-DumpPath\Yar.mdf', SIZE=10 mb, MAXSIZE=70 mb, FILEGROWTH=5 mb) LOG ON (NAME=Yar_log, FILENAME='C:\DB-DumpPath\Yar.ldf', SIZE=15 mb, MAXSIZE=10 mb, FILEGROWTH=2 mb) - Sample SQL for Table Create, with reference to *-* User-defined Datatype CREATE TABLE tblName (Table_ID int NOT NULL, ItemName varchar(25) NOT NULL, ItemPrice decimal(5,2) NULL, ItemStock CustomDatatype_Name) - Clsutered index - Orders the physical data - No Add'l storace space required - Max 1 Custered Index per Table - Non-clustered Index - User defined index that uses pointers - Uses Disk Space for indexes - Less effient then Clustered indexes (More I/O) - Sample build index code: CREATE ['Index Type'] [UNIQUE] [CLUSTERED] [NONCLUSTERED] INDEX index_name ON (desired_column [ASC |or| DESC]) - Constraints - Column restrictions: useful to ensure proper data in record - More efficient than Triggers, rules and default data - Type of Constraints - Setting a Primary or foreign key - Setting Default Values for a column - Validating input length - Requiring Unique Values - Forcing NOT NULL on Column - Sample Constraint - Set PKey: ALTER TABLE {tblName} ADD CONTRAINT {TableName_PKeyConst} PRIMARY KEY {ColumnName} - Set FKey: ALTER TABLE {tblName} ADD CONTRAINT {tblName_FKeyConst} FOREIGN KEY {ColumnName_inFKTable} REFERENCES {PK_TblName(PKColumn)} - Unique Constraints force Unique values - One NULL Allowed, default index is Non-Clustered - Set Unique constraint ALTER TABLE tblName ADD CONSTRAINT {unique_number} UNIQUE (fieldName) - Default Value - Default is bypassed by entering data - Set Default value ALTER TABLE tablename ADD CONSTRAINT const_name DEFAULT 'value' FOR column - Check Constraint - Validates inserted value matches validation criteria (IE No letters in Zip Code) - Sample Check Constraint to Validate 5 Digit # Only ALTER TABLE {tblName} ADD CONSTRAINT {field_constraintName} CHECK ({columnName} LIKE '[0-9][0-9][0-9][0-9][0-9]') - Constraints can be Overidden as well - To Override: ALTER TABLE {tblName} NOCHECK CONSTRAINT {field_constraintName} - To Remove Override: ALTER TABLE {tblName} CHECK CONSTRAINT {field_constraintName} - Rules work similar to Constraints, but work with user-defined Datatypes, instead of Columns - How to make a Rule - Make user-defined Datatype EXEC sp_addtype {Custom_DatatypeName}, {datatype} --IE:smallint - Create Rule CREATE RULE {CDTypeName_ruleDescription} AS @value>=0 and @value<=1000 - Bind Rule to Datatype EXEC sp_bindrule {CDTypeName_ruleDescription}, {CustomDatatypeName} - Any time the user-defined Datatype is used, Rule will be applied. - Views - Tables only sorted in Memory - Supported statements: SELECT, UPDATE, INSERT, JOIN, DELETE - Views inherit constraints from underlining source tables - Good for limiting returned Data and optimize Reporting - Some RDBMS allow Views to be Indexed - Sample View: CREATE VIEW {NameOfView} AS SELECT {Field1_Name}, {Field2_Info}, {Field3_Etc} FROM {SourceTable} - Security Authentication layers - Auth @ Server level - Auth @ DB level - Auth @ Object level - Auth @ Statement level - Access is defined Explicitly (directly) or Implicitly (inherited) by group membership - User can be limited what DBs to Access - Statements can be restricted - IE: CREATE, ALTER, DROP - Object Restrictions can also be Applied - IE: DELETE, EXECUTE, UPDATE, SELECT, INSERT - Sample Syntax for manging Permissions - Assign Statement Permission to User/Group GRANT CREATE DATABASE, CREATE TABLE TO Mark, Scott, NameOfGroup - Assign Object permissions to User/Group GRANT INSERT, UPDATE, DELETE ON {tblName} TO Mark, Scott, NameOfGroup - Deny Object permissions to User/Group DENY INSERT, UPDATE, DELETE ON {tblName} TO Mark, Scott, NameOfGroup - Revoking Permissions - If you already gave users Access to Object permissions via Group membership, the following will remove the Group membership, while the Group access will still work. REVOKE UPDATE ON {tblName} FROM Mark, Scott - ALTER command - to Add a column and allow NULL ALTER TABLE {tblName} ADD {newColumnName} datatype NULL --IE-Datatype:tinyint - to Delete an existing column ALTER TABLE {tblName} DROP COLUMN {ColumnName} - to Delete an existing Constraint ALTER TABLE {tblName} DROP CONSTRAINT {Constraint_Name} - DROP Command - Remove a Database DROP DATABASE {DBName} - Remove a Table DROP TABLE {tblName} - Remove an Index - Cannot remove index on PKey or UNIQUE constraint DROP INDEX {tblName.IndexNameforColumn} - Remove a View DROP VIEW {ViewName} - Backup Info - Make backups in corespondence to acceptable downtime (Bi-Daily, Weekly, etc) - When DB Structure changes, make a backup of Master and and System databases as well, since they hold critical data on User DBs - Frequently backup OLTP (real-time updated) DBs to retain Data with minimal Data loss. - Large DBs may backup more efficiently when set in Multiple Filegroups - INSERT (DML) - sample Simple Insert - If ColumnNames are not specified on INSERT line, defaults to fill columns in order INSERT {tblName} VALUES ('columnID_Data', 'nameOfItemColumn_Data', 'detailsColumn_Data', 'etc_Column_Data') - sample Target(ed) Insert INSERT {tblName} (columnID, nameOfItemColumn, etcColumn) VALUES ('columnID_Data', 'nameOfItemColumn_Data', 'etc_column_Data') - sample Insert from an existing table - To Copy Data to a new Table, you must 1st make the destination Table, with a similar column list (same order and datatypes) INSERT {tblName_BackupDestination} SELECT * (or defined, columns) FROM {tblName_SourceData} - SELECT (DML) - simple SELECT SELECT * FROM {tblName} - selecting specific Columns instead of all can speed up result set - custom Select SELECT columnID, nameOfItemColumn, etcColumn FROM {tblName} - Select can be used to take All or (filtered) column data into a new Table - User must have CREATE TABLE permission SELECT * INTO {tblName_BackupDestination} FROM {tblName_Source} - UPDATE (DML) - WHERE must be used, unless you want to set ALL data the same, for a specified column - sample Update with Where condition (not required if mass update) UPDATE tblName SET columnTargetName = 'dataValue' WHERE columnTargetName IS NULL - IS NULL can be replaced with Operators to 'criteriaValue' - or WHERE line meets query Criteria, --IE you want to set |WHERE 'columnTargetName' = 380|, because all entries for 380 were incorrect and need to be set = 'dataValue' - DELETE (DML) - If no WHERE clause, All rows in indicated table, will be deleted - This command does NOT Delete a table - sample Delete DELETE tblName WHERE columnName = 'criteriaValue' - UPDATE and DELETE Verification - Using a Select statement to be run with the Delete statement, will give you 2 Recordsets to Visually verify script was a Success. - sample Update with Select verification recordset SELECT * FROM tblName WHERE columnName = 'criteriaValue' UPDATE tblName SET columnName = 'dataValue' WHERE columnName = 'criteriaValue' - WHERE Clause - Optional clause of DML statements - Used for Filtering based off specified values - Datatypes and Filtering - String: All values must be surrounded by a 'single quote' - Numeric No Single Quotes used - Dates Varies by DB vendor - Operators - > Greater Than - < Less Than - <> Not Equal to - = Equal to - ! Negates Operators - can also use IS NULL or IS NOT NULL for Where filer - Operators can be used on Letters and Numbers - AND / OR usage - used alone or together to filter result set - used in the Where clause - compound via (parentheses) to group filter conditions SELECT * FROM {tblName} WHERE ({colName} = 'valueFilter1' OR {colName} = 'ValueFilter2') AND {colName_other} = 'value' - AND for extra criteria required - OR for either condition to be met - IN / BETWEEN usage - IN can be used to cover Multiple criteria, without needing OR clauses - sample of above Where clause, using IN versus OR SELECT * FROM {tblName} WHERE {colName} IN ('valueFilter1', 'ValueFilter2', 'ValueFilter3') - BETWEEN used to make an Inclsuive Range SELECT * FROM {tblName} WHERE {colName} BETWEEN 10 AND 20 - DISTINCT - used to eliminate (not display) duplicate values SELECT DISTINCT {colName} FROM {tblName} - LIKE - Matches Patterns. Used off WHERE clause - _ (Underscore) used to repesent 1 unknown string character - % (Percent) used to represent any number of unknown characters in string - [] (Square Brackets) used to find any Single character SELECT * FROM {tblName} WHERE {colName} LIKE '[IDKFA]%' - This Query will find ALL records starting with any of the letters in the brackets - NOT - Used to return records opposite of indicated condition - Same syntax as LIKE, can also be used as NOT LIKE SELECT * FROM {tblName} WHERE {colName} NOT LIKE '[IDKFA]' - This Query will find ALL records that DO NOT start with any of the letters in the brackets - Can also use ^ to indicate NOT in a LIKE Statement SELECT * FROM {tblName} WHERE {colName} LIKE '[^IDKFA]' - You can also define a Range in Brackets by using - (Minus) in Brackets SELECT * FROM {tblName} WHERE {colName} LIKE '[^0-9]' - Will not show records with Numbers as first character - Distributed Queries - Use Linked Servers for efficiency - 2 Types of Linked Server Queries: - Ad hoc - Does not take up DB space - Closes connection when Query done - uses OpenRowSet syntax to create temp table SELECT * FROM OPENROWSET('SQLOLEDB', '{DBServer}'; '{SQL_user}'; '{password}', --'sa' = typical SQL_user 'SELECT * FROM {DBName}.dbo.{tblOrObjectName}') - Permanent - Maintains connection for reuse - Uses standard Select statements once connection is made - Need to map logins to work on each server - need to create Linked Server in DBMS - defaul Owner is dbo - Require fully qualified path to tables - MS SQL Server: {SQL_LinkedServer}.{DatabaseName}.{Owner}.{tblOrObjectName} - Oracle: {OracleLinkedServer}..{Owner}.{tbl_Reference} - a Second . (dot) can be added to use defauls when building connection string - to add a linked server, sample below EXEC sp_addlinkedserver @server='{ServerName} -- \{instanceName} if dealing w/ SQL Express @provider='Microsoft.Jet.OLEDB.4.0' @srvproduct='OL DB Provider for Jet' @datasrc='\{nameOfDB_file.mdb}' - DB Design Phases - Logical Design should happen first and is multi-phased - Storage should be defined first - Have narrative of Logical Design meetings with App Devs and all involved teams - Underline Nouns from meeting make lists of Data Elements and possible Tables - Circle Verbs to make lists of Processes or Data Connections potential (relations) - Make Flow diagrams to aid with Data Processing - Once done mapping Data, flow and relations, obtain Confirmation functionality is meeting Objectives. - Logical Design Model creation - Defines core DB Logic - This is not yet bound to any specific DB vendor or app - Final phase of LD is Entity-Relationship (ER) Diagram, before moving to Physical DB creation - Physical DB creation - Defines Datatypes, file size, entity relationships - DB Vendor, Server design and Applications to be integrated come into effect. - Additonal changes to Logical Model may apply @ this point, due to Performance, Pshysical Constraints or Integrity enforcement - Make Statement of Scope to document Objective - Check current system to get ideas on Data Flow, also get user feedback on processes performed - Keep Logical Design centered around Scope of Project, not implementation details (just yet) - Define Relationships in DB - Also define entity requirements (Table dependency of Data) - Create DB Diagram (Logical Model) - Make visual diagram of entity relationships - Common Apps to do so: - ERWin, Visio for Enterprise Architechts, ER/Studio - Define Data Constraints - list Column Datatypes, Primary and Foreign Keys, Default values, checks, validation Rules and Indexes - Implement Design (Physical Model) - Define Hardware allocation, install DB Software - Use DDL to create Physical DB (Make Tables, Procedures, etc) - Make Views and Reports (Physical Model) - Define temp tables for larger datasets and/or reports - Logical Model Specifications - Entities - Persons, Places or Things you will be storing Data on - Relations - Similar Entities, each of which has a list of Attributes - Attributes - Info to be stored on Entities - Relationships - Associations between entities, like each person has an order (different tables and things) - Logical Model Elements - Logical |to| Physical Relations below - Relation | Table - Entitiy | Row - Attribute | Column - Relationship | Handles via PKeys and FKeys between tables -- Continue with Normalization Breakdown