Master SQL interviews with 50+ questions on queries, joins, optimization, and database design patterns.
10 Questions
~30 min read
INNER JOIN returns only rows that have a match in both tables. LEFT JOIN returns all rows from the left table plus matched rows from the right, with NULLs in the right-side columns where there is no match. RIGHT JOIN is the mirror image: all rows from the right table plus matched rows from the left. FULL OUTER JOIN returns all rows from both tables, with NULLs filling whichever side has no match. Choose based on which table's rows must be preserved when no match exists.
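To make the difference concrete, here is a minimal sketch that runs an INNER JOIN and a LEFT JOIN against an in-memory SQLite database through Python's `sqlite3` module. The `employees` and `departments` tables and their rows are hypothetical, invented for illustration. RIGHT and FULL OUTER JOIN are left out because older SQLite builds do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample schema: Bob has no department.
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
    CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO departments VALUES (1, 'Engineering'), (2, 'Sales');
    INSERT INTO employees VALUES (1, 'Ada', 1), (2, 'Bob', NULL);
""")

# INNER JOIN: only employees with a matching department.
inner_rows = conn.execute("""
    SELECT e.name, d.name
    FROM employees e
    INNER JOIN departments d ON d.id = e.dept_id
    ORDER BY e.id
""").fetchall()

# LEFT JOIN: every employee; department is NULL (None) when unmatched.
left_rows = conn.execute("""
    SELECT e.name, d.name
    FROM employees e
    LEFT JOIN departments d ON d.id = e.dept_id
    ORDER BY e.id
""").fetchall()

print(inner_rows)  # [('Ada', 'Engineering')]
print(left_rows)   # [('Ada', 'Engineering'), ('Bob', None)]
```

Bob disappears from the INNER JOIN result but is preserved by the LEFT JOIN, which is usually the deciding factor when picking between the two.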
WHERE filters rows before grouping (works on individual rows). HAVING filters groups after GROUP BY (works on aggregated results). Use WHERE for row-level conditions, HAVING for conditions on aggregates like COUNT, SUM, AVG. Example: WHERE salary > 50000 (row filter) vs HAVING COUNT(*) > 5 (group filter).
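The two filters can appear in the same query. The sketch below assumes a hypothetical `employees` table in an in-memory SQLite database (via Python's `sqlite3`) and counts, per department, the employees earning more than 50000, keeping only departments with more than one such employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data.
    CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Ada', 'eng', 90000), ('Bob', 'eng', 60000), ('Cy', 'eng', 40000),
        ('Dee', 'ops', 70000), ('Eli', 'ops', 45000);
""")

rows = conn.execute("""
    SELECT dept, COUNT(*) AS well_paid
    FROM employees
    WHERE salary > 50000      -- row filter, applied before grouping
    GROUP BY dept
    HAVING COUNT(*) > 1       -- group filter, applied after aggregation
    ORDER BY dept
""").fetchall()

print(rows)  # [('eng', 2)]
```

The `ops` department has one qualifying row after the WHERE filter, so HAVING drops the whole group.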
Indexes are auxiliary data structures that speed up retrieval by letting the database locate rows without scanning the whole table. Index columns used in WHERE clauses, JOIN conditions, and ORDER BY, favoring columns with high selectivity (many distinct values). Avoid over-indexing: every index slows writes and consumes storage. Common types: B-tree (the default in most databases), Hash, and, in PostgreSQL, GiST and GIN.
Steps: (1) Use EXPLAIN/EXPLAIN ANALYZE to understand the query plan, (2) Add appropriate indexes, (3) Avoid SELECT * - select only the columns you need, (4) Optimize JOINs and subqueries, (5) Use LIMIT when only part of a large result is needed, (6) Consider rewriting the query, (7) Check for missing or stale statistics, (8) Consider partitioning large tables, (9) Use connection pooling to cut per-connection overhead.
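Steps (1) and (2) can be sketched with SQLite's `EXPLAIN QUERY PLAN`, which is that engine's counterpart to EXPLAIN in other databases. The `orders` table, the index name `idx_orders_customer`, and the `plan` helper are hypothetical names chosen for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

query = "SELECT id, total FROM orders WHERE customer_id = ?"

def plan(conn, sql, params):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # detail text is the last column

# Step 1: inspect the plan with no index on customer_id.
plan_before = plan(conn, query, (42,))

# Step 2: add an index on the filtered column and inspect again.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(conn, query, (42,))

print(plan_before)
print(plan_after)
```

The exact wording of the plan varies between SQLite versions, but the first plan should report a scan of `orders` and the second a search using `idx_orders_customer`. Other databases expose the same idea with different syntax and output.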
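A clustered index determines the physical order of rows in the table, so there can be only one per table (in SQL Server and MySQL InnoDB it is the primary key by default). Non-clustered indexes are separate structures that point to the data rows, and a table can have many. Clustered indexes are fast for range queries because adjacent key values are stored together; a non-clustered lookup usually needs an extra step to fetch the row unless the index covers every column the query needs. Choose the clustered index based on the most common access patterns.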
ACID ensures reliable transactions: Atomicity (all or nothing - transaction fully completes or fully rolls back), Consistency (database moves from one valid state to another), Isolation (concurrent transactions don't interfere), Durability (committed transactions survive failures). Critical for financial systems, inventory management, and data integrity.
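Atomicity is the easiest of the four properties to demonstrate. The sketch below uses a hypothetical `accounts` table in an in-memory SQLite database, with a CHECK constraint standing in for a business rule (no negative balances). The `transfer` function is illustrative and not taken from any library.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount between accounts; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, 1, 2, 500)  # would overdraw account 1
balances = conn.execute("SELECT balance FROM accounts ORDER BY id").fetchall()
print(ok, balances)  # False [(100,), (50,)]
```

The credit to account 2 succeeds first, but the failed debit rolls the whole transaction back, so neither balance changes.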
A deadlock occurs when two or more transactions each hold locks the others need, so none of them can proceed without intervention. Prevention: access tables and rows in a consistent order, keep transactions short, use lower isolation levels when possible, use lock timeouts, avoid user interaction during transactions. Detection: most databases detect deadlocks automatically and roll back one transaction (the victim), which the application should be prepared to retry.
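The "consistent order" rule is not specific to SQL, so it can be sketched outside a database. The example below is an analogy rather than database code: Python `threading.Lock` objects stand in for row locks on two hypothetical account rows, and two threads run transfers in opposite directions.

```python
import threading

# Hypothetical stand-ins for row locks on two account rows.
row_locks = {1: threading.Lock(), 2: threading.Lock()}
balances = {1: 100, 2: 100}

def transfer(src, dst, amount):
    # Always lock the lower id first, whatever the transfer direction,
    # so two opposite transfers can never each hold one lock and wait
    # on the other.
    first, second = sorted((src, dst))
    with row_locks[first], row_locks[second]:
        balances[src] -= amount
        balances[dst] += amount

def worker(src, dst):
    for _ in range(1000):
        transfer(src, dst, 1)

threads = [
    threading.Thread(target=worker, args=(1, 2)),
    threading.Thread(target=worker, args=(2, 1)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(balances)  # {1: 100, 2: 100}
```

If each transfer instead locked `src` first and `dst` second, the two threads could each take one lock and block on the other. In SQL the equivalent habit is to touch rows and tables in the same order in every transaction.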
1NF: Atomic values, no repeating groups. 2NF: 1NF + no partial dependencies (every non-key attribute depends on the whole of a candidate key, not just part of a composite key). 3NF: 2NF + no transitive dependencies (non-key attributes depend on the key directly, not through another non-key attribute). BCNF: for every non-trivial functional dependency, the determinant is a superkey. Normalization reduces redundancy and update anomalies; denormalizing selectively can improve read performance.
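As an illustration of removing a transitive dependency, the sketch below starts from a hypothetical `orders_flat` table in which `customer_city` depends on `customer_id` rather than on the key `order_id`, and splits it into `customers` and `orders`. All table names and rows are invented for the example, which runs on in-memory SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Violates 3NF: customer_city depends on customer_id, not on order_id.
    CREATE TABLE orders_flat (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER,
        customer_city TEXT
    );
    INSERT INTO orders_flat VALUES (1, 7, 'Oslo'), (2, 7, 'Oslo'), (3, 8, 'Lima');

    -- 3NF decomposition: each customer's city is stored exactly once.
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id)
    );
    INSERT INTO customers SELECT DISTINCT customer_id, customer_city FROM orders_flat;
    INSERT INTO orders SELECT order_id, customer_id FROM orders_flat;
""")

# A customer moving is now a single-row update, so orders cannot disagree.
conn.execute("UPDATE customers SET city = 'Bergen' WHERE customer_id = 7")
rows = conn.execute("""
    SELECT o.order_id, c.city
    FROM orders o
    JOIN customers c USING (customer_id)
    ORDER BY o.order_id
""").fetchall()

print(rows)  # [(1, 'Bergen'), (2, 'Bergen'), (3, 'Lima')]
```

In the flat table the same change would need an update to every order row for that customer, and missing one would leave the data inconsistent.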
UNION combines result sets and removes duplicates (requires sorting/hashing). UNION ALL combines result sets keeping all rows including duplicates (faster). Use UNION when you need unique results; use UNION ALL when duplicates are acceptable or impossible, for better performance.
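A small sketch of the difference, using two hypothetical single-column tables in an in-memory SQLite database. The value 2 appears in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample tables that overlap on the value 2.
    CREATE TABLE a (x INTEGER);
    CREATE TABLE b (x INTEGER);
    INSERT INTO a VALUES (1), (2);
    INSERT INTO b VALUES (2), (3);
""")

# UNION removes the duplicate 2.
union_rows = conn.execute(
    "SELECT x FROM a UNION SELECT x FROM b ORDER BY x"
).fetchall()

# UNION ALL keeps every row from both inputs.
union_all_rows = conn.execute(
    "SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x"
).fetchall()

print(union_rows)      # [(1,), (2,), (3,)]
print(union_all_rows)  # [(1,), (2,), (2,), (3,)]
```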
Window functions perform calculations across a set of related rows without collapsing them into one row per group (unlike GROUP BY). Syntax: function() OVER (PARTITION BY col ORDER BY col). Examples: ROW_NUMBER() for unique sequential numbering, RANK (leaves gaps after ties) and DENSE_RANK (no gaps) for rankings with ties, LAG/LEAD for previous/next values, SUM() OVER for running totals. Essential for analytics and reporting.
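The sketch below computes a running total and the previous row's value per region. It assumes a hypothetical `sales` table and an SQLite build of 3.25 or newer, the first version with window function support; the Python `sqlite3` module uses whatever SQLite version it was built against.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical sample data.
    CREATE TABLE sales (region TEXT, day INTEGER, amount INTEGER);
    INSERT INTO sales VALUES
        ('east', 1, 10), ('east', 2, 20), ('east', 3, 5),
        ('west', 1, 7), ('west', 2, 3);
""")

rows = conn.execute("""
    SELECT region, day, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY day) AS running_total,
           LAG(amount) OVER (PARTITION BY region ORDER BY day) AS prev_amount
    FROM sales
    ORDER BY region, day
""").fetchall()

for row in rows:
    print(row)
# ('east', 1, 10, 10, None)
# ('east', 2, 20, 30, 10)
# ('east', 3, 5, 35, 20)
# ('west', 1, 7, 7, None)
# ('west', 2, 3, 10, 7)
```

Every input row is still present in the output, and both calculations restart at the partition boundary between `east` and `west`.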