Showing posts with label Database. Show all posts
Showing posts with label Database. Show all posts

Saturday, June 14, 2025

SQL Exercise From Brent Ozar

This is coming from:  https://www.brentozar.com/archive/2025/06/query-exercise-return-routes-in-the-right-order/

I started this exercise when Brent first posted it, but misread the directions at first and accidentally got most of the right answer, but in the wrong way. I wasn't really selecting the true next Station, I was just iterating through them all and picking up the one that should be next. So I set it aside and forgot about it.

Now that all the answers are out, I was reminded that I needed to finish. Before looking at the answers, I went back to try my hand again. I went a recursive CTE route, but I was unhappy with the Query Plan. I did get the right answer, but it looked gross to me. However, it worked, and it was fewer than 20 lines, so I wasn't too unhappy. It did take me much longer to noodle through than it should have: about 2 hours, going down a couple of wrong paths before I finally found one I was happy with.

; WITH StationList AS (
  ---- GENERATE THE LIST WITH THE NextStationName.
  SELECT s.StationID
     , s.StationName
     , s.StationPhysicalOrder
    , COALESCE( sro.StationToName, LEAD(s.StationName) OVER (ORDER BY s.StationPhysicalOrder) ) AS NextStationName
     ---- If there's an Override Station, use that. Otherwise use regular Station.
  FROM Stations s
  LEFT OUTER JOIN StationRoutingOverride sro ON sro.StationFromName = s.StationName
)
, NO2 AS (
  ---- RECURSIVE CTE TO GET NEXT STATION
  SELECT sl.StationName, sl.NextStationName, rn=1
  FROM StationList sl WHERE sl.StationName = 'A'
  ---- sl.StationName can be changed to start from any station.

  UNION ALL

  SELECT sl2.StationName, sl2.NextStationName, rn+1 AS rn
  FROM StationList sl2
  INNER JOIN NO2 ON no2.NextStationName = sl2.StationName
  ---- Last StationName will have NULL NextStationName and stop the recursion.
  WHERE rn <= 99 ---- To prevent infinite loops
)
SELECT StationName FROM NO2

When I looked at some of the other answers, I was pretty impressed. Swarts STRING_AGG() was pretty cool, and I was glad to see that a lot of the other answers went the recursion route.

I decided to see how mine performed, so I ginned up some more data into the tables to throw at these queries. I dumped a million rows into the Stations table, using

INSERT INTO Stations (StationName, StationPhysicalOrder)
SELECT CAST( NEWID() AS varchar(50) ), n
FROM numbers
WHERE n > 19

to generate some extra (hopefully unique) data. Then created about 500 Override rows. I figured this would give the queries some pretty decent exercise.

DROP TABLE IF EXISTS #tmpInsertOverrrides1
CREATE TABLE #tmpInsertOverrides1 ( rn int identity, f varchar(50) )

DROP TABLE IF EXISTS #tmpInsertOverrrides2
CREATE TABLE #tmpInsertOverrides2 ( rn int identity, t varchar(50) )

INSERT INTO #tmpInsertOverrides1 (f)
SELECT StationName
FROM stations
WHERE StationID IN ( SELECT TOP 500 n FROM numbers ORDER BY NEWID() )

INSERT INTO #tmpInsertOverrides2 (t)
SELECT StationName
FROM stations
WHERE StationID IN ( SELECT TOP 500 n FROM numbers ORDER BY NEWID() )
AND NOT EXISTS (SELECT 1 FROM #tmpInsertOverrides1 WHERE f = StationName)

SELECT * FROM #tmpInsertOverrides2

INSERT INTO StationRoutingOverride (StationFromName, StationToName)
SELECT f, t
FROM #tmpInsertOverrides1 t1
INNER JOIN #tmpInsertOverrides2 t2 ON t1.rn = t2.rn

Again, I tried to be careful to make sure that I didn't have any dupes and that I didn't create any infinite loops. I haven't really done much more than a cursory check on the data to make sure it's good.

I reran my query against the beefed up data. Mine ran in about 2:23 and returned 100 rows. But that was also with the Query Plan turned on.

I tried the string stuffing query, but it broke because my string was too long. (STRING_AGG aggregation result exceeded the limit of 8000 bytes. Use LOB types to avoid result truncation.) That was my fault with the test data. Next one I tried was the function-based approach. But after 13 minutes, I abandoned that one. I'm still not sure if it was something in my data. The query still looks pretty cool. Michael Swart has some pretty cool query ideas.

I tried the Ugly Stations one from Greg Dodd. That one exceeded max recursion, so I didn't fully test at one. Again, the query met the rules, this was my ridiculous data that I threw this at.

I next tried Ilya Makarov's CTE solution. The OUTER APPLY solution was pretty slick. I had initially tried an OUTER APPLY but couldn't get it to work like I wanted. This one ran for almost 2 minutes before it too hit Max Recursion and failed. Again, my data's fault.

Then came Michael Bogdanov's answer. It too crashed on Max Recursion, but it did it in just 7 seconds. This one was flying.

The last one I tried was the Brute Force version by Brian Boodman. It ran for over 7 minutes before I had to kill it. It angered my SSMS.


This was an interesting exercise. And all of these answers followed the rules as they were laid out. Things didn't get weird with some of them until I threw a stupid amount of data at them. That'll cause any query to break, if it wasn't intended to run against that data.

To be fair, the first time I ran my query against the bulky data, it too crashed on Max Recursion. So I took a couple of minutes to refactor part of it and put in a check to stop it if it recursed too far. I'm kind of impressed that it actually finished. And after seeing that it finished quicker than some of the others, I'm not as disappointed with it as I was at first. But again, I also had a bit of an insider's view of the new data I added. That wasn't in the original requirements, so please nobody take this as me bashing their queries. I changed the rules at the end game, and I'm impressed with most of the answers to the given problem. I have a tendency to over-think the negative possibilities which often leads me to over-engineer the solution to the problem.

I think I may have been a bit overzealous with my fake data.


Also, be sure to check out Brent's fantastic blog.

Brent Ozar

Friday, January 15, 2021

The +1 Is Now Done...

 My MCSx journey has come to an end. I previously wrote about obtaining MCSE: Data Management and Analytics., and I have now finished up the one test that I didn't have to take, but just wanted to take. Yesterday, I completed 70-761 to earn a second MCSA. I know it's actually a lower cert than the MCSE I just earned, but I already paid for the study materials and the voucher to take the test. 

My original goal was to take the tests for the two MCSAs and combine them into the MCSE, but that got re-prioritized to just aim for the one MCSA then the MCSE. And to be honest, I originally thought that I could take this test without any effort. It's on SQL Development, which I've been doing for many, many years. I thought I'd just take a practice test to see if there was anything I needed to work on and then go in and take the cert when I did the first test. 

So I took the practice test and got a 36%. Talk about an ego-buster. I was extremely disappointed in myself. I'm not sure what I did, but that test got immediately de-prioritized and moved to the back. I pretty much wrote it off so I could focus on the others that I actually needed to do. 

After I passed the first two, I ordered the study materials for both 70-762 and 70-761 together, since I got a discount for getting both at the same time. I set the 761 book aside and didn't look at it until I finished the MCSE

After Christmas, I pretty much decided I had a month until Microsoft retired the exam, and I had already bought the stuff, so why not. If I borked it again, no biggie; I'd already spent the money. It was very nice to have absolutely no pressure on this thing. 

I used pretty much the same study method I used for the other tests. 

Study materials:
Udemy Courses
MeasureUp tests - which was the practice test that I did so horrible on
Microsoft Press Exam Ref books
Exam Skills
Microsoft.com Exam Reference

MCSA: SQL 2016 Database Development

This test was difficult, but mostly because of some of the answers weren't very good answers. There were several questions that required me to write out the query, that weren't difficult, but were time-consuming. There were other questions that had correct answers that used old-style ANSI 89 JOINS, like SELECT * FROM x,y ..., that absolutely drove me nuts. I really should have left a comment on those questions, but the cert is being retired in a couple of weeks anyway. Meh.

When I submitted the test, I knew I'd passed it this time, but wasn't sure by how much. I crushed it: 900. I feel slightly vindicated for the horrible practice test score. 

I didn't need to take that test, but it added one more cert to my resume. In addition to the previous MCSE: Data Management and Analytics and MCSA: SQL 2016 Database Administration, I now also have a MCSA: SQL 2016 Database Development.  

MCSA: SQL 2016 Database Development
Just in time for retirement

Friday, December 11, 2020

What do you call the person who graduated last in their med school class?

Doctor.

It's an old joke, but one I very much appreciated today. Just like the person last in class, I would like to have done better: I needed 700; I got 721. I was pretty close. But, at the end of the day, this still belongs to me....

Badge for MCSE: Data Management & Analytics

I knocked out 70-762 today. 

Study materials:
Udemy Courses
MeasureUp tests
Microsoft Press Exam Ref books
Exam Skills
Microsoft.com Exam Reference

MCSE: Data Management and Analytics

Since I began my professional geek journey a little over two decades ago, I've known about the big Microsoft MCSE Certification. It's changed a bit over the years, but I honestly never thought I'd be one of the people to obtain one. On my journey to MCSE, I am now 3 for 3 (well, technically 4 for 5, but we don't count MCSA for SQL 2012/14, right?). That's 70-764 and 70-765 for an MCSA: Database Administration. And today caps off 70-762 for an MCSE: Data Management & Analytics

I know Microsoft is retiring MCSx at the end of January, but this was a requirement for where I work. I've  only had a few months to knock out all 3, but I'm a little proud of myself for actually pulling it off. It helped that I've been a database person for several years now, but it REALLY DIDN'T HELP that COVID has played a little bit of havoc with our ability do things like be around a proctor. I've been fortunate that we have a proctor site on base that I can take these tests in. There is absolutely ZERO chance that I could make my home office an acceptable place for an exam proctor.

Since I have access to that on-base proctoring, I think I'll probably also take 70-761 and add on the MCSA: Database Development while I can.

And since Microsoft has apparently decided that Azure will rule the world, I guess I'll be taking a look at that next. 

I'm going to try to enjoy my shiny new MCSE for the last couple of weeks of the sh*t-show that has been 2020, and not even think about certification stuff until January. 

Yeah, right.

Friday, November 27, 2020

Two Down.....

One (+1) to go.... And an MCSA!

Up next is 70-765.

Study materials:


I studied hard for 70-765, and I procrastinated a bit again on going in to take this test. I wanted to knock all 3 of these out before Thanksgiving, but my Imposter Syndrome was really flaring up. I was slow. And then, because of COVID, I was delayed trying to get into my testing center. I could have done a remote proctored exam, but that would pretty much meant finding an alternate, quiet place to go for it. There's no way possible that I'd be able to clear my home office enough to satisfy the proctors. Plus, I've read some horror stories about proctors ending tests if someone just looked away from their screen. No thanks. 

Fortunately, the Army base that I work on has an Authorized Testing Center right across the street from my office, so guess where I decided to take the test.

Unfortunately, the main proctor there was quarantined because of COVID, and the other proctors were either unavailable or had tech issues with Pearson. The clock was ticking, and I was running out of time. 

Fortunately, I was able to schedule a slot THE DAY BEFORE THANKSGIVING! Huge Thank You to the people over there who were able to schedule me in right before a holiday on short notice. 

This test was harder than the first, but once again, I finished with plenty of time left. I think it took me just a little over an hour to finish. I thought I'd tanked this one, too. But the voucher I bought included a free retake, so I could study over Thanksgiving and retake it the next week. No biggie. 

70-765 Badge

Final Score: 790 :-)

It wasn't spectacular, but it was PASSING. I got to breathe a huge sigh of relief and know that I'd get to at least enjoy my Thanksgiving break. 








Best of all, I got to call my wife and tell her I had now earned one of these:





















Now all that's left is just to cap it off with 70-762 and the MCSE. I got this. 

Wednesday, October 21, 2020

One Down ...

Two (+1) to go....

I work as a DBA for a government contractor. I've been in the same seat for almost 4 years, but, since government contracting follows a different set of expectations, I'm currently on my 3rd employer. My first company wanted me to get an MCSA: SQL <WHATEVER>, and I began it on the SQL 2012/2014 path: 70-461, 462, 463

At the time, we were using SQL Server 2008, 2008R2 and 2012. I passed 461, then missed 462 by a couple of points, and while I was studying up to retake 462, we upgraded to SQL 2016 (another fun adventure that I need to write up) and got rid of anything lower. 

For an MCSA on SQL 2016, it only required 70-764 and 765, so it was also 2 more tests. I figured that route would be much more applicable to my work, so I abandoned 462 and switched to studying up for the SQL 2016 certs. Then there was some shuffling at my original company, and I ultimately got traded to another contractor earlier this year. The MCSA was no longer a requirement, and I had a lot of other priorities that caused me to keep procrastinating on the cert. 

Big mistake. I should have finished it, but it is what it is. 

In late July, I got traded again, to my current contractor. They wanted an MCSE: Data Management and Analytics. That's when the real fun began. 

When I dug back into the certs, I found out that Microsoft decided to retire their MCSA and MCSE certs in June. However, because COVID-19 has made 2020 a bit of an "interesting" year, they extended the retirement to January 31, 2021 (after a non-trivial amount of complaints from those currently in the testing cycle). Despite the retirement, my company still wanted the MCSE. Thus restarts the adventure. 

Chosen Path:
Since 762 is part of the MCSA: Database Development, I figured I'd also do test 761 before the end of January and get that MCSA, too. Why not, right?

First up was 70-764

Study materials:
The test was 44 questions and 2 hours to complete. I finished in 58 minutes and had a minor panic attack. I only marked 3 questions for review and it wouldn't let me go back to others. There was no way I could have finished that fast without seriously screwing something up. I went back to do as much review as I could on those 3 questions and finished up with only about another 5 minutes left on the clock. Time to see how bad I failed it...

70-764 Badge

Final Score: 824 :-)

I can promise you there was nobody more surprised than me to see that score pop up. 

But I did it. And I was thrilled

I finished the first one and now only had to knock out two more. (Did I also mention that in October I had to renew my Flight Instructor certificate? Such a fun month. But I digress..)

Now, on to the next one. 

Wednesday, February 10, 2016

Oracle Number Formatting

This may be something you all already know, but it had me scratching my head for a minute. On the one hand, Oracle’s analytic functions like LAST_VALUE make some things kinda cool, but then Oracle smacks me with not handling number masking like pretty much every other language.

Even if you specify a decimal place mask, if the number ends with a 0 in the decimal, it will drop it.

Example:
1) To_Number(44.50,’99.99’) >> 45.5
2) To_Number(100/10.0,’99.99’) >> 10  <<< This one even drops the decimal and returns an integer 10, which leads to a whole different philosophical discussion about the differences between 10 and 10.00.

To get the decimal places mask that we want in Oracle, we have to use To_Char(44.50,’99.99’). But then that value is a string, so you won’t be able to do math on it. Except in ColdFusion, where it’s a numberish string. So to actually format it, we have to remask it in whatever code we use to output the value – ie in ColdFusion = NumberFormat(44.5,’99.90’) >> 44.50 and NumberFormat(100/10.0,’99.00’) >> 10.00

Add this to ColdFusion’s method of determining how to parse a value into JSON, and it makes things super fun. :-/

Moral of my story: If you need a number to be in a specific  format, don’t assume the database will give it back to you the way you thought you told it to.

Monday, December 21, 2015

Oracle Goodness

I've spent years in SQL Land, and I consider myself fairly well informed about data. But after working with some of the people I have, I realize that there is so much that I don't know. I do know enough to do what I need to do without causing smoke and fire and other general mayhem, but just about everything I know now I've had to discover and learn myself. I've had a few good mentors that pointed out ways to tell if I'm writing slow, inefficient queries (and I've done plenty of that), and my views have shifted towards DB performance in my queries. I've also come to realize why software developers and DBAs often seem to be at odds with each other. Some of the worst queries I ever wrote were when I was writing in-code queries and didn't understand their impact. One of my previous databases had several related tables with 10s of millions of rows each. That's not BIG data in the modern sense, but that's A LOT of data. More than enough to let me easily see the difference between my code-based reporting query that ran in 2 hours and the DBA-optimized SQL that ran in 2 seconds. The SQL looked weird, but it was MUCH faster. Big lesson learned there. A database isn't just a place to store and retrieve data. You can get lots more from it by knowing how to ask nicely rather than trying to brute force the data you want. It's about getting SETS of data rather than simply trying to filter data. And it's about letting the database do the job it was made to do.

I like T-SQL, and I learned quite a bit about it in my reporting. And now that I'm back in Developer Land and making the move to an Oracle system, I'm seeing a few things I don't like about PL-SQL, but that's likely because I'm just not used to it. Only time will tell.

However, there are a few things I do like quite a bit. For instance, when making a new stored procedure (my current task), I used to have to do something like this:

IF OBJECT_ID('xxxxx') IS NULL EXEC('CREATE PROCEDURE xxxxx ...') GO 
ALTER PROCEDURE xxxxx ...

Or do several lines of logic with checking sproc existence in a system view. It wasn't hard, but it was long and involved.

MySQL was a bit better, but that wasn't my primary environment:

DROP PROCEDURE IF EXISTS xxxxx ;
CREATE PROCEDURE xxxxx ...

Though dropping and recreating db objects is never nice to existing permissions. :-/

With Oracle, I get to do:

CREATE OR REPLACE xxxxx ...

It's nice to have that kind of functionality already built in without having to worry about killing permissions. SO much easier.

Now, on to build some sprocs....