Sunday, June 15, 2025

Saturday, June 14, 2025

SQL Exercise From Brent Ozar

This is coming from:  https://www.brentozar.com/archive/2025/06/query-exercise-return-routes-in-the-right-order/

I started this exercise when Brent first posted it, but misread the directions at first and accidentally got most of the right answer, but in the wrong way. I wasn't really selecting the true next Station, I was just iterating through them all and picking up the one that should be next. So I set it aside and forgot about it.

Now that all the answers are out, I was reminded that I needed to finish. Before looking at the answers, I went back to try my hand again. I went a recursive CTE route, but I was unhappy with the Query Plan. I did get the right answer, but it looked gross to me. However, it worked, and it was fewer than 20 lines, so I wasn't too unhappy. It did take me much longer to noodle through than it should have: about 2 hours, going down a couple of wrong paths before I finally found one I was happy with.

; WITH StationList AS (
  ---- GENERATE THE LIST WITH THE NextStationName.
  SELECT s.StationID
     , s.StationName
     , s.StationPhysicalOrder
    , COALESCE( sro.StationToName, LEAD(s.StationName) OVER (ORDER BY s.StationPhysicalOrder) ) AS NextStationName
     ---- If there's an Override Station, use that. Otherwise use regular Station.
  FROM Stations s
  LEFT OUTER JOIN StationRoutingOverride sro ON sro.StationFromName = s.StationName
)
, NO2 AS (
  ---- RECURSIVE CTE TO GET NEXT STATION
  SELECT sl.StationName, sl.NextStationName, rn=1
  FROM StationList sl WHERE sl.StationName = 'A'
  ---- sl.StationName can be changed to start from any station.

  UNION ALL

  SELECT sl2.StationName, sl2.NextStationName, rn+1 AS rn
  FROM StationList sl2
  INNER JOIN NO2 ON no2.NextStationName = sl2.StationName
  ---- Last StationName will have NULL NextStationName and stop the recursion.
  WHERE rn <= 99 ---- To prevent infinite loops
)
SELECT StationName FROM NO2

When I looked at some of the other answers, I was pretty impressed. Swarts STRING_AGG() was pretty cool, and I was glad to see that a lot of the other answers went the recursion route.

I decided to see how mine performed, so I ginned up some more data into the tables to throw at these queries. I dumped a million rows into the Stations table, using

INSERT INTO Stations (StationName, StationPhysicalOrder)
SELECT CAST( NEWID() AS varchar(50) ), n
FROM numbers
WHERE n > 19

to generate some extra (hopefully unique) data. Then created about 500 Override rows. I figured this would give the queries some pretty decent exercise.

DROP TABLE IF EXISTS #tmpInsertOverrrides1
CREATE TABLE #tmpInsertOverrides1 ( rn int identity, f varchar(50) )

DROP TABLE IF EXISTS #tmpInsertOverrrides2
CREATE TABLE #tmpInsertOverrides2 ( rn int identity, t varchar(50) )

INSERT INTO #tmpInsertOverrides1 (f)
SELECT StationName
FROM stations
WHERE StationID IN ( SELECT TOP 500 n FROM numbers ORDER BY NEWID() )

INSERT INTO #tmpInsertOverrides2 (t)
SELECT StationName
FROM stations
WHERE StationID IN ( SELECT TOP 500 n FROM numbers ORDER BY NEWID() )
AND NOT EXISTS (SELECT 1 FROM #tmpInsertOverrides1 WHERE f = StationName)

SELECT * FROM #tmpInsertOverrides2

INSERT INTO StationRoutingOverride (StationFromName, StationToName)
SELECT f, t
FROM #tmpInsertOverrides1 t1
INNER JOIN #tmpInsertOverrides2 t2 ON t1.rn = t2.rn

Again, I tried to be careful to make sure that I didn't have any dupes and that I didn't create any infinite loops. I haven't really done much more than a cursory check on the data to make sure it's good.

I reran my query against the beefed up data. Mine ran in about 2:23 and returned 100 rows. But that was also with the Query Plan turned on.

I tried the string stuffing query, but it broke because my string was too long. (STRING_AGG aggregation result exceeded the limit of 8000 bytes. Use LOB types to avoid result truncation.) That was my fault with the test data. Next one I tried was the function-based approach. But after 13 minutes, I abandoned that one. I'm still not sure if it was something in my data. The query still looks pretty cool. Michael Swart has some pretty cool query ideas.

I tried the Ugly Stations one from Greg Dodd. That one exceeded max recursion, so I didn't fully test at one. Again, the query met the rules, this was my ridiculous data that I threw this at.

I next tried Ilya Makarov's CTE solution. The OUTER APPLY solution was pretty slick. I had initially tried an OUTER APPLY but couldn't get it to work like I wanted. This one ran for almost 2 minutes before it too hit Max Recursion and failed. Again, my data's fault.

Then came Michael Bogdanov's answer. It too crashed on Max Recursion, but it did it in just 7 seconds. This one was flying.

The last one I tried was the Brute Force version by Brian Boodman. It ran for over 7 minutes before I had to kill it. It angered my SSMS.


This was an interesting exercise. And all of these answers followed the rules as they were laid out. Things didn't get weird with some of them until I threw a stupid amount of data at them. That'll cause any query to break, if it wasn't intended to run against that data.

To be fair, the first time I ran my query against the bulky data, it too crashed on Max Recursion. So I took a couple of minutes to refactor part of it and put in a check to stop it if it recursed too far. I'm kind of impressed that it actually finished. And after seeing that it finished quicker than some of the others, I'm not as disappointed with it as I was at first. But again, I also had a bit of an insider's view of the new data I added. That wasn't in the original requirements, so please nobody take this as me bashing their queries. I changed the rules at the end game, and I'm impressed with most of the answers to the given problem. I have a tendency to over-think the negative possibilities which often leads me to over-engineer the solution to the problem.

I think I may have been a bit overzealous with my fake data.


Also, be sure to check out Brent's fantastic blog.

Brent Ozar

Wednesday, October 2, 2024

My Presentation At ColdFusion Summit 2024


I recently completed my presentation on Impostor Syndrome, I'm Just Here For The T-Shirt, at Adobe ColdFusion Summit 2024. Several people asked me for the link, so here it is: https://slides.com/codefumonkey/hfts-cfs24

I had a great turnout and received some very good questions and complements after my session. I gave away a copy of Valerie Young's book, but I apologize, because I forgot to write down the recipient's name, so please shoot me an email if you see this. 

I'll also do my best to post my write-up of the conference after I get home. 

 Thank you to Adobe for hosting a great conference. 

 And a Great Big Thank You to those of you who came out to my session!

Saturday, January 13, 2024

Another Holiday Puzzle - Hanukkah of Data - Speed Run

Hanakkuh of Data

A database puzzle

** WARNING: SPOILERS AHEAD **

** SPEEDRUN **

The SpeedRun wasn't a whole lot different than the original run through. There was a new, slightly more complex dataset, but the problems were almost exactly the same.

** PART 0 **

The code to unlock the files was the same as the previous set.

** PART 1 **

This one was also the same, but with the new data, it returned a different person to seed the rest of the problems with.

** PART 2 **

This one was also very similar, but the initials of the Contractor were different. There was also a slight change in the product SKUs that neded to be applied.

** PART 3 **

Since the person from the previous query changed, we had to change that value in the speedrun query. It also changed the birthdate of the Neighbor, so that required a couple of seconds of Google-fu to find dates for Libras in The Year of the Goat.

** PART 4 **

This one didn't require any changes. It pulled up different results, but otherwise was the same as the regular queries.

** PART 5 **

I had to run this one twice. This was the only one that I missed on the first try. I didn't catch that the Cat Lady was no longer in "Staten Island". My initial try gave me the wrong person, but when I removed that condition, I got the correct answer.

** PART 6 **

This one required no changes. I just returned a different person who was used in Part 7.

** PART 7 **

This one took me the longest of all of these. The data changes required me to make a few additions to the query. I changed the customer.id in the WHERE filters and added two new SKUs (HOM and TOY). There was an issue with my JOIN condition where I spit out the colors, but removing that condition gave me the right answer. Since color was such an important part of the original query, I'm guessing that this condition was probably intended, but I got lucky with my query. I'll take another look at this one, but for now, I've got the right answer.

** PART 8 **

The final question once again didn't require any changes.

The Speed Run was also pretty fun, but since the questions were essentially the same, not a lot of figuring was required for the queries. I don't know if that just means the data wasn't that much different or that I just wrote the queries originally that caught issues that were introduced in the speed run.

It took me 31 minutes to complete, and I only had one wrong answer in Part 5.





BACK TO PART 1
BACK TO PART 2
BACK TO PART 3
BACK TO PART 4
BACK TO PART 5
BACK TO PART 6
BACK TO PART 7
BACK TO PART 8

Thursday, January 4, 2024

Another Holiday Puzzle - Hanukkah of Data - Part 8

Hanakkuh of Data

A database puzzle

** WARNING: SPOILERS AHEAD **

** DAY 8 - THE COLLECTOR **

"Oh that damned woman! She moved in, clogged my bathtub, left her coupons all over the kitchen, and then just vanished one night without leaving so much as a note.

Except she did leave behind that nasty carpet. I spent months cleaning one corner, only to discover a snake hiding in the branches! I knew then that she was never coming back, and I had to get it out of my sight.

"Well, I don't have any storage here, and it didn't seem right to sell it, so I gave it to my sister. She wound up getting a newer and more expensive carpet, so she gave it to an acquaintance of hers who collects all sorts of junk. Apparently he owns an entire set of Noah's collectibles! He probably still has the carpet, even.

"My sister is away for the holidays, but I can have her call you in a few weeks."

The family dinner is tonight! Can you find the collector's phone number in time?

This is the final day, and it seems way too easy. The only things that look like clues are the statement about the acquaintance who collects all sorts of junk and that he owns an "entire set of Noah's collectibles". This seems like I'm looking for someone who purchased a lot of the items that are labeled 'Noah%'.

What we know:

  1. The Collector has an entire set of Noah's collectibles.

So first, I need to look at the products to verify which ones are Noah's Collectibles.


        SELECT * FROM products WHERE descr LIKE 'Noah%' OR sku LIKE 'COL%'
    

This shows that I won't be able to use products.sku, because some of the items are listed under 'COL%' and some under 'TOY%'. All of the 'COL' items are Noah's Collectibles, but the 'TOY' items aren't. But it does look like I can grab all of the collectibles if I just use product.descr LIKE 'Noah%`. Of course, this is another assumption on my part. Are the 'TOY's also Collectibles, or is it just items listed with 'COL'?

    SELECT c.customerid, c.name, c.phone
    FROM customers c
    INNER JOIN (
        SELECT TOP 1 o.customerid, count(*) AS cnt
        FROM orders o
        INNER JOIN orders_items oi ON o.orderid = oi.orderid
        INNER JOIN products p ON oi.sku = p.sku
            AND p.descr LIKE 'Noah%'
        GROUP BY o.customerid
        ORDER BY cnt DESC
    ) s1 ON c.customerid = s1.customerid

Entering the phone number shows this is correct, too.

All days have been completed.


"Oh yes, that magnificent Persian carpet! An absolute masterpiece, with a variety of interesting animals congregating around a Tree of Life. As a collector, I couldn't believe when it fell into my lap.

"A friend of mine had taken it off her brother's hands, and she didn't know what to do with it. I saw her one day, and she was about to put an old rug out at the curb. It looked like it had been through a lot, but it was remarkably not that dirty. It still took quite a bit of effort and no small amount of rug cleaner, but ultimately, I managed to get the last bits of grime out of it.

"I actually live right down the street from Noah's Market - I'm a huge fan and I shop there all the time! I even have a one-of-a-kind scale model of Noah's Ark that makes a complete set of Noah's collectibles.

"I would love for Noah to have his rug once again to enjoy."

** CONCLUSION **

These were interesting puzzles. The querying part wasn't that difficult, but just figuring out what was being asked was extremely difficult for some of these. I'm not sure that this really tested my SQL skills as much as it tested my ability to determine what it was that the customer was actually asking for.

Overall, it was pretty fun. Now that we've completed the first batch, I think I'll try to see what the speed run is like.





BACK TO PART 1
BACK TO PART 2
BACK TO PART 3
BACK TO PART 4
BACK TO PART 5
BACK TO PART 6
BACK TO PART 7
BACK TO PART 8

Another Holiday Puzzle - Hanukkah of Data - Part 7

Hanakkuh of Data

A database puzzle

** WARNING: SPOILERS AHEAD **

** DAY 7 - THE MEET CUTE **

"Oh that tapestry, with the colorful toucan on it! I'll tell you what happened to it.

"One day, I was at Noah's Market, and I was just about to leave when someone behind me said 'Miss! You dropped something!'

"Well I turned around to see this cute guy holding an item I had bought. He said, 'I got the same thing!' We laughed about it and wound up swapping items because I wanted the color he got. We had a moment when our eyes met and my heart stopped for a second. I asked him to get some food with me and we spent the rest of the day together.

"Before long I moved into his place, but the romance faded quickly, as he wasn't the prince I imagined. I left abruptly one night, forgetting the tapestry on his wall. But by then, it symbolized our love, and I wanted nothing more to do with it. For all I know, he still has it."

Can you figure out her ex-boyfriend's phone number?

This is another one where we'll need info from the previous query. We need to know the customerid of The Bargain Hunter, since the Ex-boyfriend bought the same item at the same time. But the item was a different color, so we'll also need to look at Products with a color description. I think that's all the actual clues in the text.

What we know:

  1. The Ex-boyfriend bought an item at Noah's Market the same day as The Bargain Hunter.
  2. The Bargain Hunter's customerid is 4167
  3. Colored items are designated in product.sku as 'COL%'

So first, let's find the orders that were placed on the same day as The Bargain Hunter.

    SELECT o1.customerid, o1.orderid, o2.customerid, o2.orderid
    FROM orders o1
    INNER JOIN orders o2 ON o1.customerid <> o2.customerid
        AND YEAR(o1.ordered) = YEAR(o2.ordered)
        AND MONTH(o1.ordered) = MONTH(o2.ordered)
        AND DAY(o1.ordered) = DAY(o2.ordered)
    WHERE o1.customerid = 4167

This gives us over 3000 orders. This also shows that we probably need to simplify date searching a little bit. orders.ordered is a datetime datatype. We want to truncate that date to just `YYYY-MM-DD`, and we can do this by simply using CAST(orders.ordered AS date). That will trim off the time part. That way we aren't pulling out each part. We can go back to that if we need to later.

We also need to narrow these orders down to just those items with a color listed. Looking at the data shows us that it's part of products.descr.

    /* Get Sherri's orders */
    SELECT o.ordered, oi.sku, p.descr, CAST(o.ordered AS date) AS truncDate
    FROM orders o
    INNER JOIN orders_items oi ON o.orderid = oi.orderid
    INNER JOIN products p ON oi.sku = p.sku
        AND p.sku LIKE 'COL%'
    WHERE o.customerid = 4167

This drops it down to just 4 orders. That should be easy to work with. Now we just need to find other orders placed on that date.

    /* Find Orders from same time as Sherri's */
    ; WITH sherriOrders AS (
        SELECT o.ordered, oi.sku, p.descr, CAST(o.ordered AS date) AS truncDate
        FROM orders o
        INNER JOIN orders_items oi ON o.orderid = oi.orderid
        INNER JOIN products p ON oi.sku = p.sku
            AND p.sku LIKE 'COL%'
        WHERE o.customerid = 4167
    )
    SELECT c.customerid, c.name, c.phone, o.*, so.*,  ABS(DATEDIFF(hour,o.ordered,so.ordered)) AS diffHrs
    FROM orders o
    INNER JOIN orders_items oi ON o.orderid = oi.orderid
    INNER JOIN products p ON oi.sku = p.sku
        --AND p.sku LIKE 'COL%'
    INNER JOIN sherriOrders so ON oi.sku = so.sku
        AND CAST(o.ordered AS date) = so.truncDate
        AND o.ordered <= so.ordered
        --AND DATEDIFF(hour,o.ordered,so.ordered) < 1 /* Orders less than 1 hour apart. */
    INNER JOIN customers c ON o.customerid = c.customerid
    WHERE o.customerid <> 4167 /* Exclude Sherri */

    /* NO RESULTS */
    

That's not good. That shouldn't have eliminated everything. There should be other orders that happened at the same time.

After digging at the data, I realized that I was eliminating the data that I actually needed. I was comparing Product SKUs from both orders, but the text says that these items were different colors. They'll have different SKUs, so I need to compare the item name without the colors.

    /* Find Orders from same time as Sherri's */
    ; WITH sherriOrders AS (
        SELECT o.ordered, oi.sku, p.descr
            , SUBSTRING(p.descr,1,CHARINDEX('(',p.descr)-2) AS prodName
            , CAST(o.ordered AS date) AS truncDate
        FROM orders o
        INNER JOIN orders_items oi ON o.orderid = oi.orderid
        INNER JOIN products p ON oi.sku = p.sku
            AND p.sku LIKE 'COL%'
        WHERE o.customerid = 4167 /* Sherri */
    )
    SELECT c.customerid, c.name, c.phone --,  o.*, oi.*,p.*, so.*
    FROM orders o
    INNER JOIN customers c ON o.customerid = c.customerid
    INNER JOIN orders_items oi ON o.orderid = oi.orderid
    INNER JOIN products p ON oi.sku = p.sku
        AND p.sku LIKE 'COL%'
    INNER JOIN sherriOrders so ON
        SUBSTRING(p.descr,1, CASE WHEN CHARINDEX('(',p.descr) IS NULL THEN LEN(p.descr) ELSE CHARINDEX('(',p.descr)-2 END ) = so.prodName /* THIS IS AN INSANE SPLITTING */
        AND CAST(o.ordered AS date) = so.truncDate
        AND ABS(DATEDIFF( minute, o.ordered,so.ordered )) < 1 /* USE ABS TO ELIMINATE NEGATIVES */
    WHERE o.customerid <> 4167

    /*
    customerid  name          phone
    5783        Carlos Myers  838-335-7157
    */
    

That one gets us down to just one result. Entering that phone number gives us another right answer.





GO TO DAY 8

BACK TO PART 1
BACK TO PART 2
BACK TO PART 3
BACK TO PART 4
BACK TO PART 5
BACK TO PART 6

Another Holiday Puzzle - Hanukkah of Data - Part 6

Hanakkuh of Data

A database puzzle

** WARNING: SPOILERS AHEAD **

** DAY 6 - THE BARGAIN HUNTER **

"Why yes, I did have that rug for a little while in my living room! My cats can't see a thing but they sure chased after the squirrel on it like it was dancing in front of their noses.

"It was a nice rug and they were surely going to ruin it, so I gave it to my cousin, who was moving into a new place that had wood floors.

"She refused to buy a new rug for herself - she said they were way too expensive. She's always been very frugal, and she clips every coupon and shops every sale at Noah's Market. In fact I like to tease her that Noah actually loses money whenever she comes in the store.

"I think she's been taking it too far lately though. Once the subway fare increased, she stopped coming to visit me. And she's really slow to respond to my texts. I hope she remembers to invite me to the family reunion next year."

Can you find her cousin's phone number?

Hmmm.... seems like there's some misdirection language in here. The Cat Lady talks about her cousin with wood floors. That's not really tracked in our data, so that can't really be a clue. The only thing she really says though is that her cousin is very cheap and shops only sales at Noah's Market. I think this one took sparse clues to a whole new level.

What we know:

  1. Cousin is very cheap.
  2. Cousin shops every sale at Noah's Market. She mentions Noah loses money when she comes in, but is that a clue?

These "clues" are getting very vague. They require a significant amount of interpretation of what the text even says. I'm not even sure exactly what clues this one is giving me.

The only clue I really see is that this Cousin shops all sales at Noah's. There doesn't seem to be any direct indicator of what a sale is, but there is a product.wholesale_cost column and an orders_items.unit_price columns. Hopefully, if I compare these numbers and something will pop out.

    /* Look for sales orders that were less than normal Product price. */
    SELECT *
    FROM orders_items oi
    INNER JOIN products p ON oi.sku = p.sku
    WHERE oi.unit_price < p.wholesale_cost

Good. I got back over 2000 rows, so this should be a way I can figure out what was on sale.

    /* Who made those Orders */
    ; WITH salesOrders AS (
        SELECT TOP 1 o.customerid, count(*) AS cnt
        FROM orders_items oi
        INNER JOIN products p ON oi.sku = p.sku
        INNER JOIN orders o ON oi.orderid = o.orderid
        WHERE oi.unit_price < p.wholesale_cost
        GROUP BY o.customerid
        ORDER BY cnt DESC
    )
    SELECT c.customerid, c.name, c.phone
    FROM customers c
    INNER JOIN salesOrders so ON c.customerid = so.customerid

    /*
    customerid  name         phone
    4167        Sherri Long  585-838-9161
    */
    

After GROUPing, JOINing and ORDERing those records to find who had the most orders on sale, I came back with only one record. Entering her phone number shows her as the correct answer again.

This one was mind-bogglingly difficult just to get what the clue was. It required a little bit of looking at the words in the problem and then checking the data to realize there were differences in pricing between sale and wholesale. These clues make you think about more than just data.





GO TO DAY 7

BACK TO PART 1
BACK TO PART 2
BACK TO PART 3
BACK TO PART 4
BACK TO PART 5