Welcome to Software Development on Codidact!
Will you help us build our independent community of developers helping developers? We're small and trying to grow. We welcome questions about all aspects of software development, from design to code to QA and more. Got questions? Got answers? Got code you'd like someone to review? Please join us.
Post History
In SQL Server 2022+, there's a dedicated binning function called DATE_BUCKET. Each of these statements increments DATE_BUCKET with a bucket width of 1 from the origin time: DECLARE @date DATETI...
Answer
#2: Post edited
- In SQL Server 2022+, there's a dedicated binning function [called `DATE_BUCKET`][date_bucket].
- > Each of these statements increments `DATE_BUCKET` with a bucket width of `1` from the origin time:
- >
- > ```sql
- > DECLARE @date DATETIME2 = '2020-04-30 21:21:21';
- > SELECT 'Week', DATE_BUCKET(WEEK, 1, @date)
- > UNION ALL
- > SELECT 'Day', DATE_BUCKET(DAY, 1, @date)
- > UNION ALL
- > SELECT 'Hour', DATE_BUCKET(HOUR, 1, @date)
- > UNION ALL
- > SELECT 'Minutes', DATE_BUCKET(MINUTE, 1, @date)
- > UNION ALL
- > SELECT 'Seconds', DATE_BUCKET(SECOND, 1, @date);
- > ```
- > Here's the result set.
- >
- > | | |
- > |-|-|
- > | Week | 2020-04-27 00:00:00.0000000 |
- > | Day | 2020-04-30 00:00:00.0000000 |
- > | Hour | 2020-04-30 21:00:00.0000000 |
- > | Minutes | 2020-04-30 21:21:00.0000000 |
- > | Seconds | 2020-04-30 21:21:21.0000000 |
- In older versions of SQL Server, [I did date binning][so] with `DATEADD` and `DATEDIFF`, rather than offsetting integers or casting to a string.
- Taking your second faux-CTE temporary table, it would look like this:
- ```sql
- SELECT DATEADD(MONTH, FLOOR(DATEDIFF(MONTH, '2000', q.CreationDate)), '2000') AS bin,
- COUNT(1) Cnt
- INTO #q_cte
- FROM Posts AS q
- INNER JOIN Users AS qu
- ON qu.Id = q.OwnerUserId
- AND qu.Reputation >= 1000
WHERE q.CreationDate BETWEEN '20190101' AND '20210301'AND q.PostTypeId = 1- GROUP BY DATEDIFF(MONTH, '2000', q.CreationDate)
- ```
I'm not sure if it's faster, but I think you could write your query with a single pass through Questions, recording whether each one is answered or not, and group those.- ```sql
SELECT DATEADD(MONTH, ...) AS bin,COALESCE((SELECT 1 FROM ...), 0) AS is_answered,q.ID- INTO #q_cte
- FROM Posts AS q
- WHERE ...;
- SELECT bin, 100. * SUM(is_answered) / COUNT(*) AS ratio
- FROM #q_cte
- GROUP BY bin
- ```
- [date_bucket]: https://learn.microsoft.com/en-us/sql/t-sql/functions/date-bucket-transact-sql?view=sql-server-ver16
- [so]: https://stackoverflow.com/a/41944083/241211
- In SQL Server 2022+, there's a dedicated binning function [called `DATE_BUCKET`][date_bucket].
- > Each of these statements increments `DATE_BUCKET` with a bucket width of `1` from the origin time:
- >
- > ```sql
- > DECLARE @date DATETIME2 = '2020-04-30 21:21:21';
- > SELECT 'Week', DATE_BUCKET(WEEK, 1, @date)
- > UNION ALL
- > SELECT 'Day', DATE_BUCKET(DAY, 1, @date)
- > UNION ALL
- > SELECT 'Hour', DATE_BUCKET(HOUR, 1, @date)
- > UNION ALL
- > SELECT 'Minutes', DATE_BUCKET(MINUTE, 1, @date)
- > UNION ALL
- > SELECT 'Seconds', DATE_BUCKET(SECOND, 1, @date);
- > ```
- > Here's the result set.
- >
- > | | |
- > |-|-|
- > | Week | 2020-04-27 00:00:00.0000000 |
- > | Day | 2020-04-30 00:00:00.0000000 |
- > | Hour | 2020-04-30 21:00:00.0000000 |
- > | Minutes | 2020-04-30 21:21:00.0000000 |
- > | Seconds | 2020-04-30 21:21:21.0000000 |
- In older versions of SQL Server, [I did date binning][so] with `DATEADD` and `DATEDIFF`, rather than offsetting integers or casting to a string.
- Taking your second faux-CTE temporary table, it would look like this:
- ```sql
- SELECT DATEADD(MONTH, FLOOR(DATEDIFF(MONTH, '2000', q.CreationDate)), '2000') AS bin,
- COUNT(1) Cnt
- INTO #q_cte
- FROM Posts AS q
- INNER JOIN Users AS qu
- ON qu.Id = q.OwnerUserId
- AND qu.Reputation >= 1000
- WHERE q.CreationDate BETWEEN '20190101' AND '20210301'
- AND q.PostTypeId = 1
- GROUP BY DATEDIFF(MONTH, '2000', q.CreationDate)
- ```
- I'm not sure if it's faster, but I think you could write your query with a single pass through the Question posts. Record whether each one is answered or not, and group those.
- ```sql
- SELECT q.ID, -- Not actually necessary. Just a helpful check.
- DATEADD(MONTH, ...) AS bin,
- COALESCE((SELECT TOP(1) 1 FROM ...), 0) AS is_answered
- INTO #q_cte
- FROM Posts AS q
- WHERE ...;
- SELECT bin, 100. * SUM(is_answered) / COUNT(*) AS ratio
- FROM #q_cte
- GROUP BY bin
- ```
- [date_bucket]: https://learn.microsoft.com/en-us/sql/t-sql/functions/date-bucket-transact-sql?view=sql-server-ver16
- [so]: https://stackoverflow.com/a/41944083/241211
#1: Initial revision
In SQL Server 2022+, there's a dedicated binning function [called `DATE_BUCKET`][date_bucket]. > Each of these statements increments `DATE_BUCKET` with a bucket width of `1` from the origin time: > > ```sql > DECLARE @date DATETIME2 = '2020-04-30 21:21:21'; > SELECT 'Week', DATE_BUCKET(WEEK, 1, @date) > UNION ALL > SELECT 'Day', DATE_BUCKET(DAY, 1, @date) > UNION ALL > SELECT 'Hour', DATE_BUCKET(HOUR, 1, @date) > UNION ALL > SELECT 'Minutes', DATE_BUCKET(MINUTE, 1, @date) > UNION ALL > SELECT 'Seconds', DATE_BUCKET(SECOND, 1, @date); > ``` > Here's the result set. > > | | | > |-|-| > | Week | 2020-04-27 00:00:00.0000000 | > | Day | 2020-04-30 00:00:00.0000000 | > | Hour | 2020-04-30 21:00:00.0000000 | > | Minutes | 2020-04-30 21:21:00.0000000 | > | Seconds | 2020-04-30 21:21:21.0000000 | In older versions of SQL Server, [I did date binning][so] with `DATEADD` and `DATEDIFF`, rather than offsetting integers or casting to a string. Taking your second faux-CTE temporary table, it would look like this: ```sql SELECT DATEADD(MONTH, FLOOR(DATEDIFF(MONTH, '2000', q.CreationDate)), '2000') AS bin, COUNT(1) Cnt INTO #q_cte FROM Posts AS q INNER JOIN Users AS qu ON qu.Id = q.OwnerUserId AND qu.Reputation >= 1000 WHERE q.CreationDate BETWEEN '20190101' AND '20210301' AND q.PostTypeId = 1 GROUP BY DATEDIFF(MONTH, '2000', q.CreationDate) ``` I'm not sure if it's faster, but I think you could write your query with a single pass through Questions, recording whether each one is answered or not, and group those. ```sql SELECT DATEADD(MONTH, ...) AS bin, COALESCE((SELECT 1 FROM ...), 0) AS is_answered, q.ID INTO #q_cte FROM Posts AS q WHERE ...; SELECT bin, 100. * SUM(is_answered) / COUNT(*) AS ratio FROM #q_cte GROUP BY bin ``` [date_bucket]: https://learn.microsoft.com/en-us/sql/t-sql/functions/date-bucket-transact-sql?view=sql-server-ver16 [so]: https://stackoverflow.com/a/41944083/241211