問題描述
在 SQL Server 2008 中運行 WHILE 或 CURSOR 或兩者 (Running WHILE or CURSOR or both in SQL Server 2008)
I am trying to run a loop of some sort in SQL Server 2008/TSQL and I am unsure whether this should be a WHILE
or CURSOR
or both. The end result is I am trying to loop through a list of user logins, then determine the unique users, then run a loop to determine how many visits it took for the user to be on the site for 5 minutes , broken out by the channel.
Table: LoginHistory
UserID Channel DateTime DurationInSeconds
1 Website 1/1/2013 1:13PM 170
2 Mobile 1/1/2013 2:10PM 60
3 Website 1/1/2013 3:10PM 180
4 Website 1/1/2013 3:20PM 280
5 Website 1/1/2013 5:00PM 60
1 Website 1/1/2013 5:05PM 500
3 Website 1/1/2013 5:45PM 120
1 Mobile 1/1/2013 6:00PM 30
2 Mobile 1/1/2013 6:10PM 90
5 Mobile 1/1/2013 7:30PM 400
3 Website 1/1/2013 8:00PM 30
1 Mobile 1/1/2013 9:30PM 200
SQL Fiddle to this schema
I can select
the unique users into a new table like so:
SELECT UserID
INTO #Users
FROM LoginHistory
GROUP BY UserID
Now, the functionality I'm trying to develop is to loop over these unique UserIDs, order the logins by DateTime, then count the number of logins needed to get to 300 seconds.
The result set I would hope to get to would look something like this:
UserID TotalLogins WebsiteLogins MobileLogins Loginsneededto5Min
1 4 2 2 2
2 2 2 0 0
3 3 3 0 3
4 1 1 0 0
5 2 1 1 2
If I were performing this in another language, I would think it would something like this: (And apologies because this is not complete, just where I think I am going)
for (i in #Users):
TotalLogins = Count(*),
WebsiteLogins = Count(*) WHERE Channel = 'Website',
MobileLogins = Count(*) WHERE Channel = 'Mobile',
for (i in LoginHistory):
if Duration < 300:
count(NumLogins) + 1
** Ok ‑ I'm laughing at myself the way I combined multiple different languages/syntaxes, but this is how I am thinking about solving this **
Thoughts on a good way to accomplish this? My preference is to use a loop so I can continue to write if/then
logic into the code.
‑‑‑‑‑
參考解法
方法 1:
Ok, this is one of those times where a CURSOR
would probably outperform a set based solution. Sadly, I'm not very good with cursors, so I can give you a set base solution for you to try:
;WITH CTE AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY UserID ORDER BY [DateTime]) RN
FROM UserLogins
), CTE2 AS
(
SELECT *, 1 RecursionLevel
FROM CTE
WHERE RN = 1
UNION ALL
SELECT B.UserID, B.Channel, B.[DateTime],
A.DurationInSeconds+B.DurationInSeconds,
B.RN, RecursionLevel+1
FROM CTE2 A
INNER JOIN CTE B
ON A.UserID = B.UserID AND A.RN = B.RN ‑ 1
)
SELECT A.UserID,
COUNT(*) TotalLogins,
SUM(CASE WHEN Channel = 'Website' THEN 1 ELSE 0 END) WebsiteLogins,
SUM(CASE WHEN Channel = 'Mobile' THEN 1 ELSE 0 END) MobileLogins,
ISNULL(MIN(RecursionLevel),0) LoginsNeedeto5Min
FROM UserLogins A
LEFT JOIN ( SELECT UserID, MIN(RecursionLevel) RecursionLevel
FROM CTE2
WHERE DurationInSeconds > 300
GROUP BY UserID) B
ON A.UserID = B.UserID
GROUP BY A.UserID
方法 2:
A slightly different piece‑wise approach. A minor difference is that the recursive portion terminates when it reaches 300 seconds for each user rather than summing all of the available logins.
An index on UserId
/StartTime
should improve performance on larger datasets.
declare @Logins as Table ( UserId Int, Channel VarChar(10), StartTime DateTime, DurationInSeconds Int )
insert into @Logins ( UserId, Channel, StartTime, DurationInSeconds ) values
( 1, 'Website', '1/1/2013 1:13PM', 170 ),
( 2, 'Mobile', '1/1/2013 2:10PM', 60 ),
( 3, 'Website', '1/1/2013 3:10PM', 180 ),
( 4, 'Website', '1/1/2013 3:20PM', 280 ),
( 5, 'Website', '1/1/2013 5:00PM', 60 ),
( 1, 'Website', '1/1/2013 5:05PM', 500 ),
( 3, 'Website', '1/1/2013 5:45PM', 120 ),
( 1, 'Mobile', '1/1/2013 6:00PM', 30 ),
( 2, 'Mobile', '1/1/2013 6:10PM', 90 ),
( 5, 'Mobile', '1/1/2013 7:30PM', 400 ),
( 3, 'Website', '1/1/2013 8:00PM', 30 ),
( 1, 'Mobile', '1/1/2013 9:30PM', 200 )
select * from @Logins
; with MostRecentLogins as (
‑‑ Logins with flags for channel and sequenced by StartTime (ascending) for each UserId .
select UserId, Channel, StartTime, DurationInSeconds,
case when Channel = 'Website' then 1 else 0 end as WebsiteLogin,
case when Channel = 'Mobile' then 1 else 0 end as MobileLogin,
Row_Number() over ( partition by UserId order by StartTime ) as Seq
from @Logins ),
CumulativeDuration as (
‑‑ Start with the first login for each UserId .
select UserId, Seq, DurationInSeconds as CumulativeDurationInSeconds
from MostRecentLogins
where Seq = 1
union all
‑‑ Accumulate additional logins for each UserId until the running total exceeds 300 or they run out of logins.
select CD.UserId, MRL.Seq, CD.CumulativeDurationInSeconds + MRL.DurationInSeconds
from CumulativeDuration as CD inner join
MostRecentLogins as MRL on MRL.UserId = CD.UserId and MRL.Seq = CD.Seq + 1 and CD.CumulativeDurationInSeconds < 300 )
‑‑ Display the summary.
select UserId, Sum( WebsiteLogin + MobileLogin ) as TotalLogins,
Sum( WebsiteLogin ) as WebsiteLogins, Sum( MobileLogin ) as MobileLogins,
( select Max( Seq ) from CumulativeDuration where UserId = LT3.UserId and CumulativeDurationInSeconds >= 300 ) as LoginsNeededTo5Min
from MostRecentLogins as LT3
group by UserId
order by UserId
Note that your sample results seem to have an error. UserId
3 reaches 300 seconds in two calls: 180 + 120. Your example shows three calls.
(by mikebmassey、Lamak、HABO)