An investment in knowledge pays the best interest. - Benjamin Franklin

Top Ten Tags

Who's Online

In Microsoft Access, you can create running totals in reports by placing a textbox bound to the value field, and setting the 'Running Sum' property to true. You can set the running total to be a running total over the lowest level group (i.e. at the detail level) or over all the records returned for the query.

You can also do a running count by adding a textbox and setting its value to 1, and then add another textbox to do a running sum over that text box. Using a combination of the running total and running count, you can create a running average.

Ideally, a report is the to do these type of calculations. But there may be occasions when you need to have these calculations done is a query.

Microsoft has a KB article demonstrating how to accomplish running sums in query using the DSum() function. While this method works, it is extremely inefficient (i.e. slow).

Allen Browne: DLookup(), DSum(), etc are slow to execute. They involve VBA calls, Expression Service calls, and they waste resources (opening additional connections to the data file.) Particularly if JET must perform the operation on each row of a query, this really bogs things down.

A subquery will be considerably faster than a domain aggregate function.

[Note: The sample database includes a query (qryDomainFunc) that uses this method as a demonstration only.]

There is another way to accomplish a running sum using a correlated subquery. Using a subquery is more efficient (i.e. faster) then using aggregate functions such as DSum() and DCount().

Alternatively, you can use VBA to create a table that contains the details and running calculations. This is the most efficient method but requires that the code is run whenever the data table is update or records are added to it.

Your choice of using a query or VBA code will depend on the number of records you need to perform the calculations on. The more records you have, the more likely the use of VBA code will be preferable. YMMV.

Here is a table showing some data and running calculations. In this example, the running calculations are performed over Group and SubGroup.

Group SubGroup Value Running Count Running Sum Running Average
A 1 1 1 1 1
A 1 2 2 3 1.5
A 1 3 3 6 2
A 2 2 1 2 2
A 2 4 2 6 3
A 2 6 3 12 4
B 1 3 1 3 3
B 1 6 2 6 4.5
B 1 9 3 9 6
B 2 4 1 4 4
B 2 8 2 12 6
B 2 12 1 24 8

In this example we have what I'm calling Group fields and Sort fields.

  • Group Field: Fields we are summing accross, e.g. in the table above Group and Subgroup are the group fields.
  • Sort Field: Fields we are sorting by to determine the order in which we are incrementally adding to our running sum, i.e. in the table above, the value itself is the sort field (summing from the smallest to the largest value).

Here is the SQL you need to create the same calculations as in the example table

SELECT tblData.ID, tblData.Group, tblData.SubGroup, tblData.DataDate, tblData.Data, 
			(Select Sum(a.Data) 
                          From  tblData  as a
                          Where  a.Group = tblData.Group 
                            and a.SubGroup = tblData.SubGroup 
                            and a.DataDate <= tblData.DataDate) AS RunningSum, 
			(Select Count(a.Data) 
                          From  tblData  as a
                          Where  a.Group = tblData.Group 
                            and a.SubGroup = tblData.SubGroup 
                            and a.DataDate <= tblData.DataDate) AS RunningCount, 
			[RunningSum]/[RunningCount] AS RunningAverage
FROM tblData
ORDER BY tblData.ID, tblData.Group, tblData.SubGroup, tblData.DataDate;

Let's take a look at the RunningSum subquery. Once you understand this part, you should understand the RunningCount subquery as well.

Select Sum(a.Data) 
From tblData as a
Where a.Group = tblData.Group 
  and a.SubGroup = tblData.SubGroup 
  and a.DataDate <= tblData.DataDate

The first thing to understand is that a correlated subquery that is a query field (in the SELECT portion, not WHERE), is evaluated for each record returned in the query. So, let's say you have this data returned in query:

Group Subgroup DataDate Data
A 1 1/1/2018 1
A 1 1/2/2018 2
A 1 1/3/2018 3
A 1 1/5/2018 4

Now let's look at what happens for our RunningSum 'subquery field' for the third record. We can see that all the records share the same Group and Subgroup. So the following parts of the SQL statement are satisfied.

Select Sum(a.Data) 
From tblData as a
Where a.Group = tblData.Group 
  and a.SubGroup = tblData.SubGroup
  and a.DataDate <= tblData.DataDate

But, only the first three meet the DataDate criteris.

Select Sum(a.Data) 
From tblData as a
Where a.Group = tblData.Group 
  and a.SubGroup = tblData.SubGroup
  and a.DataDate <= tblData.DataDate

So the result of the subquery for the third row is 6, i.e. 1 + 2 + 3. The same operations hold true for RunningCount, resulting in a running count of 3.

There are some things to consider in our current subquery statement. Let's say we have this set of data:

Group Subgroup DataDate Data
A 1 1/1/2018 1
A 1 1/2/2018 2
A 1 1/3/2018 3
A 1 1/3/2018 30
A 1 1/5/2018 4

Now we have two records with the same Group, Subgroup, and DataDate. Both this records will be included in the subquery results both of those records, so the total would be 36, i.e. 1 + 2 + 3 + 30. This is a problem if ALL the records have the same value for the grouping AND sorting fields.

If we are only concerned with the grouping and running sum by date (DataDate) and not individual records records, then all we need to do is use a SELECT DISTINCT query.

SELECT DISTINCT tblData.Group, tblData.SubGroup, tblData.DataDate,
			(Select Sum(a.Data) 
                         From  tblData  as a
                         Where  a.Group = tblData.Group 
                           and a.SubGroup = tblData.SubGroup 
                           and a.DataDate <= tblData.DataDate) AS RunningSum,
			(Select Count(a.Data) 
                         From  tblData  as a
                         Where  a.Group = tblData.Group 
                           and a.SubGroup = tblData.SubGroup 
                           and a.DataDate <= tblData.DataDate) AS RunningCount,
			[RunningSum]/[RunningCount] AS RunningAverage
FROM tblData
ORDER BY tblData.Group, tblData.SubGroup, tblData.DataDate;

[Note: See the qryRSum_Distinct and qryRSum_NotDistinct queries in the sample file. It's worth noting that DSum() will aso have the same problems we've seen so far with the subquery method.]

But let's go ahead and assume that we do need the running sum values for all individual records.

The way to solve this is to have one field for every record that is unique. The obvious answer is an autonumber field (if the data comes from one table). If multiple tables are used in your query, you'll need to figure out which table contains the 'Data' field and use it's Autonumber field (assuming there is one). Concatenating fields could work but isn't foolproof, e.g. what if both values for the 1/3/2018 records were the same, then Group+Subgroup+DataDate+Data would be the same.

I cannot stress it enough, that you need a uniquely identifying field to include and it must be sortable to be used as a sort field. Assuming it's an Autonumber Field named ID, we'd revise the SQL statement as such:

Select Sum(a.Data) 
From tblData as a
Where a.Group = tblData.Group 
  and a.SubGroup = tblData.SubGroup
  and a.DataDate <= tblData.DataDate>
  and a.ID <= tblData.ID

However, this won't work either. While an autonumber would be unique and sortable, the problem lies in that the sorted data may result in data not being included in our RunningSum result. Let's add an ID and sort by the following fields in the following order:

  1. Group
  2. Subgroup
  3. ID
  4. DataDate
ID Group Subgroup DataDate Data
1 A 1 1/1/2018 1
2 A 1 1/3/2018 3
3 A 1 1/3/2018 3
4 A 1 1/2/2018 2
5 A 1 1/5/2018 4

The result of our modified SQL subquery for the second row would 4, i.e. 1 + 3. The 1/2/2018 data doesn't get included because the ID is greater than the ID for the second row. What we need is some number that unique and sortable (by our grouping needs). Luckily we have something that kinda works, i.e. RunningCount. It will need some major tweaking since, as it stands now, it has the same pitfalls as our current RunningSum subquery.

[Note: See the 'qryRSum_Autonum_SumOrder' query in the sample file, it shows you the order things are summed in the RunnngSum field. Take a look at results for Group = A and Subgroup = 3. See if you can spot the problem with the results for ID = 10 & 11. Try to understand why it occurs.]

At this point (actually, even before this point), things get tricky to put into words. It's much easier to look at the data and see how it works for yourself. I recommend downloading the sample file to follow along.

Firstly, we have to treat the rows with duplicate groupings (Group & Subgroup) AND sort field (DataDate). Remember, the autonumber ID isn't meaningful in itself other than to identify unique records with the same groupings and sort value (Group, Subgroup, DataDate).

So, we need to do two seperate RunningCounts (again, assuming our current row is the same 'third row' data from the first table in this article, the one with ID = 2):

  1. RunningCountLESSTHAN: We do a running count on the data with the same Group+Subgroup but DataDate<DataDate for the 'current' record , i.e. rows 1 and 2 in the table below.
    (We don't want to include the current record because then we'd have to include ID to differentiate the different records but that would be problematic for records with DataDates < the current record's DataDate of 1/3/2018, e.g. 1/2/2018 with ID = 4)
    • Select Count(a.Data) 
      From  tblData  as a
      Where  a.Group = tblData.Group 
       and a.SubGroup = tblData.SubGroup 
       and a.DataDate < tblData.DataDate
  2. RunningCountEQUAL: We do a running count on the data except for data with same Group+Sort+DateValue for the 'current' record. i.e. rows 3 and 4 in the table below.
    • Select Count(a.Data) 
      From  tblData  as a
      Where  a.Group = tblData.Group 
       and a.SubGroup = tblData.SubGroup 
       and a.DataDate = tblData.DataDate
       and a.ID <= tblData.ID
  3. Add the two of them together to get our RunningCount
ID Group Subgroup DataDate Data RunningCountLESSTHAN RunningCountEQUAL RunningCount
1 A 1 1/1/2018 1 0 1 1
4 A 1 1/2/2018 2 1 1 2
2 A 1 1/3/2018 3 2 1 3
3 A 1 1/3/2018 3 2 2 4
5 A 1 1/5/2018 4 4 1 5

The new running count is now unique and sorts the data in the order we want our running sum to be summed, so it can take the place of using ID. But the revised RunningCount must be calculated before using it in RunningTotal so we have 1 additional intermediate query to create (Here I'm calling it qryRunningCount for clarity. Used a different name in the sample file.).

Select Sum(a.Data) 
From qryRunningCount as a
Where a.Group = qryRunningCount.Group 
  and a.SubGroup = qryRunningCount.SubGroup
  and a.DataDate <= qryRunningCount.DataDate>
  and a.RunningCount <= qryRunningCount.RunningCount

[Note: See the 'qryRSum_RCountRNew' and 'qryRSum_RunningSum' queries in the sample file. Also you may notice that the RunningCount value isn't unique for all records! That's OK, what we really need is for it to be unique within each grouping of our group fields (again, this is my nomenclature, not a technical term as far as I know)]

I mentioned this earlier, but it's worth repeating. Doing these running counts and sums in a query isn't very efficient and will be slow if you have many records. Proper indexing of fields can help improve performance (see links). The same result can be done in a report (see sample file), but then you can't use it for any other calculations you may need. If it is something you need to use for calculations, I'd recommend adding the results to a table (using a make-table or append query), so that the calculation overhead only needs to happen once. Alternatively, you can write code to create the table with the running calcs, but that's another article for another day (maybe).

To do the running sum in a report, simply add the groupings and sorting you need and add a textbox in the detail section and do the following:

  1. Set the recordsource as =Sum([Data])
  2. Set the running property to Over Group

To do the running count in a report, add another textbox in the detail section as you did the running sum textbox, but set it's record source to =1. Add a textbox that uses the running sum and count textboxes to calculate the running average.

You have to figure out if you need it in a query or just want to see it in a query. Sometime people want to see it in a query, but what they really need is just a report with the results.

Alternative method - Specific conditions

I recently learned that there is a faster method than using a subquery, BUT your data MUST meet certain conditions for it to work (at least, as far as I can tell):
NO duplicate sort field values! For example, in the sample data we've been discussing, no duplicates of DataDate within our grouping of Group+Subgroup can occur. An "A, 1, 1/3/2018" combination can only occur ONCE.

For the details on that read the forum post by CJ London (link is below). And then look at 'qryRunSum_CJ_Landon' query in the sample file. Compare those results to the results of the 'qryRSum_Distinct' query. (Hint: Only the results for the "A,2" grouping will be correct in the former query)

[Note to self: Find a better way to get data from an Access table into an HTML table than typing it all out manually in Notepad ++]

Links