San Francisco, CA
April 8, 1998
Note: This report is also available in a Microsoft Word version.
Note: The 1997 version of this report contains a Technical Appendix that describes the statistical issues surrounding this study in detail.
In the last year, the San Francisco Municipal Railway (Muni) has attracted a large amount of attention from the press, public, and elected officials. Because San Francisco is an extremely high-density city, public transportation is a critical city service, used hundreds of times a year on average; its unreliability remains a severe sore point for San Franciscans. In contrast to previous years, 1997 saw a flurry of activity by the City and Muni management aimed at improving service, including significant budget increases, the introduction of new light-rail vehicles, crackdowns on traffic congestion, and the completion of one portion of the Muni Metro expansion program, the E-Embarcadero line.
To assess whether these initiatives are making a difference, RESCUE MUNI conducted the 1998 Muni Riders' Survey in February. This survey attempts to measure Muni's reliability from the rider's perspective. For the first two weeks of February, over 100 volunteers recorded how long they waited for the buses and streetcars that they used every day, and a few watched vehicles go by and recorded the headways. We then compared these data with the frequencies advertised in every Muni shelter and our results from last year.
Unfortunately for Muni and its customers, the survey shows quite clearly that Muni reliability has not improved since 1997, and Muni Metro reliability has significantly declined. While some lines (and modes) showed improvement, the experience of the Metro rider has gone from bad to worse, particularly at rush hour. This is of particular interest due to recent progress in the Metro construction and car replacement project.
This year's survey participants experienced delays 28 percent of the time, slightly more than last year's figure of 25 percent. (For 28 percent of all rides taken, the participant waited longer than Muni's total advertised frequency.) This earned the Railway a total score of C, the same as in 1997. On the Metro, however, 35 percent of riders were delayed, almost half again last year's score of 24 percent; this significant decline earned the Metro a D. At rush hour, the Metro was even worse: 57 percent of PM rush riders (of all lines) experienced a delay, for example. As we pointed out last year, this means that riders who take Muni to work every day can expect to be delayed every other day, or every day if they transfer; riders who take Metro to work can expect delays three times a week.
Particularly striking was the performance of Muni's above-ground light-rail lines. All major Metro lines (except the E and the KLM underground between West Portal and Embarcadero) were among the bottom ten, and all but one (the M) were graded F. The worst-performing Metro line, the L-"Terrible", delayed its passengers a staggering 53 percent of the time, 31 percentage points worse than in 1997, and the others were not far behind. Other modes were less uniform in their performance, with some bus lines doing particularly poorly (the 14-Mission, 41-Union, and 48-24th St/Quintara were all graded F) and many other lines, including the F-Market historic streetcar, improving or holding steady since 1997.
In Table 1, as we did last year, we have listed overall system performance and a selection of lines. Later in this paper, we will provide a complete ranking of routes, and we will also discuss Muni performance by mode (type of line), time of day, and some other criteria.
Table 1: Muni system-wide reliability, top and bottom five lines
|route||1998 % late||grade||1997 % late||change||1998 # responses|
|System-wide total||28%||C||25%||+ 3%||3004|
|Worst 5 routes:|
|Best 5 routes:|
* 1997 score and comparison may not be as valid due to small 1997 sample size (<20 data points).
Using the same data-gathering method as in 1997, RESCUE MUNI volunteers recorded how long they waited for their buses or streetcars, or in some cases, watched vehicles go by and recorded the times at which they passed. 147 volunteers recorded 3004 separate vehicles, over twice the number recorded in 1997. Participants submitted their results either on Muni Riders' Survey forms or via the World Wide Web, and we built a Microsoft Excel database from the data we received.
For each ride, we calculated waiting time and compared it to the frequency published on Muni's Street and Transit Map and in bus shelters. From this comparison, we calculated the frequency at which riders wait longer than the posted frequency, the average waiting time, and the average normalized waiting time - the waiting time divided by the advertised frequency - for each line with at least twenty data points. For data collected by volunteers watching buses and streetcars go by (1110 of 3004 data points), we used a system of weighted averages to determine the probability that a rider arriving at random during the interval monitored would be delayed and the expected wait time of such a rider.
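The calculations described above can be sketched in a few lines of Python. This is an illustration only, not the spreadsheet RESCUE MUNI actually used: the function names and sample numbers are hypothetical, and the headway formulas assume that a rider is equally likely to arrive at any moment during the monitored interval, which is one plausible reading of the "system of weighted averages" mentioned above.

```python
def rider_metrics(rides):
    """Summarize rides given as (minutes_waited, posted_frequency_minutes) pairs."""
    n = len(rides)
    # A ride counts as delayed when the wait exceeds the posted frequency.
    delayed = sum(1 for wait, freq in rides if wait > freq)
    return {
        "pct_delayed": 100.0 * delayed / n,
        "avg_wait": sum(wait for wait, _ in rides) / n,
        "avg_normalized_wait": sum(wait / freq for wait, freq in rides) / n,
    }


def headway_metrics(headways, posted_frequency):
    """Estimate a random rider's experience from observed gaps between vehicles.

    Assumption (not stated in the report): the rider arrives uniformly at
    random within the monitored interval, so each gap is weighted by its length.
    """
    total = sum(headways)
    # Minutes within each gap during which an arriving rider would end up
    # waiting longer than the posted frequency.
    late_minutes = sum(max(0.0, h - posted_frequency) for h in headways)
    # The average wait inside a gap of length h is h / 2; weight it by h / total.
    expected_wait = sum(h * h for h in headways) / (2.0 * total)
    return {
        "pct_delayed": 100.0 * late_minutes / total,
        "expected_wait": expected_wait,
    }


# Hypothetical sample data, all on a line posted at every 10 minutes.
sample_rides = [(3, 10), (12, 10), (5, 10), (20, 10)]
sample_headways = [5, 15]

ride_summary = rider_metrics(sample_rides)
headway_summary = headway_metrics(sample_headways, 10)
```

In the hypothetical headway sample, the two gaps average exactly the posted 10 minutes, yet a randomly arriving rider still has a one-in-four chance of being delayed, because long gaps catch more riders than short ones.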
We then assigned letter grades based on the percentage of riders delayed, applying the traditional grading scale to the percentage of riders not delayed. (≤10% delayed = ≥90% on time = A, etc.) Finally, we compared these data with our data for 1997 for all lines and modes with sufficient data. The result is a useful measure of the 1998 performance, and the improvement or decline in same, of thirty-six unique lines and of all modes but cable car. All of this is done to measure the typical rider's experience: after all, a rider has little interest in whether a bus left the terminal on time, but he or she cares quite a bit how long the wait will be.
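As a rough sketch of the grading step, the function below applies the usual 90/80/70/60 school scale to the percentage of riders not delayed. The function name is hypothetical, and the cutoffs below A, as well as the handling of values that fall exactly on a cutoff, are assumptions; the report spells out only the A threshold.

```python
def letter_grade(pct_delayed):
    """Map the percentage of riders delayed to a letter grade.

    Assumed cutoffs: 90/80/70/60 percent of riders NOT delayed for
    A/B/C/D, with anything lower graded F.
    """
    pct_on_time = 100.0 - pct_delayed
    for cutoff, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if pct_on_time >= cutoff:
            return grade
    return "F"


# Figures quoted in this report: system-wide, Metro, F-Market.
grades = [letter_grade(28), letter_grade(35), letter_grade(13)]
```

Because published percentages are rounded, a line reported at exactly 30 or 40 percent delayed may be graded differently from what this sketch returns.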
For a more detailed discussion of this methodology and how it compares to other measures of on-time performance, please see our Technical Appendix, published with last year's survey report and available from RESCUE MUNI.
Has Muni improved? This is an important question for San Francisco, given the attention that the Municipal Railway has received in recent months. Unfortunately, our data show quite clearly that there has been no significant improvement in Muni system-wide reliability since 1997. The Municipal Railway continues to run in an erratic fashion and, seemingly at random, keep its riders waiting far longer than they should reasonably expect to. For the commuter who has a car ready in case of a delay, this may not be so bad, but for the car-free San Franciscan, this makes the decision to do the right thing for the environment particularly frustrating. Of course, for San Franciscans who cannot afford to own a car, this represents a far more significant problem.
In 1998, as stated above, the system caused passengers to be late 28 percent of the time, slightly more than in 1997. Of the 3004 rides taken, 832 had waiting times longer than the frequency advertised on the system map. This represents a slight increase since 1997, when 347 of 1365 rides (25 percent) were delayed. A t-test found that this difference fell just short of significance at the 5 percent level (P=0.059), leading us to conclude that overall Muni reliability is not significantly different from last year.
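For readers who want to check the comparison themselves, here is a minimal sketch using only the counts quoted above. It uses a pooled two-proportion z-test rather than the t-test the report describes, so it is a stand-in and not a reproduction of the original calculation; the function name is hypothetical.

```python
import math


def two_proportion_z(late_a, n_a, late_b, n_b):
    """Pooled two-proportion z-test (an assumed substitute for the report's t-test)."""
    p_a = late_a / n_a
    p_b = late_b / n_b
    pooled = (late_a + late_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Upper-tail probability of a standard normal, via the complementary
    # error function: P(Z > z) = erfc(z / sqrt(2)) / 2.
    one_sided_p = 0.5 * math.erfc(z / math.sqrt(2))
    two_sided_p = math.erfc(abs(z) / math.sqrt(2))
    return z, one_sided_p, two_sided_p


# 1998: 832 of 3004 rides delayed; 1997: 347 of 1365 rides delayed.
z, p_one, p_two = two_proportion_z(832, 3004, 347, 1365)
```

With these counts the one-sided value from this sketch comes out a little under 0.06, which is consistent with the conclusion that the year-over-year change is not quite significant.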
Another way to analyze the system is to measure the amount of time that a rider is likely to wait. 28 percent of Muni riders wait more than the posted frequency, but many wait much more than that. To analyze this problem, we calculated the normalized waiting time for each line by dividing the amount of time waited for each vehicle by the posted frequency. We found that our participants waited 85 percent of the posted interval, slightly more than the 80 percent waited last year. (A system running perfectly would score 50 percent, because riders would be expected to arrive at random within an interval that is never exceeded.) In this case, the difference between the means is statistically significant: a t-test (P=0.04) shows that participants waited longer, relative to the schedule, in 1998 than in 1997.
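The 50 percent benchmark follows from a short calculation. As a sketch, assume vehicles arrive exactly every $f$ minutes (the posted frequency) and a rider arrives at a uniformly random moment, so that the wait $W$ is uniform on $[0, f]$. The symbols $W$ and $f$ are introduced here for illustration and do not appear in the survey.

```latex
\mathbb{E}\!\left[\frac{W}{f}\right]
  = \frac{1}{f}\int_0^f \frac{w}{f}\,dw
  = \frac{1}{f}\cdot\frac{f}{2}
  = \frac{1}{2}
```

Any normalized wait above one half therefore reflects gaps longer than the posted frequency, uneven spacing between vehicles, or both.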
Muni's reliability also varies significantly depending on the mode (type of vehicle) used and the time of day. This difference was particularly pronounced when we compared the Muni Metro with other modes. As we have mentioned, the Metro is substantially less reliable this year, earning a grade of D with 35 percent of riders delayed. In contrast, the collective scores for the bus lines hardly changed at all since 1997 (see Table 2); all were graded C last year and this year. One bright point was the F-Market historic streetcar; it improved from a dismal 39 percent of riders delayed in 1997 to 13 percent this year, earning a B. In Table 2, we have provided reliability figures for all modes and the difference between 1997 and 1998 performance.
Table 2: Riders delayed by mode
|light rail (metro)||35%||D||24%||+11%||712|
|trolley coach (electric)||27%||C||26%||+1%||958|
* The 1997 sample size for limited-stop buses was very small (4 responses), so this comparison is not significant.
**The sample size for cable cars is too small for this to be significant. Data are included to provide a complete list.
Riders of the Muni Metro waited substantially more than riders of other lines, relative to the posted schedule. The average Metro rider waited 109 percent of the full posted frequency, more than twice as long as such a rider would have waited had the system been running perfectly and one-third worse than in 1997. In contrast, the typical rider of an express bus waited 79 percent of the full posted frequency, representing an improvement over the 1997 score of 90 percent. Local motorcoach and trolley coach lines were effectively unchanged since 1997. Again, the F-Market historic streetcar showed the most improvement: riders waited 54 percent of the posted interval on average, an almost perfect score and slightly over half the amount waited in 1997.
Reliability also varied significantly by time of day. In particular, Muni was much less reliable at rush hour, when it is the most crowded; 30 percent of morning and 38 percent of evening rush-hour riders experienced delays, earning those periods a score of D. Half again as many riders were delayed in the evening rush, for example, as were delayed during midday periods (21 percent delayed, graded C) or evenings and weekends (22 percent delayed). This contrast was particularly striking with the Metro; as stated above, 36 percent of riders were delayed in the morning and 57 percent in the evening rush hours, while only 26 percent were delayed in the evenings. Expresses were much worse in the mornings (33 percent late) than in the evenings (22 percent). In Table 3, we have provided reliability figures for all time slots and the change since 1997; in the Appendix, we have a complete table showing the differences by mode and time of day.
Table 3: Riders delayed by time of day
|time of day||1998
*The sample size for owl service is too small for this to be significant. Data are included to provide a complete list.
We also analyzed the probability of waiting twice or three times the posted frequency to assess the severity of the delays that do happen. The average frequency of a bus line does not always reflect poor performance. Very often, after a substantial dry spell, Muni buses come in groups, one after the other. Even though the average frequency for the buses might be recorded as close to Muni's ideal, many riders would have waited far beyond the posted time.
We calculated the chance that a rider will wait a very long time for a train or bus. The data show the chances an average rider will wait twice and three times the posted frequency. A wait of twice the posted frequency is a very long time: the average rider should only wait half the frequency, because some riders arrive just before the bus comes while others arrive just after the last bus has left. These data indicate that a train with a posted evening frequency of 20 minutes could take 40 minutes to arrive. Not only is this unpredictability inconvenient, but it can be very dangerous as riders wait in the dark.
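The long-wait measure can be sketched as a simple share of rides. The function name and the sample rides below are hypothetical, and a wait exactly equal to the multiple is counted as a long wait, in line with the "twice the posted frequency or longer" wording used for these figures.

```python
def share_waiting_at_least(rides, multiple):
    """Fraction of rides whose wait was at least `multiple` times the posted frequency.

    `rides` holds (minutes_waited, posted_frequency_minutes) pairs.
    """
    long_waits = sum(1 for wait, freq in rides if wait >= multiple * freq)
    return long_waits / len(rides)


# Hypothetical rides on a line posted at every 10 minutes.
sample_rides = [(2, 10), (4, 10), (9, 10), (21, 10), (31, 10)]

share_2x = share_waiting_at_least(sample_rides, 2)
share_3x = share_waiting_at_least(sample_rides, 3)
```

In this made-up sample, two of five riders waited at least twice the posted frequency and one of five waited at least three times as long.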
2x Posted Frequency: These data show the likelihood a rider will wait twice Muni's posted frequency or longer. In all cases, performance has declined since 1997, not just on isolated lines but throughout the system. System-wide PM Rush hour performance was particularly bad, most likely due to full trains and buses leaving behind passengers. The N-Judah streetcar declined dramatically since last year, with an 18% chance that a rider will experience a substantial wait, up 12 percentage points from last year.
Table 4: 2x Posted Frequency
|All PM Rush||16%||12%||+4%|
3x Posted Frequency: A wait of three times the posted frequency is almost incomprehensible, and yet it happens with some regularity and more often than last year. The worst offender is the J-Church line with an 11% chance that a rider will wait longer than it would take to walk to most intermediate destinations.
Table 5: 3x Posted Frequency
|All PM Rush||8%||4%||+4%|
Our analysis found significant differences between different Muni lines, even of the same mode. While systemwide performance rated a C based on the teacher's grading scale, seven lines rated an F and five rated a D. (We excluded lines for which we received fewer than twenty responses from this analysis.) Lines graded F were the 41-Union, L-Taraval, 14-Mission, J-Church, N-Judah, K-Ingleside, and 48-24th St/Quintara, and lines graded D were the 14X-Mission Express, M-Ocean View, 15-Third, 71-Haight/Noriega, and 24-Divisadero. The percentage of riders late, and the grade derived from it, are noted in Table 6. Also of note is the average normalized wait time for these lines: with the exception of the K, riders of all lines graded F can expect to wait more than the total posted frequency on average, clearly not an acceptable figure.
In contrast, some lines did reasonably well. Only two of 36 lines, the 2-Clement and 44-O'Shaughnessy, were graded A this year; however, eight lines were graded B, including the F-Market, 7-Haight, and the underground portion of the Muni Metro. (This line, sometimes referred to as "KLM", reflects the experience of Metro passengers who can board any streetcar between West Portal and Embarcadero.) On these lines, passengers were not often delayed, and they did not typically wait very long; riders on all lines graded B or A except the 7 waited less than 60 percent of the posted frequency on average. The 2-Clement stands out: not only were only 9 percent of riders delayed, riders waited less than half (44%) the posted frequency on average.
Table 6: Ranking of lines by on-time
(Excluding lines with fewer than 20 responses)
Note: Click on the hyperlink for each line to get transitinfo.org schedules. Please note that we used the Muni map, not these schedules, to assess performance in this survey.
|route||% late||grade||1997 % late||1997 grade||change in % late||avg waiting time||avg normalized wait time||total responses|
* This line had insufficient data in 1997, so the comparison is not valid.
To demonstrate how routes differ from one another, and to show how different routes have changed since 1997, we have identified eight example lines; four did quite poorly, two were close to the Muni average, and two were good performers.
This is Muni's most heavily-used bus line, carrying nearly 40,000 passengers a day in a corridor which stretches all the way from downtown San Francisco to Daly City. Despite the deployment of new articulated trolley buses and the refurbishment of trolley wires in the last couple of years, over 50 percent of passengers waited longer than the posted service frequency and bus overcrowding is a matter of course. A 14 bus should arrive nearly every five minutes during the day, but our survey shows that the average wait, which should be near half that figure, is over six minutes. The sheer number of buses operating along the length of Mission Street is impressive, but not nearly as impressive as the number of people waiting for them.
As in 1997, this was the most widely reported line in the survey, but the level of service on the line has slipped from last year's "D" to an "F", with 42 percent of riders delayed. The average wait for a train has slipped from 6 minutes last year to over 10 minutes, and passengers can expect to wait nearly two and a half times as long as Muni's advertised service frequencies indicate. (Riders waited 123 percent of the posted frequency on average.) Carrying nearly 40,000 passengers a day, the N is Muni's busiest streetcar line and its chronic service problems -- missed train runs, overcrowding, erratic headways -- reflect the difficulties with Metro service city-wide. Muni has promised for years that progressive replacement of its ageing Boeing streetcar fleet with new Breda cars would help, but despite over 50 new cars in service the experience of riders continues to worsen.
This widely-traveled line received a grade of F, down from last year's grade of C, with 42 percent of riders experiencing a delay. Almost twice as many riders experienced a late train this year versus last year (22 percent). The average wait time for a J car was 10 minutes, above last year's time of 7 minutes. More significantly, however, riders waited on average 2.8 times the amount they should have; the average normalized wait time almost doubled, to 138 percent of the posted wait time versus 76 percent last year. This means that if you're waiting for a J car when the posted wait time is 10 minutes, you can expect to wait almost 14 minutes on average, up from last year's expected wait of about 8 minutes. A line running perfectly would find you waiting five minutes on average.
Drivers on the 24 line have been known to call out the stop at Castro and 24th as the Transfer Point for the "Phantom Forty Eight", and it lived up to that reputation this year. 40 percent of riders on this line waited more than the posted frequency; this was down slightly from last year, but the data in 1997 were not sufficient to draw a fair conclusion. The average wait time of survey participants this year is 13 minutes, up from last year's 12 minutes. One should expect to wait the full posted time, which ranges from 12 to 20 minutes between 5:20am and 12:20am; riders have reported that it is quite common, however, to wait over an hour for this bus. The maximum reported time in the survey was 35 minutes in a period when 20 minutes was advertised.
This year, 30 percent of the 24 line survey participants experienced a longer waiting time than posted, up from 23 percent last year, which lowered the 24 line's grade from last year's C to a D. Be prepared to wait half again as long as you should (at least 86 percent of the posted waiting time on average). 24 line operators earn kudos for friendliness, particularly during the Owl Service.
This line was the worst performer in 1997 and its improvement from an "F" grade is encouraging. 1998 survey respondents could expect to wait "only" 2.15 times as long as should be expected, down from 2.3 times in 1997, a 25 percent improvement. (Normalized waiting time declined from 116 percent to 108 percent of posted frequency.) Though this well-used and important cross-town bus route operates in a number of challenging traffic conditions which make schedule-keeping difficult, our survey shows that there is plenty of room for further improvement.
Mayor Brown has called this line both an example of "what San Francisco's Muni can be," and "a toy." Attention to improving the F line has paid off for the city, however, in that this year's survey participants report significant improvement in service over last year. In fact, the F line represents some of the best that Muni has to offer. 13 percent of riders reported having to wait longer than the posted interval for a car, and riders wait on a daily average 54 percent of the posted interval, just 4 percent more than the ideal. Last year, the F line received a grade of D, based on 39 percent of riders reporting a longer wait than posted, and an average normalized wait time of 102 percent. This made the F Muni's most-improved line.
The Muni employees and volunteers who operate, maintain, and clean the F line warrant commendation, not only for the historic streetcars' gleaming presentation, but for their significant improvements in service reliability as well. As Mayor Brown observed, it is still somewhat of a "toy," in the sense that it serves a limited function within the transportation picture of the city; but it is also a shining example of what the city's public transportation system can become, with good leadership and appropriate allocation of resources, both fiscal and human.
Only 9 percent of our riders reported waiting longer than the advertised bus headway on this line, a real improvement over last year's D rating (31 percent). This is one of the few lines where riders' experiences and Muni's claims seem to meet, and we hope to see more in the future. Though this diesel cross-town bus line is in some ways similar to the poorer-performing and comparably-lengthy 22 and 24 trolley bus lines, it carries far fewer passengers and operates on less-congested streets to the west of Twin Peaks. This both suggests that an emphasis on enforcing priority of Muni vehicles on jammed streets like Market and California could have a real payoff for San Francisco, and also reflects overall equipment availability and reliability problems on the city's trolley-bus lines. The 44 line passes in front of the museums of Golden Gate Park, whose management could be more aware of the quality of service Muni can provide in the right circumstances.
As one might expect, riders were generous with their comments on Muni service quality during the service period and in general. Many commented on crowding; we received 263 comments from riders that buses were "crowded", "SRO", "full", "packed", or "sardines". 40 participants reported bunching of some sort, and 21 explicitly reported that their vehicle was late. Several participants provided positive feedback as well; 40 riders reported that their ride was "good", "nice", "great", or "courteous".
Many riders reported trouble with Muni:
Others reported a good experience, particularly in the area of courtesy:
Some reported on the nasty weather San Francisco was experiencing at the time:
Unlike in 1997, very few participants reported that Muni appeared to be "on its best behavior" (a 1997 participant) during the survey period.
With the survey data, these comments tell the story of a system that frequently breaks down and causes trouble for its users. Muni Metro riders were particularly vocal, which is no surprise given the Metro's unacceptable performance this year. Most positive comments were about courtesy, not timeliness, suggesting that at least Muni's program to encourage professionalism among staff may be working.
This survey makes it clear that Muni's customers continue to experience very poor service on some routes, and that the Metro is entirely failing to live up to the expectations Muni management is currently setting. Critical structural and operational issues that we believe are having an impact include:
Metro improvement: This project, with its three interdependent components (new cars, new trackage, and the Advanced Train Control System) is clearly causing difficulties for existing users. Below are some elements of the project and the problems they may be causing.
Traffic: This is an area in which the city appears to be making a difference. While several high-ridership lines in heavy-traffic areas (14, 15) did poorly, the dogs of 1997 (22, 1) seem to have benefited from the city's increased enforcement of traffic laws downtown. The F-Market streetcar is potentially the best example of this; the transit-only lane is (mostly) enforced now, and its reliability has improved dramatically. Of course, traffic does not significantly affect the Muni Metro, so it can't be used as an excuse for this year's poor showing.
Budget: In 1997 Muni received a $17 million increase in its operating budget, primarily to hire new drivers and supervisors. Muni continues to claim that it is "underfunded" relative to the inflation rate, and while this is true to a certain degree, this survey raises important questions about the claim that a lack of money is the root of Muni's problems. In particular, although the city has invested hundreds of millions of dollars in new Muni capital projects, it is as yet unclear that users will benefit in any significant way. While there are important distinctions to be made between capital and operating budgets, and while one can certainly see the potential benefit to spending additional monies on operations, the taxpayer and the farepayer should retain a healthy skepticism.
Allocation of Resources: Muni suffers from a staff shortage, and yet operators read the newspaper in the second and third cars of underground trains to Embarcadero. Muni suffers from a car shortage, delaying thousands of people per day on all Metro lines, and yet six brand-new Breda cars sit mostly idle on the E line. Something is seriously wrong with this picture.
RESCUE MUNI calls for the following steps from Muni and the city to alleviate the problems discussed here.
Muni needs to take a hard look at the effects the Breda, ATCS and MMX projects are having on the rider community. Has the opening of the E line, despite its clear attractiveness as a showpiece, actually exacerbated car shortages? If so, E service needs to be sharply cut back, or even suspended, until Muni has enough cars. Is the switch to Breda causing Boeings to fail at a higher rate due to capacity problems in maintenance? If so, Muni needs to add enough temporary maintenance workers to fill the gap, or else slow down Breda procurement. Is ATCS making matters worse? If so, Muni should consider putting it on hold until some of the other problems have been solved. In any case, Muni must refocus its Metro efforts on delivering the advertised service today - and if that means delays in the new projects, so be it.
Transit-only lanes and parking enforcement in transit-rich zones like downtown seem to be working. The City should continue to expand this program; in particular, the experiment to close a portion of Market Street during rush hour is well worth a try, as is the program to put parking control officers on buses (e.g. the 1-California) through heavy-traffic neighborhoods. It is critical, however, that increased parking enforcement be directed against vehicles that actually block buses - not vehicles most likely to meet the Department of Parking and Traffic's ticket quota.
An increase in Muni's budget comparable to last year's probably couldn't hurt - but it may not help. The Mayor and Board of Supervisors must assign real performance standards to this year's budget to ensure that Muni is spending SF taxpayers' hard-earned dollars properly. In particular, the new superintendents of the various modes should be compensated in part based on standards met: on-time performance, missed runs, safety, and so on. This principle of measurable, enforceable standards is particularly important if Muni should attempt to raise fares, which it has suggested that it may do in the near term.
The 1998 Muni Riders' Survey, conducted by 147 volunteers in early February, demonstrates that the San Francisco Municipal Railway has failed to show significant improvement in service reliability since 1997. Riders continue to suffer delays with frustrating regularity; 28 percent of participants in the survey were so affected, earning the system a grade of C. Muni Metro (light-rail) riders had a worse experience, experiencing delays 35 percent of the time, significantly more than in 1997; the Metro was graded D. The problem was particularly acute at rush hour, with 38 percent of all riders and 57 percent of Metro riders delayed on their rides home.
So is Muni getting better? We wish we could say yes, but we can't. This has significant policy implications for San Franciscans and their government. Is the current Muni organizational structure appropriate for running a railway? In today's organization, is it even possible for a director of public transportation to demand the kind of accountability that we so clearly need? Can the city be trusted to meet its commitments on the Metro this time, having clearly failed to in 1997? Are the Mayor and the Board of Supervisors, despite their well-documented good intentions, actually getting in the way of real Muni reform? This survey does not attempt to tackle these questions, but a skeptical public has every right to demand the answers.
Survey Coordinator Andrew Sullivan (415-673-0626, email@example.com) designed and conducted the Muni Riders' Survey and wrote most of this paper. Special thanks go to:
... and the 147 volunteers who stood out in the rain for better Muni service.
Copyright © 1998 RESCUE MUNI. All rights
This document was formatted for the Web by Richard Mlynarik. This site is maintained by Andrew Sullivan.
Last updated 4/22/98.