The data shown in the graph averages the high and low temperature predictions for each forecast day, up to 7 days ahead, in degrees Fahrenheit. Some services offer predictions 10 or more days out, but two (WTHR and NWS) offer only 7 days of predictions, so I chose that as my limiting factor. One exception is that WTHR and NWS do not offer a 7th-day low prediction (at least not when I collected the data at about 9 AM each day), so the graph below represents a full 7-day prediction for the other 3 services but only 6.5 days for WTHR and NWS. Further, while the data was collected for a total of 12 days, there are not 12 days of data for each of the 7 days of predictions: there are 12 data points for day 1, 11 data points for day 2, and so on, until there are only 6 data points for day 7. The actual high and low temperatures for each day come from the NWS official record (click "Get Climate Data" for January 2018; a PDF of the month's summary so far will download). The temperature deviations on the graph represent the average difference between the predicted high and low temperatures and the actual high and low temperatures as recorded by the NWS.
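To make the calculation concrete, here is a minimal Python sketch of how a per-day deviation like the one described above could be computed. It is not the exact spreadsheet I used. The function names and the data layout are hypothetical, and it rests on two assumptions: the deviation is treated as an absolute difference, and "day 1" is taken to be the collection day itself, which is what yields 12 data points for day 1 and 6 for day 7 over 12 days of collection.

```python
from collections import defaultdict
from datetime import date, timedelta


def mean_deviation_by_lead(forecasts, actuals):
    """Average absolute forecast error (deg F) for each lead day.

    forecasts: iterable of (issue_date, lead_day, pred_high, pred_low);
               pred_low may be None when a service omits it.
    actuals:   dict mapping date -> (actual_high, actual_low).
    """
    per_lead = defaultdict(list)
    for issue_date, lead_day, pred_high, pred_low in forecasts:
        # Assumption: lead day 1 is the collection day itself.
        target = issue_date + timedelta(days=lead_day - 1)
        if target not in actuals:
            continue  # no observation recorded yet, so nothing to score
        actual_high, actual_low = actuals[target]
        errors = [abs(pred_high - actual_high)]
        if pred_low is not None:
            # Only score the low when the service actually predicted one.
            errors.append(abs(pred_low - actual_low))
        per_lead[lead_day].append(sum(errors) / len(errors))
    return {lead: sum(v) / len(v) for lead, v in sorted(per_lead.items())}


def points_for_lead(days_collected, lead_day):
    """Number of scoreable forecasts for a lead day after N collection days."""
    return max(days_collected - lead_day + 1, 0)
```

With 12 collection days, `points_for_lead(12, 1)` gives 12 and `points_for_lead(12, 7)` gives 6, matching the counts above. A missing 7th-day low is simply left out of the average for that forecast, which is one way to handle the "6.5 days" available for WTHR and NWS.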
As can be seen, all services perform about the same from day 1 through day 5, with an average deviation of about 3-6 F for each day, although the local NBC affiliate, WTHR, performs slightly worse by the 4th and 5th days. By the 6th day, WTHR and the NWS have clearly started to diverge from the other services, and by day 7 all services are off by 8-13 F from the actual temperature, with WTHR by far the least successful and Intellicast and Weather.com the most. The 7th-day deviation is even more striking for WTHR and NWS because they alone do not provide a 7th-day low temperature prediction: the other services predict both a high and a low for that day and still perform better.
This was different from what I expected. I presumed that the local TV station and NWS predictions would be the best, since they represent local experts on the ground making a prediction, whereas I presume the other web-based services simply run a computer algorithm en masse for each zip code. However, my assumptions about how each service produces its data may not be correct. Further, there may be differences in how each day's temperatures are delimited. For example, WTHR used to cut off its daily low temperatures at midnight, but that has now changed, so that the low temperature prediction extends into the next morning. The lowest nightly temperature often does not occur until about 6-7 AM. In a technical sense, this is the "next day's" low; intuitively, however, when we look at a low prediction, we expect the lowest temperature for a given "next night," not the "previous early morning." The prediction services do not specify what their cutoff time for low predictions is, although they seem to be relatively consistent. Even so, this may affect how accurate each service appears in my data. Further, there may be micro-geographic differences in where temperatures are collected that produce significantly different results: northern Indianapolis may be cooler than southern Indianapolis. The services do not specify the locations for which they provide their temperatures. Since the services do not all provide a "past-observations" feature, I decided to use only the National Weather Service's past observations as the benchmark for all services. This clearly did not give the NWS predictions an advantage in this study, given that they performed 2nd-worst of all 5 services.
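The cutoff question above can be illustrated with a small Python sketch. It compares a "low" taken over the calendar day (midnight to midnight) with a "low" taken over the night that runs into the next morning. The function names, the 6 PM to 9 AM overnight window, and the hourly readings are all invented for illustration; none of the services publish their actual cutoff rules.

```python
from datetime import date, datetime, timedelta


def low_in_window(readings, start, end):
    """Lowest temperature among (timestamp, temp_f) readings with start <= t < end."""
    temps = [temp for t, temp in readings if start <= t < end]
    return min(temps) if temps else None


def low_midnight_cutoff(readings, day):
    """Low for the calendar day: midnight to midnight."""
    start = datetime(day.year, day.month, day.day)
    return low_in_window(readings, start, start + timedelta(days=1))


def low_overnight(readings, day, start_hour=18, end_hour=9):
    """Low for the night that begins on `day` and runs into the next morning.

    The 6 PM to 9 AM window is a hypothetical choice, not a documented rule.
    """
    midnight = datetime(day.year, day.month, day.day)
    start = midnight + timedelta(hours=start_hour)
    end = midnight + timedelta(days=1, hours=end_hour)
    return low_in_window(readings, start, end)


# Invented readings for one day and the following morning.
readings = [
    (datetime(2018, 1, 10, 3), 22),
    (datetime(2018, 1, 10, 15), 35),
    (datetime(2018, 1, 10, 23), 28),
    (datetime(2018, 1, 11, 6), 15),
]
print(low_midnight_cutoff(readings, date(2018, 1, 10)))  # 22
print(low_overnight(readings, date(2018, 1, 10)))        # 15
```

In this made-up case the two definitions differ by 7 F for the same date, which is as large as the deviations being compared between services, so an unstated cutoff could plausibly shift a service's apparent accuracy.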