Black Friday traffic exposes gaps in observability strategies
Peak loads can derail customer transactions. Retailers have turned to observability platforms to identify root-cause issues with Black Friday traffic.

Source: Lauren Horwitz, Dynatrace blog

What’s the problem with Black Friday traffic?

If the mantra in sales is “Always be closing,” the mantra for online retail storefronts is “Always be online.” But that’s difficult when Black Friday traffic brings overwhelming and unpredictable peak loads to retailer websites and exposes the weakest points in a company’s infrastructure, threatening application performance and user experience.

In the U.S., the five days between Thanksgiving and Cyber Monday historically bring an onslaught of online activity. Following record-breaking growth of 31.8% in 2020, ecommerce grew 14.2% in 2021. Additionally, the median online sales growth for the five years leading up to the pandemic was 14.2%.

As Alois Reitbauer, chief technology strategist at Dynatrace, noted in 2020, organizations shouldn’t be caught off guard during Black Friday and other high-volume times. “Organizations need to prepare for both expected and unexpected demand, not only for the services that their customers and users rely on today but for the services being developed for tomorrow,” he wrote in a blog on Black Friday traffic.

Why Black Friday traffic threatens customer experience

Peak loads can overload and crash retailer websites and derail customer interactions. These kinds of problems are unacceptable. Customer experience has become paramount for retailers, as visitors demand instant responses, especially during times of high volume. Digital customer experience “is everything,” as a recent PwC report indicated. Further, a recent study showed 59% of customers switched to competitors after a few bad experiences, and 17% left after a single bad site visit.

Moreover, website performance problems during peak times have a clear economic impact.
According to Forrester’s report, “The costs of planned and unplanned downtime,” downtime can cost organizations millions of dollars. In 2021, nearly 180 million Americans shopped online and in person during the Black Friday period, according to a report by the National Retail Federation and Prosper Insights & Analytics.

Identifying Black Friday traffic blind spots with modern observability

One North American electronics retailer adopted the Dynatrace observability platform after it experienced site performance issues that were stopping customer transactions in their tracks.

On Thanksgiving Day, monitoring tools captured logs of Black Friday traffic. But monitoring provided an incomplete picture: The retailer’s existing tools indicated a problem-free customer experience in its stores, while employees at those physical stores were reporting an altogether different story: 14-second transaction delays and a failure of one in five transactions.

The company did a postmortem on its monitoring strategy and realized it came up short. As one IT team member described it, “We spent several months [asking], ‘How did we miss this?’” After extensive investigation, the company identified some of the sources of its monitoring blind spots. The IT team also learned that while “each piece in the customer lookup chain is monitored, … we did not [know] what kinds of faults customers had experienced and the root cause.”

Considering Dynatrace observability for Black Friday peak loads

As the company considered bringing in new tools, it had to address a reality. It already had dozens of tools in place, but these tools overlooked key customer experience issues. So, before IT teams could consider yet another tool, some buying team members recognized they needed to persuade IT that consolidating tools could still leave monitoring gaps in the production environment. Further, some on the IT team needed to be persuaded of the Dynatrace platform’s capability.
IT skeptics wanted to verify that Dynatrace could address the observability blind spots missed during the Black Friday peak loads.

“We were coming through the POC [proof of concept], and the retail architect stood up and said, ‘I want to see it,’” another IT team member recalled. “I’m going to log into the POS [point-of-sale system] and reproduce what happened on Thanksgiving, then log into the Dynatrace console and see the data come through.” The IT team member was apprehensive, but he agreed to put the Dynatrace platform through its paces. “It was the longest 90 seconds of my life. But we were able to see the entire thing,” he said.

Another Dynatrace customer described the value of modern observability. “We’ve automated many of our ops processes to ensure proactive responses to issues like increases in demand, degradations in user experience, and unexpected changes in behavior,” one customer indicated. “Not only does this mean we don’t waste time and resources firefighting, but it also means we’re able to operate much more efficiently, leaving us more time to focus on product innovation.”

Performance problems, cyberthreats may complicate Black Friday traffic in 2022

Black Friday 2022 may present its own challenges, from IT performance problems to cyberthreats. Consider recent Dynatrace data from “The 2022 CISO Research Report: Retail.” Seventy-one percent of retail chief information security officers (CISOs) said despite having a robust, multilayered security posture, there are still gaps that enable vulnerabilities to make their way into production. Moreover, 97% of retailers said they faced risk as a result of Log4Shell, the application security vulnerability that emerged as a zero-day flaw in late 2021, and 35% cited their risk as “high” or “severe.”

Best practices for navigating Black Friday traffic and peak loads

Establish proper observability practices, especially for peak loads, in advance.
Establishing real-time monitoring, logging, and tracing enables IT pros to identify performance problems prior to events such as Black Friday.

Establish synthetic monitoring to understand the effect on users. With synthetic monitoring, IT teams can simulate user behavior on a site to detect performance issues before they affect users.

Proactively monitor sites with test loads prior to Black Friday. Testing early and in various parts of a customer journey can yield insight into application and site weaknesses that could affect Black Friday traffic and threaten transactions and customer loyalty. A modern observability platform can also learn to identify these kinds of anomalies.

Take inventory of key performance culprits. Consider issues such as application crashes and slow page loads, and include application programming interfaces (APIs) as well as third-party interfaces in the review. Any issues with APIs or third parties can cause degradations or crashes. This requires teams to monitor hosts, containers, and microservices. It also requires attention to code configuration changes, third-party services, and security updates and patches to ensure that none of these resources or changes slows performance or causes downtime.

Map out team practices to handle incidents. Too often, teams begin finger-pointing during an incident. Gathering in a war room and trying to identify which team is at fault doesn’t address customer experience. When teams plan and test for incidents, they can also document team roles and responsibilities. This allows them to resolve incidents head-on or, alternatively, prevent incidents before they affect customer experience. That moves teams from being reactive and defensive to being proactive and adding value to the organization.

Use observability to detect runtime vulnerabilities and cyberattacks. An observability platform can help IT pros identify and prioritize cyberthreats, such as application vulnerabilities and attacks, in real time.
Modern observability platforms also enable IT teams to automate vulnerability detection and remediation and detect attacks that threaten applications.

For more information, check out Dynatrace’s Digital Experience Management and Infrastructure Monitoring modules.
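
To make the synthetic monitoring practice described above more concrete, here is a minimal sketch in Python. It is not Dynatrace's implementation or API. The `probe` callable stands in for a hypothetical scripted transaction (for example, a test checkout against a staging environment), and the names `Sample`, `run_probe`, and `evaluate`, as well as the threshold values, are assumptions chosen for illustration. The idea is simply to run a simulated transaction repeatedly, time it, and flag the kind of symptoms the retailer's store employees reported: slow transactions and a high failure rate.

```python
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Sample:
    ok: bool         # did the simulated transaction succeed?
    seconds: float   # how long the transaction took


def run_probe(probe: Callable[[], bool], runs: int) -> List[Sample]:
    """Call a (hypothetical) probe `runs` times and time each call."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        try:
            ok = bool(probe())
        except Exception:
            ok = False  # an exception counts as a failed transaction
        samples.append(Sample(ok, time.perf_counter() - start))
    return samples


def evaluate(samples: List[Sample], max_seconds: float,
             max_failure_rate: float) -> dict:
    """Summarize samples and flag a breach of either threshold."""
    if not samples:
        raise ValueError("no samples collected")
    failure_rate = sum(not s.ok for s in samples) / len(samples)
    slowest = max(s.seconds for s in samples)
    return {
        "failure_rate": failure_rate,
        "slowest_seconds": slowest,
        "breach": slowest > max_seconds or failure_rate > max_failure_rate,
    }


if __name__ == "__main__":
    # Stand-in probe; replace with a real scripted transaction
    # against a test environment. Thresholds are illustrative only.
    def fake_checkout() -> bool:
        return True

    report = evaluate(run_probe(fake_checkout, runs=5),
                      max_seconds=2.0, max_failure_rate=0.05)
    print(report)
```

Under these assumed thresholds, a run containing a 14-second transaction or a one-in-five failure rate would be flagged as a breach. Injecting the probe as a callable keeps the timing and evaluation logic independent of any particular site or monitoring product.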