Shopify Speed Benchmarks: Data from 1,000 Shopify Stores We Tested

Written by Gentian Shero

Co-founder & CSO at Shero Commerce

Every merchant wants a fast site. But few know what ‘fast’ really means in Google’s eyes. I wanted a clear answer for Shopify stores, not guesses or hearsay.

So I built one.

To determine what ‘fast enough’ really looks like, I analyzed 1,000 Shopify stores (excluding headless builds) and collected both lab data from PageSpeed Insights (PSI) and field data from the Chrome UX Report (CrUX).

The focus was on three Core Web Vitals: Largest Contentful Paint (LCP), Interaction to Next Paint (INP), and Cumulative Layout Shift (CLS).

Those results were then combined with Lighthouse performance data to build a clear, fair comparison across all sites.

Let’s dive in.

What you’ll learn

At a glance, this report answers the following questions:

What proportion of Shopify stores meet Google’s Core Web Vitals?
I calculated pass rates using the 2025 thresholds (LCP ≤ 2.5 s, INP ≤ 200 ms, and CLS ≤ 0.1). Spoiler alert: fewer than half of stores hit all three.
Which verticals are the fastest?
Seven categories were compared: Apparel, Beauty, Electronics, Food and Beverage, General Retail, Home, and Outdoors or Sports, to see who is leading and who is lagging.
How are the key metrics correlated?
Is it mostly about LCP? Do interaction delays or layout shifts matter? Each of these metrics has been plotted against the composite score to find the strongest levers.
Why isn’t this a race?
A lightweight DTC brand with a small catalog can easily outscore a global brand with a complex checkout, hundreds of integrations, and millions of monthly visitors. I explain why context matters and how to interpret your own results accordingly.
What should merchants do next?
Based on my analysis and years of eCommerce experience, I’ll share a prioritised checklist of improvements.

Throughout the article, you’ll see charts and call‑outs. Use them freely in your own presentations, pitch decks, or internal audits, but please cite the source when you do.

The terminology and abbreviations might seem hard to comprehend. The following table aims to help with that.

How the data was collected and calculated

Before we jump into the results, it’s important to understand how the sausage was made. My research setup had three core components:

Test automation. I created a Bash script that runs PageSpeed Insights for both mobile and desktop, as well as the CrUX API for both origin and URL data.
Core Web Vitals thresholds. Google’s guidance is clear: a good LCP is 2.5 seconds or less, a good INP is 200 ms or less, and a good CLS is 0.1 or less. These thresholds were used to flag pass/fail for each metric. To compute the composite score, I normalized each metric to its target, capped the resulting ratios at 3 (so outliers wouldn’t dominate), and applied weights: 40 % to LCP, 30 % to INP, 10 % to CLS, and 20 % to the Lighthouse performance score.
Representative sample. The list of 1,000 Shopify sites was compiled from a combination of publicly available directories, industry rankings, and our own client base. I tried to spread the sample across verticals and regions as much as possible. It’s not a perfect representation of the entire Shopify universe, but it’s big enough to reveal patterns and small enough to run in a few days.
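
For readers who want to reproduce the scoring, here is a minimal Python sketch of the pass/fail flag and the weighted composite described above. The targets, the cap of 3, and the weights come from this report. How the Lighthouse score enters the formula is not spelled out here, so the `1 - score/100` penalty below is my assumption, and the class and function names are hypothetical. Totals may therefore differ from the CSW column in the sheet.

```python
from dataclasses import dataclass

# Targets, cap, and weights as stated in the methodology.
LCP_TARGET_S = 2.5
INP_TARGET_MS = 200.0
CLS_TARGET = 0.1
RATIO_CAP = 3.0
WEIGHTS = {"lcp": 0.40, "inp": 0.30, "cls": 0.10, "lighthouse": 0.20}


@dataclass
class MobileMetrics:
    lcp_s: float       # Largest Contentful Paint, seconds
    inp_ms: float      # Interaction to Next Paint, milliseconds
    cls: float         # Cumulative Layout Shift, unitless
    lighthouse: float  # Lighthouse performance score, 0-100


def passes_cwv(m: MobileMetrics) -> bool:
    """True only if all three Core Web Vitals meet their thresholds."""
    return (
        m.lcp_s <= LCP_TARGET_S
        and m.inp_ms <= INP_TARGET_MS
        and m.cls <= CLS_TARGET
    )


def composite_score(m: MobileMetrics) -> float:
    """Weighted composite score; lower is better."""
    lcp_ratio = min(m.lcp_s / LCP_TARGET_S, RATIO_CAP)
    inp_ratio = min(m.inp_ms / INP_TARGET_MS, RATIO_CAP)
    cls_ratio = min(m.cls / CLS_TARGET, RATIO_CAP)
    # ASSUMPTION: Lighthouse is turned into a 0-1 penalty so that a
    # higher Lighthouse score lowers the composite. The report does not
    # state the exact normalization.
    lighthouse_penalty = 1.0 - m.lighthouse / 100.0
    return (
        WEIGHTS["lcp"] * lcp_ratio
        + WEIGHTS["inp"] * inp_ratio
        + WEIGHTS["cls"] * cls_ratio
        + WEIGHTS["lighthouse"] * lighthouse_penalty
    )
```

Capping each ratio before weighting is what keeps a single extreme value, such as a 20-second LCP, from dominating a store's score.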

Both the pass/fail and composite score calculations are based entirely on mobile data, since mobile is where most shoppers experience performance friction. Desktop results are included for comparison, but they don’t influence the composite score.

The report focuses strictly on user-experience metrics. In the Google Sheet linked at the end of this report, the Composite Score Weighted (CSW) and CWV pass/fail columns are calculated automatically using formulas. To validate accuracy, the pass/fail results were re-checked against the raw LCP, INP, and CLS data to ensure consistency across our summary statistics.

The big picture: fewer than half pass Core Web Vitals

When the results across all 1,000 stores were tallied, just 48 % met the Core Web Vitals thresholds on mobile. That means more than half of Shopify merchants are leaving performance gains and potentially revenue on the table.

The median mobile LCP across the dataset was 2.26 seconds, the median INP was 153 milliseconds, and the median CLS was 0.01 (CLS is unitless). While the INP and CLS medians are comfortably within Google’s good ranges, the LCP median sits much closer to its 2.5-second threshold.
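
If you make a copy of the dataset, summary numbers like these can be recomputed in a few lines. The Python sketch below shows one way to do it. The rows are made-up values for illustration only (they are not taken from the dataset), and the field and function names are hypothetical.

```python
from statistics import median

# Made-up sample rows for illustration only; NOT from the dataset.
rows = [
    {"lcp_s": 1.9, "inp_ms": 120, "cls": 0.00},
    {"lcp_s": 2.4, "inp_ms": 180, "cls": 0.02},
    {"lcp_s": 3.1, "inp_ms": 150, "cls": 0.01},
    {"lcp_s": 2.2, "inp_ms": 260, "cls": 0.05},
]


def passes(row: dict) -> bool:
    """A store passes only if LCP, INP, and CLS are all within threshold."""
    return row["lcp_s"] <= 2.5 and row["inp_ms"] <= 200 and row["cls"] <= 0.1


def summarize(rows: list) -> dict:
    """Pass rate (0-1) and per-metric medians for a list of stores."""
    return {
        "pass_rate": sum(passes(r) for r in rows) / len(rows),
        "median_lcp_s": median(r["lcp_s"] for r in rows),
        "median_inp_ms": median(r["inp_ms"] for r in rows),
        "median_cls": median(r["cls"] for r in rows),
    }


summary = summarize(rows)
```

Note that a store failing a single metric counts as a fail, which is why the pass rate can sit below 50 % even when every individual median looks healthy.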

The following chart puts that 48 % pass rate into perspective, a reminder that more than half of the Shopify stores tested still fail at least one of the vital performance metrics.

Distribution of composite scores

To get a sense of how the entire cohort performs, a composite score for each site was calculated using the weighted formula. Scores can theoretically range from 0 (perfect) to 2.5 (very slow).

The histogram below shows that most Shopify stores cluster between 0.5 and 1.2, with a long tail of slower sites reaching up to 2.2. Each bar shows the number of stores within that composite score range (1,000 total stores tested). A lower score is better.

The distribution is right‑skewed, which tells us two things:

Most stores are doing OK but not great. The peak of the curve sits around 0.7–0.8, representing stores with LCPs around 2.6–2.8 seconds, solid INP responsiveness (≈ 200 ms), and stable layouts. These sites are usable, but not yet fast.
A smaller subset of sites is very slow. Anything above 1.5 CSW often corresponds to LCPs over 4 seconds and INP spikes above 500–700 ms. These sites may be image-heavy, load too many scripts, or run unoptimised themes.

When you look at that middle hump on the chart, the question becomes: Why are so many stores stuck there? The answer is usually trade-offs. Merchants want high-resolution imagery, pop-ups, chat widgets, and animations, all of which add milliseconds of delay.

Without deliberate optimisation, every new app or script accumulates friction. That’s why we often say that performance isn’t only about what you add, but what you choose to leave out.

A closer look by vertical

Not all niches are created equal. A high-end beauty brand with thousands of product variants, a loyalty program, and multiple bespoke integrations has a very different technical footprint from a small apparel boutique or an outdoors gear retailer. To see how these realities play out, I grouped the sample into seven verticals and calculated the pass rate and medians for each.

The chart below shows the share of stores in each industry that meet Google’s Core Web Vitals thresholds on mobile. Each bar represents the percentage of stores within that vertical that passed all three metrics (LCP, INP, and CLS). Outdoors and sporting goods brands lead with over 60 % passing, while beauty and fashion lag behind due to heavier visuals and more complex themes.

Core Web Vitals pass rates by vertical

Here are a few observations:

Outdoors / Sports & Gear leads the pack with a pass rate of roughly 63 %. These sites often have streamlined catalogs (tents, bikes, hiking shoes) and fewer third‑party scripts. They prioritise quick add‑to‑cart flows over flashy marketing widgets, which keeps pages lean.
Food & Beverage and Electronics both clock in around 55 %, slightly above the overall average. These categories have moderately large catalogs but benefit from standard product templates and fewer upsell pop‑ups.
General Retail, Home and Apparel sit in the middle (around 47–50 %). Apparel brands love big hero images and lookbooks, which tend to slow LCP. Home goods stores often have interactive room planners or 3D models that add weight.
Beauty is the laggard, passing CWV only about 36 % of the time. Beauty brands invest heavily in high‑resolution imagery, video tutorials, user‑generated content widgets and loyalty modules. All of that friction shows up in the numbers.

Median metrics by vertical

Pass rates tell only part of the story. To dig deeper I looked at the median LCP, INP and CLS for each vertical.

CLS values are unitless and much smaller in magnitude than LCP or INP. In the chart below, to make them visible on the same scale, CLS values are multiplied by 1000.

This chart underscores why the pass rates shake out the way they do:

Beauty’s median LCP is nearly 2.5 s, putting nearly half of beauty sites on the wrong side of the “good” threshold for LCP alone. Combined with a median INP of around 180 ms and a slightly higher CLS, the typical beauty store struggles to get under our composite target.
Outdoors / Sports & Gear boasts the best LCP median (about 2.1 s) and the lowest CLS. These merchants often use simpler themes and rely on high‑contrast product photos rather than large lifestyle carousels, which pays dividends in speed.
General Retail and Home have similar median LCPs around 2.25–2.3 s but Home’s INP median is slightly higher. That might be due to heavier product configurators or more complex collection pages.

At this point, you might be wondering whether there’s one metric that matters more than the others. We’ll answer that right after one more cut of the data by vertical.

Average composite score by vertical

Pass rates and medians provide one lens, but the weighted composite score captures the overall performance burden on a single scale. In the bar chart below you can see that Beauty has the highest average composite score (worst performance) while Outdoors/Sports & Gear has the lowest (best performance). The tighter clustering among the middle categories reflects how small improvements in LCP or INP can make a meaningful difference.

Which metric drives the composite score?

To understand which underlying metric pulls the composite score up or down, I plotted each one against our calculated composite. The chart below shows the relationship for LCP, INP, and CLS.

Visually, you can see that the lines form a rising diagonal in the LCP and INP charts but look more scattered in the CLS chart. Statistically, the correlation coefficients are 0.77 for LCP, 0.57 for INP and 0.38 for CLS. In other words:

Largest Contentful Paint has the biggest impact on the overall performance score. Almost every slow site measured had an LCP problem. Common culprits include oversized hero images, blocking JavaScript and unoptimised CSS.
Interaction to Next Paint matters, but less so. INP issues, such as long tasks, heavy JavaScript frameworks or un-debounced event handlers, often show up on dynamic pages like cart drawers or filtering interfaces. They can make a site feel sluggish even if the main content appears quickly.
Cumulative Layout Shift has the smallest correlation. Most Shopify themes have reasonable layout stability out of the box. However, CLS spikes do appear in sites with late-loading banner ads, fonts that swap in after page load or carousels that resize unpredictably.
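
To run the same kind of comparison on your own copy of the data, a correlation coefficient can be computed with nothing but the standard library. The sketch below assumes Pearson correlation, which is my assumption since the report does not name the coefficient used. The function name is hypothetical.

```python
from math import sqrt


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

You would call it once per metric, for example `pearson(lcp_values, composite_scores)`, where both lists are columns from your copy of the sheet. Keep in mind that LCP carries 40 % of the composite by construction, so a high LCP correlation is partly built into the formula.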

Takeaway: If you have limited resources, start by fixing LCP. Get your hero image well below 2 MB (ideally under ~500 KB, as recommended later in this report), compress and resize it properly, and use a preload hint plus fetchpriority="high" to prioritize it. Then audit your third-party scripts and keep INP under 200 ms. Reserve time to eliminate layout shifts later, as they’re less likely to tank your score.

You can also visualize the relative impact of each metric on the composite score with the simple bar chart below. We took the correlation coefficients and plotted them side by side. This makes it easy to see that LCP matters most (with a correlation of about 0.77), followed by INP (≈ 0.57) and then CLS (≈ 0.38). The colours align with our brand palette: red for LCP, green for INP, and black for CLS.

Why speed isn’t a race

It’s tempting to treat benchmarks like these as a scoreboard: who’s “winning” the performance race? But that framing oversimplifies the problem and can mislead merchants. A lightweight Shopify site with fewer than 50 products, no chat widget, and a simple theme may sail through the Core Web Vitals tests, while a massive multibrand marketplace with thousands of SKUs, a personalized recommendation engine, and robust analytics will struggle to keep its LCP under 3 seconds. Does that mean the big brand is “losing”? Hardly.

Here are a few reasons why a high composite score doesn’t necessarily mean a brand is failing:

Catalog size and complexity. More products mean more images, more variant logic and more complex filtering. Each of these elements adds weight. Merchants with large catalogs should benchmark themselves against peers, not against minimalist one‑product stores.
Traffic volume and infrastructure. High‑traffic sites often invest in sophisticated analytics, personalisation and A/B testing tools, all of which increase JavaScript execution time. They may also deploy third‑party tag managers for advertising, which hurt INP. These tools are essential for scaling revenue; the key is to implement them thoughtfully and offload as much as possible to the edge or server side.
Business model and conversions. Subscription boxes, build‑your‑own bundles and internationalisation features often require custom scripts and dynamic pricing. A slower but feature‑rich site might convert better than a stripped‑down one. What matters is the trade‑off between speed and business value.
Development resources. Many small merchants rely on prebuilt themes and app store plugins. They may not have the budget to hire developers to optimise code or compress images. Tools like Shopify’s built‑in image optimiser help, but they don’t solve everything.

To visualize this point, the chart below contrasts a fictitious “Simple Store” with a “Complex Store.” The simple merchant sells 500 products and uses only a handful of apps. The complex merchant sells 20,000 products and integrates with a dozen third‑party systems. The composite score of the complex store is visibly larger than that of the simple store. Both could be successful businesses, but their performance profiles are inherently different.

That’s why our report doesn’t rank sites or shame the “slowest” merchants. Instead, we offer relative benchmarks and recommendations. Your goal is not to beat Patagonia or Gymshark; it’s to make your site as fast as it can be, given your unique constraints and customer expectations.

Key takeaways and recommendations

So what should you do with all this data? Below is a condensed checklist you can share with your developer, agency or internal team. It reflects the biggest performance levers we’ve observed across hundreds of audits.

Prioritize Largest Contentful Paint (LCP)

Make the real LCP fast.

Identify the real LCP element in field data, usually the hero image on the home page or a large product image.
Preload it and mark it as important
Preload the image with a link rel="preload" as="image" hint in the head and add fetchpriority="high" on the img tag.
Serve the right size from Shopify’s image CDN
Use Liquid filters like {{ image | image_url: width: 1440 }} and output a proper srcset and sizes. Do not ship oversized 3k images to mobile.
Keep the hero simple
Slideshows, autoplay video, heavy parallax and complex banners delay LCP. A single well-sized hero image is usually faster and converts better.
Inline only critical CSS, defer the rest
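Inline above-the-fold CSS for the template and load the rest without blocking rendering, either with a temporary media attribute that is switched once the file loads or with rel="preload" that is then swapped to rel="stylesheet".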

About CDNs on Shopify

All Online Store themes already use Shopify’s Fastly CDN for assets and images. They also rely on Cloudflare edge caching for some regions (a small distinction, but worth knowing). Therefore, you do not need to “add a CDN.”
What matters is cache-friendliness
Avoid cache-busting query strings on images, fingerprint your theme assets, keep the number of distinct asset URLs low, and avoid per-user HTML variations that break caching.

Keep Interaction to Next Paint (INP) under control

Trim main-thread work.

Break up long tasks
Anything over ~50 ms should be split. Use requestAnimationFrame and requestIdleCallback to chunk work.
Defer and move scripts out of the way
Use defer or type="module" for theme and app scripts. Keep the head clean. Load non-critical pixels after first interaction.
Use Shopify Pixels and server events where possible
Move client-side trackers to Shopify’s customer events and pixels to reduce front-end cost.
Audit apps, not just scripts
Prefer theme app extensions and app blocks over custom script tags. Remove apps that inject large bundles or duplicate features. One reviews app, one chat, one analytics where possible.
Prefer CSS over JS for UI behaviors
Sticky headers, accordions, simple animations are cheaper in CSS. When listening to scroll or resize, make listeners passive and throttled.

Minimize Cumulative Layout Shift (CLS)

Reserve space for everything that arrives late.

Always set the width and height on images and use aspect-ratio in CSS so the browser can reserve space before the image downloads.
Give dynamic UI a home
Banners for cookies, geo, discounts, and announcement bars should have reserved space from the start. Do not inject new containers above the content.
Load fonts without shifting text
Preload only the critical font files you need. Use font-display: swap. Limit weights and subsets. Consider a system font stack if the brand allows.
Product pages
Reserve space for badges, variant pickers, price blocks, and the add to cart area. Reviews should either have a fixed container or load on interaction.

Practical Shopify specifics that move the needle

Images
Use image_url with size parameters and output srcset and sizes. Serve AVIF or WebP when the platform returns them. Compress source images before upload to keep below ~500 KB for heroes and well below for others.
Theme structure
Render the hero section early in the template. Avoid placing heavy app blocks above the LCP element.
Third parties
Consolidate pixels through Shopify Pixels, remove shadow scripts, and delay chat widgets until user intent. Consider loading reviews on click or when in view.
Navigation and cart
Keep the megamenu and cart drawer light. Avoid large hydration costs. Split code if the theme uses modules.
Testing
Validate changes with field data. Check LCP and INP in CrUX and Chrome DevTools, not just Lighthouse. Profile the main thread work in the Performance panel and look for long tasks.

Before we move into what it takes to actually build a culture of performance, it’s worth hearing how Shopify frames this responsibility.

“The benchmark data from testing real Shopify stores reveals a critical insight: performance is a shared responsibility. Shopify has built mechanisms like the ‘10-point maximum impact’ rule for App Store apps and Lighthouse-based testing frameworks to maintain standards, but sustainable speed requires merchants and developers to be intentional about every asset, script, and integration they add.

The merchants who treat performance as a core feature, not an afterthought, are the ones delivering experiences that convert browsers into buyers.”

Michelle Drawert, Senior Partner Solutions Engineer at Shopify

Build a culture of performance

Beyond technical fixes, the merchants who consistently improve performance share a few mindset traits:

Performance is a feature, not an afterthought. Treat speed as part of your brand promise, just like design or product quality. Make it a KPI and review it regularly.
Benchmark yourself over time. Use tools like PageSpeed Insights and CrUX to track your metrics monthly. Look for regressions when launching new features.
Educate stakeholders. Designers, marketers and developers all influence performance. Teach them the basics of Core Web Vitals so they understand why you might say “no” to that new hero video.
Balance speed with value. Removing features for the sake of speed can hurt conversion. Always consider whether a widget or integration delivers more revenue or trust than the milliseconds it costs.

Finally, the following table summarises our recommended actions by category. Compressing images and serving them at the right size from Shopify’s built-in CDN matter most because they directly affect LCP. Deferring scripts and breaking up long tasks follow close behind. Reserving space for images and debouncing events are simpler but still worthwhile.

Category	Recommended Actions
Testing & Validation	Monitor changes in CrUX; test long tasks in DevTools; integrate PSI API with Shopify Analytics.
Third-Party Scripts & Apps	Limit 3rd-party apps; defer non-critical pixels; remove legacy injected scripts; consolidate analytics, chat & reviews tools.
Theme & Image Strategy	Focus on stability (Shopify already uses CDN); compress images <500KB; fingerprint JS/CSS; remove duplicate app assets.
Cumulative Layout Shift (CLS)	Design for aspect ratio; reserve space for banners & popups; use font-display: swap; fix layout for reviews & variants.
Interaction to Next Paint (INP)	Split or modularize scripts; replace client trackers with Shopify Pixels; audit apps; prefer CSS for UI; throttle listeners.
Largest Contentful Paint (LCP)	Preload hero image; use fetchpriority="high"; keep the hero simple; inline critical CSS; simplify template load order.
Build a Culture of Performance	Integrate PSI into QA; set internal targets; train content teams; audit apps quarterly; assign ownership for performance.

Explore the Dataset

For transparency, I am sharing the full dataset behind this report. All of the sites we tested are publicly accessible, and the performance metrics can be recreated using free tools. The sheet is set to “view only,” so no one can accidentally change it. You can open it directly here:

Shopify_Speed_Benchmarks_Final

Feel free to make a copy of the sheet to your own Drive, filter the rows, slice the pivot tables by vertical or region, and chart your own findings. It’s the definitive source for all the numbers cited in this article. If you publish your own analysis based on this data, please credit Shero Commerce and link back to this report.

Finally, a word of caution: the workbook is extensive (took me over a week to put everything together), but the report you’re reading synthesises the key trends.

Conclusion

These results shouldn’t be read as a reflection of Shopify’s infrastructure. Shopify’s core platform consistently ranks among the fastest and most reliable in the industry, with built-in global CDN delivery, optimized image handling, and native caching at scale.

The performance differences here stem from store implementation, custom themes, third-party scripts, and app bloat, rather than Shopify itself.

In fact, if the same benchmarks were run against legacy platforms such as Magento, WooCommerce, or custom builds, I would expect overall pass rates to be lower, although this report did not test them. Shopify provides one of the strongest technical foundations for achieving excellent Core Web Vitals scores when properly built and configured.

What matters most is LCP, followed by INP and then CLS. Yet, focusing solely on ranking or outscoring another merchant misses the point. Your goal should be to deliver a fast, stable, delightful experience that earns trust and drives conversion.

1. Why do some stores with similar traffic still have different performance scores?

Even when two stores have the same amount of traffic or number of apps installed, their performance can vary significantly due to differences in how themes are coded, how assets (images/videos) are managed, and exactly how third-party integrations are implemented. The benchmark data shows that how you build and maintain the site matters as much as what you build.

2. How often should I re-run a performance/speed benchmark on my Shopify store?

Benchmarks are most helpful when you track them over time. A good cadence is:

- Before a major launch or redesign
- After adding or upgrading key apps
- Quarterly for ongoing monitoring

Periodic checks ensure you catch regressions early and maintain your competitive edge.

3. Are there specific Shopify apps or integrations that consistently drag performance?

Yes. While we don't call out particular apps by name, the benchmark data shows that apps which load lots of client-side JavaScript, pull in numerous custom fonts or icon sets, or inject large inline CSS tend to have the largest negative performance impact. Always audit new integrations for their asset footprint and loading behavior. You can also check reviews of those apps on the Shopify App Store.

4. How do Core Web Vitals thresholds apply in a site that's constantly changing (seasonal updates, flash sales, etc.)?

Core Web Vitals are still valid in dynamic environments:

- Set a baseline (e.g., from your benchmark) and aim to improve or maintain it.
- Measure during peak load conditions (e.g., flash sale) and normal conditions.
- If you hit spikes (e.g., theme change, heavy app update), run the benchmark test immediately afterwards and compare to baseline.

Tracking via these tests lets you ensure high-impact changes don't push you past the thresholds.
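
Comparing a fresh run to your baseline can be automated with a small check like the Python sketch below. The 10 % tolerance, the metric names, and the function name are all hypothetical choices for illustration, so pick a tolerance that matches how noisy your own measurements are.

```python
def find_regressions(baseline: dict, current: dict, tolerance: float = 0.10) -> dict:
    """Return metrics that got worse than baseline by more than `tolerance`.

    All three Core Web Vitals are "lower is better", so a regression is a
    current value above baseline * (1 + tolerance).
    """
    regressions = {}
    for name, base_value in baseline.items():
        current_value = current.get(name)
        if current_value is None:
            continue  # metric missing from the new run; nothing to compare
        if current_value > base_value * (1 + tolerance):
            regressions[name] = (base_value, current_value)
    return regressions


# Hypothetical values: a baseline and a run taken after a theme change.
baseline = {"lcp_s": 2.2, "inp_ms": 150, "cls": 0.02}
after_theme_change = {"lcp_s": 2.9, "inp_ms": 155, "cls": 0.02}
flagged = find_regressions(baseline, after_theme_change)
```

In this made-up example only LCP is flagged, because INP moved by less than the tolerance and CLS did not change.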

5. Can I run my own PSI + CrUX performance tests from the command line?

Yes. If you're technical, you can run the exact testing workflow I used in this report with a bash script. It pulls PSI (lab data) and CrUX (field data), merges the results, and saves them into a clean CSV. However, this requires command-line experience, API keys, and basic performance knowledge. If you want the script, or want help running it, reach out to me on LinkedIn or through our site and I'll send it over.
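
To give a feel for what such a workflow involves, here is a rough Python sketch of the two API calls and the fields to read from each response. This is not the Bash script used for the report, just an illustration of the same idea. The endpoints are Google's public PSI v5 and CrUX APIs, while the helper names and the output field names are hypothetical, and you need to supply your own API key.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
CRUX_ENDPOINT = "https://chromeuxreport.googleapis.com/v1/records:queryRecord"


def psi_url(page_url: str, api_key: str, strategy: str = "mobile") -> str:
    """Build the PageSpeed Insights request URL (lab data)."""
    query = urlencode({
        "url": page_url,
        "strategy": strategy,
        "category": "performance",
        "key": api_key,
    })
    return f"{PSI_ENDPOINT}?{query}"


def crux_request(origin: str, api_key: str, form_factor: str = "PHONE") -> Request:
    """Build the CrUX POST request for origin-level field data."""
    body = json.dumps({"origin": origin, "formFactor": form_factor})
    return Request(
        f"{CRUX_ENDPOINT}?{urlencode({'key': api_key})}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request_or_url):
    """Perform the HTTP call and decode the JSON response."""
    with urlopen(request_or_url, timeout=60) as response:
        return json.load(response)


def lighthouse_score(psi_response: dict) -> float:
    """Lighthouse performance score, scaled from 0-1 to 0-100."""
    return psi_response["lighthouseResult"]["categories"]["performance"]["score"] * 100


def crux_p75(crux_response: dict) -> dict:
    """75th-percentile LCP (s), INP (ms), and CLS from a CrUX record."""
    metrics = crux_response["record"]["metrics"]

    def p75(name: str) -> float:
        # CrUX returns CLS as a string, so convert everything to float.
        return float(metrics[name]["percentiles"]["p75"])

    return {
        "lcp_s": p75("largest_contentful_paint") / 1000,
        "inp_ms": p75("interaction_to_next_paint"),
        "cls": p75("cumulative_layout_shift"),
    }
```

A typical run would call `fetch_json(psi_url(store_url, key))` and `fetch_json(crux_request(store_origin, key))`, then pass the results to the two parsing helpers and write one CSV row per store. CrUX returns an error for origins without enough real-user traffic, so a production script needs to handle missing field data.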

Gentian Shero

Co-founder & CSO at Shero Commerce

Gentian is the Chief Strategy Officer (CSO) and Co-founder of Shero Commerce. With over 15 years of experience in eCommerce strategy, technical SEO, and inbound marketing, he has helped hundreds of brands grow smarter and scale faster. At Shero, Gentian leads digital strategy and optimization for mid-market and enterprise merchants, combining hands-on expertise with a deep focus on ROI.